A fintech company has developed a machine learning model that classifies insurance claims as fraudulent or legitimate. The cost of processing a fraudulent claim exceeds the cost of investigating a potentially fraudulent one. Given this cost asymmetry, which evaluation metric is most appropriate for assessing the model's performance to ensure the best financial outcome?
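
To make the cost asymmetry concrete, here is a minimal sketch in Python. It compares two hypothetical sets of predictions on the same claims. The dollar amounts, the labels, and all function names are assumptions made up for illustration and do not come from the scenario. The sketch treats a missed fraudulent claim (false negative) as the expensive error and every flagged claim as incurring an investigation cost.

```python
# Hypothetical costs, chosen only to illustrate the asymmetry in the question.
COST_MISSED_FRAUD = 5000.0   # assumed cost of processing a fraudulent claim
COST_INVESTIGATION = 200.0   # assumed cost of investigating a flagged claim


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), where 1 = fraudulent and 0 = legitimate."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    # Share of flagged claims that were actually fraudulent.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Share of fraudulent claims that the model actually flagged.
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def total_cost(tp, fp, fn):
    # Every flagged claim is investigated; every missed fraud is processed.
    return (tp + fp) * COST_INVESTIGATION + fn * COST_MISSED_FRAUD


# Made-up ground truth: 4 fraudulent claims, 6 legitimate ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical model A flags sparingly; hypothetical model B flags liberally.
pred_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

for name, y_pred in (("A", pred_a), ("B", pred_b)):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(
        f"model {name}: accuracy={accuracy(tp, fp, fn, tn):.2f} "
        f"precision={precision(tp, fp):.2f} recall={recall(tp, fn):.2f} "
        f"cost={total_cost(tp, fp, fn):.0f}"
    )
```

Under these assumed numbers, both sets of predictions have the same accuracy, yet the one with higher recall has the lower total cost, because each missed fraud costs far more than an extra investigation. With different cost figures the gap would change, so the sketch shows the direction of the trade-off and should not be read as a general result.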