A data scientist for a medical diagnostic testing company has developed a machine learning (ML) model to identify patients who have a specific disease. The dataset that the scientist used to train the model is imbalanced. The dataset contains a large number of healthy patients and only a small number of patients who have the disease. The model should consider that patients who are incorrectly identified as positive for the disease will increase costs for the company.
Which metric will MOST accurately evaluate the performance of this model?
Recall
F1 score
Accuracy
Precision
Explanations:
Recall measures the share of actual disease cases that the model finds, TP / (TP + FN), but it does not account for false positives, which are costly in this scenario.
F1 score is a balance between precision and recall, but it may not adequately reflect the cost of false positives in this case.
Accuracy is not a good metric for imbalanced datasets because the majority class dominates it: a model that labels every patient as healthy would still score a high accuracy while detecting no one who has the disease.
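Precision is the most appropriate metric in this case. It is the fraction of positive predictions that are correct, TP / (TP + FP), so it drops with every false positive, which is the error that directly increases the company's costs.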
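To make the comparison concrete, the four metrics can be computed from a single confusion matrix. The sketch below is illustrative only: the function name `classification_metrics` and the patient counts (1,000 patients, 50 of whom have the disease) are hypothetical and do not come from the question.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when the model predicts no positives
    # or the data contains no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced test set: 50 diseased and 950 healthy patients.
# The model finds 40 of the 50 diseased patients (10 missed) but also
# flags 60 healthy patients as positive.
metrics = classification_metrics(tp=40, fp=60, fn=10, tn=890)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.93 and recall is 0.80, yet precision is only 0.40: more than half of the patients flagged as positive are actually healthy. Precision is the only one of the four that isolates this costly error; F1 (about 0.53) mixes it with recall.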