Limitations of ROC on Imbalanced Data: Evaluation of LVAD Mortality Risk Scores

29 Oct 2020  ·  Faezeh Movahedi, Rema Padman, James F. Antaki ·

Objective: This study illustrates the ambiguity of ROC in evaluating two classifiers of 90-day LVAD mortality. This paper also introduces the precision recall curve (PRC) as a supplemental metric that is more representative of LVAD classifiers performance in predicting the minority class. Background: In the LVAD domain, the receiver operating characteristic (ROC) is a commonly applied metric of performance of classifiers. However, ROC can provide a distorted view of classifiers ability to predict short-term mortality due to the overwhelmingly greater proportion of patients who survive, i.e. imbalanced data. Methods: This study compared the ROC and PRC for the outcome of two classifiers for 90-day LVAD mortality for 800 patients (test group) recorded in INTERMACS who received a continuous-flow LVAD between 2006 and 2016 (mean age of 59 years; 146 females vs. 654 males) in which mortality rate is only %8 at 90-day (imbalanced data). The two classifiers were HeartMate Risk Score (HMRS) and a Random Forest (RF). Results: The ROC indicates fairly good performance of RF and HRMS classifiers with Area Under Curves (AUC) of 0.77 vs. 0.63, respectively. This is in contrast with their PRC with AUC of 0.43 vs. 0.16 for RF and HRMS, respectively. The PRC for HRMS showed the precision rapidly dropped to only 10% with slightly increasing sensitivity. Conclusion: The ROC can portray an overly-optimistic performance of a classifier or risk score when applied to imbalanced data. The PRC provides better insight about the performance of a classifier by focusing on the minority class.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here