Purpose: To establish a reliable model that makes correct predictions with low uncertainty and wrong predictions with high uncertainty, we compare two widely used uncertainty estimation methods and propose a new loss function.
Methods: Monte Carlo Dropout (MCDO) and Evidential Deep Learning (EDL) are the most commonly used uncertainty estimation methods in medical image analysis. However, no study has compared these two basic uncertainty estimation mechanisms. In this study, we employed two metrics, accuracy vs. uncertainty (AvU) and the point-biserial correlation coefficient (PBCC), to determine which method is more appropriate for uncertainty estimation in medical image classification. AvU has been widely used for reliability evaluation. PBCC is a statistic that estimates the degree of relationship between a naturally occurring dichotomous nominal scale and an interval scale. We first introduced PBCC, which measures the correlation between certainty and correctness, to evaluate the reliability of a prediction model. Then, to improve the reliability of prediction models, we proposed a novel loss function that combines cross entropy with a new MCDO-based regularization term that enlarges the margin between the mean certainty of correct predictions and the mean certainty of wrong predictions (MCW).
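As an illustration of the two metrics, the following minimal sketch computes PBCC and AvU from per-sample scores. PBCC follows the standard point-biserial formula; AvU is taken here as the fraction of predictions that are either accurate and certain or inaccurate and uncertain. The function names, the fixed uncertainty `threshold`, and the toy inputs are assumptions made for this example and are not taken from the study.

```python
import math
from statistics import mean, pstdev


def pbcc(certainty, correct):
    """Point-biserial correlation between certainty scores and 0/1 correctness."""
    group1 = [c for c, ok in zip(certainty, correct) if ok]      # correct predictions
    group0 = [c for c, ok in zip(certainty, correct) if not ok]  # wrong predictions
    n1, n0, n = len(group1), len(group0), len(certainty)
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one correct and one wrong prediction")
    s = pstdev(certainty)  # population standard deviation of all scores
    if s == 0:
        raise ValueError("certainty scores are constant")
    return (mean(group1) - mean(group0)) / s * math.sqrt(n1 * n0 / n ** 2)


def avu(uncertainty, correct, threshold):
    """Fraction of predictions that are accurate-and-certain or inaccurate-and-uncertain.

    `threshold` (an assumption of this sketch) splits certain from uncertain.
    """
    hits = sum(
        1 for u, ok in zip(uncertainty, correct) if (u <= threshold) == bool(ok)
    )
    return hits / len(uncertainty)


if __name__ == "__main__":
    # Hypothetical toy scores, for illustration only.
    cert = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6]
    ok = [1, 1, 1, 0, 0, 1]
    print("PBCC:", pbcc(cert, ok))
    print("AvU :", avu([1 - c for c in cert], ok, threshold=0.5))
```

A PBCC near 1 means that certainty separates correct from wrong predictions well. `avu` returns a fraction; multiply by 100 if a percentage is wanted.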
Results: On the chest X-ray dataset (CXR), PBCC was 0.38 with MCDO and 0.28 with EDL. On the brain tumor MRI dataset (MRI), PBCC was 0.35 with MCDO and 0.33 with EDL. The proposed MCW loss function achieved an AUC of 0.96, an AvU of 70.5, and a PBCC of 0.42 on CXR, and an AUC of 0.86, an AvU of 69.4, and a PBCC of 0.42 on MRI.
Conclusion: We found that MCDO achieves better reliability than EDL and that PBCC is an appropriate metric for evaluating reliability. The proposed MCW loss function, which enlarges the margin of certainty between correct and wrong predictions, achieves higher reliability together with good accuracy.
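The abstract does not give the exact form of the MCW loss, so the sketch below shows only one plausible form of a margin-enlarging regularizer: cross entropy minus a weighted gap between the mean certainty of correct and of wrong predictions. The certainty definition (one minus normalized entropy of the MCDO-averaged probabilities), the weight `lam`, and all function names are assumptions of this example. A real implementation would be written in an automatic-differentiation framework.

```python
import math


def mean_probs(mc_probs):
    """Average class probabilities over T stochastic (dropout) forward passes."""
    t = len(mc_probs)
    return [sum(p[k] for p in mc_probs) / t for k in range(len(mc_probs[0]))]


def certainty(probs):
    """1 - normalized entropy: 1 for a one-hot prediction, 0 for a uniform one."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def mcw_style_loss(batch_mc_probs, labels, lam=0.1):
    """Cross entropy minus lam * (mean certainty correct - mean certainty wrong).

    batch_mc_probs[i] holds the T softmax vectors for sample i.
    This is an assumed form, not the authors' published loss.
    """
    ce, cert_correct, cert_wrong = 0.0, [], []
    for mc_probs, y in zip(batch_mc_probs, labels):
        p = mean_probs(mc_probs)
        ce += -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        pred = max(range(len(p)), key=p.__getitem__)
        (cert_correct if pred == y else cert_wrong).append(certainty(p))
    ce /= len(labels)
    if not cert_correct or not cert_wrong:
        return ce  # margin is undefined when one group is empty
    margin = sum(cert_correct) / len(cert_correct) - sum(cert_wrong) / len(cert_wrong)
    return ce - lam * margin  # a larger margin lowers the loss


if __name__ == "__main__":
    # Hypothetical batch: two samples, two dropout passes, two classes.
    batch = [[[0.9, 0.1], [0.9, 0.1]], [[0.6, 0.4], [0.6, 0.4]]]
    print(mcw_style_loss(batch, labels=[0, 1], lam=0.1))
```

With `lam=0` the function reduces to plain cross entropy, which makes the effect of the regularizer easy to isolate.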
Funding Support, Disclosures, and Conflict of Interest: This work was supported by the Science and Technology Program of Shaanxi Province, China (No. 2021KW-01).
Image Analysis, Classifier Design, Modeling
IM/TH- Image Analysis (Single Modality or Multi-Modality): Computer-aided decision support systems (detection, diagnosis, risk prediction, staging, treatment response assessment/monitoring, prognosis prediction)