Purpose: Predicting the increased risk of severe cardiotoxicity in breast cancer patients receiving chemoradiotherapy is challenging because of variation in the methods currently used to evaluate cardiotoxicity symptoms, in individual susceptibility, and in early markers of injury. In this study we developed, for the first time to our knowledge, a Light Gradient Boosting Machine (LightGBM)-based predictive model that integrates patients’ chart data from electronic medical records (EMRs) for early prediction of severe cardiotoxicities.
Methods: A total of 179 breast cancer patients were included. The patients were randomly partitioned into a training set and a validation set for development and validation of the LightGBM predictive model. The training features extracted for each patient included age, cancer stage, tumor size, tumor location, medical history, chemotherapy drugs, targeted therapy drugs, hormone therapy drugs, distant metastases, surgery, radiation therapy position, radiotherapy dose, electrocardiogram (ECG) signal score, and left ventricular ejection fraction (LVEF) value before chemoradiotherapy. The performance of the resulting LightGBM model in predicting severe cardiotoxicities was evaluated by receiver operating characteristic (ROC) analysis on the validation set.
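The random partition and the ROC evaluation described above can be sketched in a few lines. This is a minimal illustration rather than the study's code: the function names, the 0.8 training fraction, and the seed are hypothetical, since the abstract does not state the split ratio, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation instead of any particular library routine.

```python
import random


def split_patients(patient_ids, train_fraction, seed):
    """Randomly partition patient IDs into a training list and a validation list.

    train_fraction and seed are illustrative parameters; the abstract does not
    report the values actually used.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; tied scores count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical usage: 179 patient indices, assumed 80/20 split.
train_ids, val_ids = split_patients(range(179), train_fraction=0.8, seed=0)

# Toy labels and predicted risks, only to show how roc_auc is called.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In practice the scores passed to `roc_auc` would be the model's predicted probabilities for the validation patients only, so that the AUC reflects performance on data not used for training.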
Results: The LightGBM predictive model achieved an AUC of 0.82 on the validation set. Its sensitivity, specificity, F1 score, precision, and overall accuracy on the validation set were 78.6%, 81.8%, 81.5%, 84.6%, and 80.0%, respectively. Feature importance analysis showed that age, the LVEF value before chemoradiotherapy, cancer position, targeted therapy, tumor stage, and hormone therapy were the most valuable risk predictors.
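For readers who want to see how the five threshold-based metrics relate to one another, the sketch below computes them from a confusion matrix. The abstract does not report the underlying counts; the values used here (11 true positives, 3 false negatives, 9 true negatives, 2 false positives) are a hypothetical set chosen only because they reproduce the reported percentages after rounding, and should not be read as the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, precision, F1 and accuracy as fractions."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts (an assumption, not reported in the abstract).
metrics = classification_metrics(tp=11, fn=3, tn=9, fp=2)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

Note that the F1 score depends only on precision and sensitivity, which is why it falls between the two (81.5% between 78.6% and 84.6%).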
Conclusion: The proposed LightGBM framework provides a means of using EMR data to individualize the prediction of severe cardiotoxicities at the point of care for patients with breast cancer receiving chemoradiotherapy. It can help identify patients for whom early intervention is warranted before therapy, thus potentially improving the utility of chemoradiotherapy for breast cancer from a precision treatment perspective.
Feature Selection, Statistical Analysis, Modeling
TH- Dataset Analysis/Biomathematics: Machine learning techniques