Purpose: Treatment outcome prediction is a first step toward plan adaptation and personalized treatment. We aimed to predict overall survival of patients with oropharyngeal cancer (OPC) and to assess the contributions of the identified predictors to overall survival using interpretable machine learning (ML) methods.
Methods: A total of 519 patients with OPC from The Cancer Imaging Archive (TCIA) database were analyzed; 432 cases were used for training and 87 cases were used as an independent cohort. An interpretable survival prediction model was built using Extreme Gradient Boosting (XGBoost) and the SHapley Additive exPlanations (SHAP) algorithm, based on imaging features of the gross tumor volume (GTV) extracted from planning CT using PyRadiomics. Patient characteristics, including ECOG_Performance and smoking or drinking status, were also considered as potential predictors. A hybrid dimensionality reduction algorithm, consisting of the Least Absolute Shrinkage and Selection Operator (Lasso) and Sequential Floating Backward Selection (SFBS), was used to remove redundant or irrelevant features. Finally, a unified additive feature importance measure (SHAP value) was computed for each feature to quantify its contribution to the model output. The top-ranking features and their contributions to the model output were studied using linear and non-linear functions. Model performance was evaluated using the area under the ROC curve (AUC).
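To make the "unified additive feature importance" idea concrete, the following is a minimal sketch of exact Shapley value computation on a toy model. It is not the study's XGBoost/SHAP pipeline: the function `shapley_values`, the scoring function `toy_survival_score`, its coefficients, and the sample and baseline values are all hypothetical and chosen only for illustration. The sketch shows the defining property that the per-feature contributions sum to the difference between the model output for a patient and the output for a baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_survival_score(features):
    """Hypothetical score (higher = better survival); NOT the study's model."""
    chemo, age, pack_years = features
    return 1.0 * chemo - 0.03 * age - 0.02 * pack_years - 0.0005 * age * pack_years


# Hypothetical patient and reference values: [chemotherapy, age, pack-years]
patient = [1, 70, 40]
baseline = [0, 60, 10]

phi = shapley_values(toy_survival_score, patient, baseline)
for name, value in zip(["Chemotherapy", "Ageatdiagnosis", "Smoking_PY"], phi):
    print(f"{name}: {value:+.3f}")
```

In this toy setting the chemotherapy contribution is positive and the age and pack-year contributions are negative, which mirrors the direction of the effects reported in the abstract but says nothing about their actual magnitude. In practice, tree-specific approximations are used instead of this brute-force enumeration.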
Results: Combining patient characteristics and radiomic features, the prediction model achieved AUCs of 0.945 and 0.846 for the training and independent cohorts, respectively. The five features most strongly associated with overall survival were Chemotherapy, Ageatdiagnosis, Drinking, Smoking_PY, and lbp-3D-k_gldm_LargeDependenceHighGrayLevelEmphasis. Patients who received chemotherapy had higher SHAP values (SV), indicating improved survival time, whereas older Ageatdiagnosis, heavy Drinking, and higher Smoking_PY negatively impacted survival time.
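For readers less familiar with the AUC metric reported above, the following sketch computes it from its rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical, and the sketch assumes the survival outcome has been binarized, which the abstract does not specify.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` are 0/1 outcomes; `scores` are model outputs where a higher
    score means the positive class is more likely. Ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted scores for six cases
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients.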
Conclusion: We demonstrated the predictive value of combining patient characteristics and imaging features for overall survival of OPC patients. The identified predictors and their impact on survival, as revealed by the explainable ML model, could facilitate clinical decision-making toward personalized treatment.
Funding Support, Disclosures, and Conflict of Interest: Supported by the National Natural Science Foundation of China (No. 62001380) and the General Special Scientific Research Program of Shaanxi Provincial Education Department (No. 20JK0910).
CT, Quantitative Imaging, Feature Selection
TH- Response Assessment: Radiomics/texture/feature-based response assessment