Purpose: Treatment outcome prediction is a first step toward plan adaptation and personalized treatment. We aimed to explore and apply interpretable machine learning (ML) methods, in the hope of adding prognostic value, to predict overall survival of patients with oropharyngeal cancer (OPC).
Methods: A total of 200 patients with OPC from The Cancer Imaging Archive (TCIA) database were pooled and analyzed. Seventy cases were used for training, 30 for evaluation, and the remaining cases as an independent validation cohort. A survival prediction model was built with a support vector machine (SVM) using imaging features of the gross tumor volume (GTV) extracted from the planning CT with PyRadiomics. Patient characteristics, including ECOG performance status (ECOG_Performance) and smoking/drinking status, among others, were also considered as potential predictors. The Shapley value (SV) of each feature was calculated according to coalitional game theory to determine its importance in the prediction model. A positive SV indicates that the feature pushes the prediction for that instance toward the positive class, while a negative SV indicates a negative contribution to the prediction. The performance of the final prediction model was evaluated using the area under the ROC curve (AUC).
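The Shapley value calculation described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the scoring function `toy_score`, its coefficients, the three feature values, and the baseline are all hypothetical stand-ins for the trained SVM and the real cohort. It also assumes that features outside a coalition are replaced by baseline values, which is one common way to define the coalition payoff. The enumeration is exact and therefore only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for one instance `x`.

    Features outside a coalition are set to their `baseline` value
    (an assumption of this sketch, not a detail taken from the abstract).
    """
    n = len(x)

    def value(coalition):
        # Payoff of a coalition: model output when only its features take x's values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when it joins coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    """Hypothetical risk score standing in for the SVM decision function."""
    ecog, pack_years, diameter = z
    return 0.5 * ecog + 0.02 * pack_years + 0.01 * diameter + 0.005 * ecog * pack_years


# Hypothetical patient and reference values: ECOG, smoking pack-years, 2D diameter (mm).
x = [2.0, 40.0, 35.0]
baseline = [0.0, 10.0, 30.0]

phi = shapley_values(toy_score, x, baseline)
for name, contribution in zip(["ECOG", "pack_years", "diameter"], phi):
    print(f"{name}: {contribution:+.3f}")
```

By construction the contributions sum to the difference between the score for the instance and the score for the baseline, so the sign of each value shows whether that feature raises or lowers the prediction. With hundreds of radiomic features, exact enumeration is infeasible and a sampling-based approximation would be needed.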
Results: The prediction model achieved an AUC of 0.83 when both feature categories were combined. In addition, the relationships between both feature categories and the survival model were captured by the SVs from the interpretable ML model and visualized in interpretable graphs. The five features most strongly associated with the overall prediction were ECOG_Performance, smoking_PY, original_shape_Maximum2DDiameterRow, original_gldm_DependenceNonUniformityNormalized, and wavelet-HLH_glszm_SizeZoneNonUniformityNormalized.
Conclusion: We demonstrated the predictive value of combining patient characteristics and imaging features for OPC patients. The features most relevant to patient survival and their contributions to the predictive model were determined via interpretable machine learning, which holds promise for additional prognostic value compared with black-box radiomics analysis. Interpretable ML could greatly improve the adoption of ML models in medical and clinical applications.
Funding Support, Disclosures, and Conflict of Interest: Supported by the National Natural Science Foundation of China (No. 62001380) and the General Special Scientific Research Program of Shaanxi Provincial Education Department (No. 20JK0910).
CT, Quantitative Imaging, Feature Selection
TH- Response Assessment: Radiomics/texture/feature-based response assessment