Purpose: We investigated radiomics-based survival prediction for non-small-cell lung cancer (NSCLC) patients, focusing on subgroups of patients with shared characteristics.
Methods: We analyzed 304 NSCLC patients (Stages I-IV) treated with radiotherapy at our hospital. We extracted 107 radiomic features (14 shape features, 18 first-order statistical features, and 75 textural features) from the gross tumor volume (GTV) delineated on the free-breathing treatment-planning CT image. Three feature selection methods [test-retest and multiple-segmentation (FS1), Pearson's correlation analysis (FS2), and a method that combined FS1 and FS2 (FS3)] were compared to clarify how they affect survival prediction performance. The radiomic features selected by each method were combined with clinical features to form the explanatory variables in this analysis. Least absolute shrinkage and selection operator (LASSO) Cox regression was used for the subgroup analyses by histological subtype and by T stage, for which the most accurate feature selection method was adopted. Five-fold cross-validation was used to ensure the reliability of the model, and the predictive performance was evaluated with the concordance index (C-index) and the Kaplan-Meier method.
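For readers unfamiliar with the C-index used above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation: the function name `concordance_index`, the handling of tied survival times (skipped here for simplicity), and the toy data are all assumptions made for illustration. A pair of patients is counted as comparable only when the patient with the shorter follow-up time had an observed event, and the pair is concordant when that patient also received the higher predicted risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       : follow-up time for each patient
    events      : 1/True if the event was observed, 0/False if censored
    risk_scores : predicted risk; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, made up for illustration only (not from the study).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.4, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the training and test results in this abstract are reported.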
Results: For the full dataset, the C-indices were 0.64 (FS1), 0.65 (FS2), and 0.64 (FS3) for the training dataset and 0.62 (FS1), 0.63 (FS2), and 0.62 (FS3) for the test dataset. The subgroup analysis indicated that prediction models based on specific histological subtypes and T stages had higher C-indices than the prediction model based on all data, most notably the model based on T4 (training dataset, 0.72; test dataset, 0.70). Moreover, prediction models based on each T stage within adenocarcinoma (ADC) had higher C-indices than the prediction model for ADC overall.
Conclusion: Our results showed that the choice of feature selection method moderately affected survival prediction performance. In addition, prediction models based on specific histological subtypes and T stages may improve prediction performance. These results may prove useful for determining the optimal radiomics-based prediction model.
Funding Support, Disclosures, and Conflict of Interest: This study was partially supported by the Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (C) (19K08116).
TH- Dataset Analysis/Biomathematics: Machine learning techniques