
Session: Imaging: CT Radiomics and Clinical Applications

Development of Machine Learning Based Algorithm for Prediction of Invasiveness of Early-Lung Adenocarcinoma by Using Chest Computed Tomography

Juyoung Lee1,2*, Seong Yong Park, M.D., Ph.D.3, Jin Sung Kim, Ph.D.1, (1) Department of Radiation Oncology, Yonsei University College of Medicine, Seodaemun-gu, Seoul, KR, (2) Department of Integrative Medicine, Yonsei University College of Medicine, Seoul, KR, (3) Department of Thoracic and Cardiovascular Surgery, Yonsei University College of Medicine, Seoul, KR


SU-IePD-TRACK 2-6 (Sunday, 7/25/2021) 12:30 PM - 1:00 PM [Eastern Time (GMT-4)]

Purpose: Locating and classifying lung adenocarcinoma lesions, also called ground-glass nodules (GGNs), on computed tomography (CT) images is clinically helpful for determining the treatment plan and the extent of resection before surgery in patients with lung cancer, which has a high mortality rate. Herein we report a machine learning-based automatic detection and classification model for early lung adenocarcinoma using chest CT images.

Methods: Chest CT scans from 190 patients (one scan per patient) who underwent pulmonary resection were used to train and validate the model. A two-step combination model was used to detect GGNs and classify them into two broad groups: "non-invasive" (including atypical adenomatous hyperplasia, adenocarcinoma in situ, and minimally invasive adenocarcinoma) or "invasive" (invasive adenocarcinoma). First, a Fully Convolutional DenseNet was trained to segment GGNs from the CT images. Then the true-positive detections were assigned to one of the two groups using radiomics, a method that characterizes tumors comprehensively by quantifying large numbers of image features.
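The radiomics step described in the Methods can be illustrated with a minimal sketch that quantifies a segmented region as a handful of first-order intensity features. The abstract does not state which features or which classifier were used, so the function name `first_order_features`, the chosen features, and the `voxel_volume_mm3` parameter are all assumptions made for illustration only.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def first_order_features(
    intensities: Sequence[float],
    mask: Sequence[int],
    voxel_volume_mm3: float = 1.0,  # assumed voxel size; not given in the abstract
) -> Dict[str, float]:
    """Summarize the voxels inside a binary mask with simple first-order features.

    `intensities` and `mask` are flattened arrays of equal length; a non-zero
    mask value marks a voxel that belongs to the segmented nodule.
    """
    if len(intensities) != len(mask):
        raise ValueError("intensities and mask must have the same length")

    # Keep only the voxels that the segmentation step labeled as nodule.
    roi = [v for v, m in zip(intensities, mask) if m]
    if not roi:
        raise ValueError("mask selects no voxels")

    return {
        "volume_mm3": len(roi) * voxel_volume_mm3,
        "mean": mean(roi),
        "std": pstdev(roi),  # population standard deviation of ROI intensities
        "min": min(roi),
        "max": max(roi),
    }


# Hypothetical toy input: four voxels, three of them inside the mask.
features = first_order_features([-700, -600, -500, 100], [1, 1, 1, 0])
```

In a full pipeline, a feature vector like this (typically with many more shape and texture features) would be passed to a binary classifier that outputs "non-invasive" or "invasive".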

Results: The mean age of the patients was 62 ± 9.6 years, and 122 (64.2%) were female. The segmentation model achieved a Dice score of 0.65 and a true positive rate of 0.88 (36 out of 41), with an average of 1.5 false positives per case. For the 36 GGNs detected by the segmentation model, the classification model achieved an area under the curve of 0.90.
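For readers less familiar with the reported metrics, the following sketch shows how a Dice score, a true positive rate, a per-case false-positive average, and an area under the curve are commonly computed. It is not the authors' evaluation code; the function names are hypothetical, and the AUC is computed here as the rank-based probability that a positive case scores higher than a negative one, which is an assumption about the kind of curve meant.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def true_positive_rate(detected: int, total_lesions: int) -> float:
    """Fraction of ground-truth lesions that the model detected."""
    return detected / total_lesions


def false_positives_per_case(fp_counts: Sequence[int]) -> float:
    """Average number of false-positive detections over all cases."""
    return sum(fp_counts) / len(fp_counts)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win; labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# The detection rate reported in the Results: 36 of 41 GGNs detected.
tpr = true_positive_rate(36, 41)
```

Rounded to two decimals, `tpr` matches the 0.88 reported above.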

Conclusion: Our preliminary machine learning-based two-step GGN segmentation and classification model still requires management of false-positive detections but showed excellent classification performance on true-positive cases. The model could therefore be of great help in clinical use if the false positives are reduced.



    Lung, Quantitative Imaging, Classifier Design


    IM- Dataset Analysis/Biomathematics: Machine learning
