Session: AI/ML Autoplanning, Autosegmentation, and Image Processing II

Is Public Data Enough? A Comparison of Public and Institutional Deep Learning Models for Segmentation of 17 Organs-At-Risk in the Head and Neck

Brett Clark (1,2), Nicholas Hardcastle (2,3,4), Price Jackson (2,4), Leigh Johnston (1), James C Korte (1,2); (1) Department of Biomedical Engineering, University of Melbourne, Melbourne, Australia (2) Department of Physical Science, Peter MacCallum Cancer Centre, Melbourne, Australia (3) Centre for Medical Radiation Physics, University of Wollongong, Wollongong, Australia (4) Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne


SU-F-BRB-6 (Sunday, 7/10/2022) 2:00 PM - 3:00 PM [Eastern Time (GMT-4)]

Ballroom B

Purpose: To evaluate whether public data alone are sufficient for training deep learning-based models to segment organs-at-risk (OARs) on institutional CT images of patients with head and neck (HN) cancer. We compare the performance of HN auto-segmentation models trained solely on publicly available data with that of models trained solely on institutional data.

Methods: A public dataset (1144 patients) was created by pooling four individual datasets from the Cancer Imaging Archive, with segmentations for 17 HN OARs: brain, brainstem, mandible, oral cavity, spinal cord, and the left and right brachial plexus, cochlea, lens, optic nerve, parotid gland and submandibular gland. Training (456 patients) and test (114 patients) institutional datasets were retrospectively selected from clinical practice. High-resolution two-stage convolutional neural networks (CNNs) based upon the 3D U-Net architecture were trained on either the public or the institutional training dataset and evaluated on the institutional test dataset. Model performance was evaluated using the Dice similarity coefficient (DSC), 95th percentile Hausdorff distance (95HD) and average symmetric surface distance (ASSD). Statistical differences between models were assessed using a Wilcoxon signed-rank test.
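For readers unfamiliar with the three metrics, the following is a minimal NumPy sketch of how DSC, 95HD and ASSD can be computed from a pair of binary masks. It is an illustration only, not the evaluation code used in this work: the function names, the 6-neighbour surface definition, the brute-force distance search and the choice of 95HD as the maximum of the two directed 95th percentiles are all assumptions (other 95HD conventions exist), and both masks are assumed to be non-empty for the surface metrics.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / denom


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch background.

    A voxel is treated as a surface voxel if any of its face neighbours
    (6-neighbourhood in 3D) is background. This definition is an assumption.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if every face neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def directed_distances(p, q):
    """Distance from each point in p to its nearest point in q (brute force)."""
    diff = p[:, None, :] - q[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def surface_metrics(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (95HD, ASSD) in the units of `spacing`; masks must be non-empty."""
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    d_ab = directed_distances(pa, pb)
    d_ba = directed_distances(pb, pa)
    # 95HD taken as the larger of the two directed 95th percentiles (assumed convention).
    hd95 = max(np.percentile(d_ab, 95), np.percentile(d_ba, 95))
    # ASSD: mean of all surface-to-surface distances in both directions.
    assd = (d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba))
    return float(hd95), float(assd)
```

The brute-force distance search is quadratic in the number of surface voxels, so for full-resolution CT volumes a distance transform or a spatial index would be the more practical choice.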

Results: For 15 of 17 OARs, the model trained on institutional data performed significantly better on at least one metric than the model trained on public data. The left lens and left submandibular gland showed no significant difference on any metric. Five OARs showed no significant difference when evaluated using the distance metrics (95HD, ASSD).
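The paired comparison behind these significance statements can be sketched as follows, using `scipy.stats.wilcoxon` on per-patient scores from the two models for a single OAR and metric. This is a hedged illustration: the helper name `compare_models`, the 0.05 significance level and the synthetic scores are assumptions, and the abstract does not state the significance level or whether a multiple-comparison correction was applied.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_models(scores_a, scores_b, alpha=0.05):
    """Two-sided Wilcoxon signed-rank test on paired per-patient scores.

    `alpha` is an assumed significance level, not one taken from the abstract.
    Returns the p-value and whether it falls below `alpha`.
    """
    _, p_value = wilcoxon(scores_a, scores_b)
    return float(p_value), bool(p_value < alpha)


# Synthetic, made-up DSC values for 20 hypothetical test patients (not study data).
dsc_public = np.linspace(0.60, 0.80, 20)
dsc_institutional = dsc_public + np.linspace(0.01, 0.05, 20)

p_value, significant = compare_models(dsc_public, dsc_institutional)
print(f"p = {p_value:.2g}, significant at 0.05: {significant}")
```

The test is paired because both models are evaluated on the same institutional test patients, so each patient contributes one score difference per OAR and metric.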

Conclusion: Public data may be sufficient to train auto-segmentation models for a small subset of HN OARs, but the majority of OARs benefit from institutional training data. Further work is required to assess whether these differences have a dosimetric impact. Transfer learning with a smaller institutional dataset may reduce the burden of manual segmentation and data curation.

Funding Support, Disclosures, and Conflict of Interest: This research was supported by an Australian Government Research Training Program (RTP) Scholarship.


Segmentation, CT, Radiation Therapy


IM/TH- Image Segmentation Techniques: Machine Learning
