
Session: Machine Intelligence Efficacy and Quality I

Inter-Observer Variation of Target and Organ Contouring Before and After the Adoption of a Deep-Learning Auto-Contouring Model for Localized Prostate Cancer

Y Wang*, S C Kamran, J A Efstathiou, Department of Radiation Oncology, Massachusetts General Hospital, Harvard Medical School, Boston, MA


MO-A-BRC-3 (Monday, 7/11/2022) 7:30 AM - 8:30 AM [Eastern Time (GMT-4)]

Ballroom C

Purpose: To investigate whether the deployment of a deep-learning auto-contouring model reduced the inter-observer variation of target and organ contouring for prostate cancer patients with a radiopaque rectal hydrogel spacer.

Methods: The model, trained and validated on 145 patients, auto-contours target volumes, organs and the spacer. It was tested on an independent cohort of retrospective patients. After clinical deployment, each auto-contour (AC) was reviewed and edited by a physician before being approved as the planning contour (PC). Based on the retrospective analysis, patients with <20 cm³ discrepancy between the prostate PC and AC were selected for this investigation. The retrospective cohort had 54 patients from two physicians [n(A)=40, n(B)=14], whereas the prospective cohort had 100 [n(A)=60, n(B)=40]. Inter-observer variation was evaluated, before and after model adoption, using the difference in prostate volume discrepancy, Dice similarity coefficient (DSC) and mean distance to agreement (MDA). A two-tailed t-test was performed to evaluate statistical significance. Only the structures with mean DSC>0.8 in the retrospective analysis (prostate, each femur, bladder, rectum and spacer) were examined.
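The two geometric metrics named above can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas or software used, so this is an assumption: DSC is computed on sets of voxel indices, and MDA is taken as the symmetric mean of nearest-neighbour distances between two sets of surface points. The function names `dice` and `mean_distance_to_agreement` and the toy contours are hypothetical.

```python
import math


def dice(voxels_a, voxels_b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical (assumption)
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_distance_to_agreement(surface_a, surface_b):
    """Symmetric mean nearest-neighbour distance between two surface point sets.

    Assumed definition: for every point on each surface, take the distance to
    the closest point on the other surface, then average over all points.
    Units follow the input coordinates (e.g. mm).
    """
    def directed(src, dst):
        return [min(math.dist(p, q) for q in dst) for p in src]

    distances = directed(surface_a, surface_b) + directed(surface_b, surface_a)
    return sum(distances) / len(distances)


# Toy example: a 4x4 "planning contour" and an "auto-contour" shifted by one voxel.
pc = {(x, y) for x in range(4) for y in range(4)}
ac = {(x + 1, y) for x in range(4) for y in range(4)}
toy_dsc = dice(pc, ac)

# Toy surfaces: two parallel point pairs one unit apart.
toy_mda = mean_distance_to_agreement([(0, 0), (1, 0)], [(0, 1), (1, 1)])
```

In this toy case the two squares share 12 of 16 voxels each, and every surface point is exactly one unit from the other surface. A clinical implementation would work on 3D masks with physical voxel spacing, which this sketch leaves out.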

Results: After model adoption, prostate volume discrepancy remained essentially unchanged for physician A (-2.0 vs. -2.1 cm³) but decreased from 11.2 to 5.3 cm³ for physician B; DSC increased and MDA decreased for all structures. In both cohorts, statistically significant inter-observer variation was found only for the prostate and not for the other structures. Before model adoption, physician A’s advantages in DSC and MDA were 0.072 and -0.85 mm, respectively. After model adoption, these advantages narrowed to 0.022 and -0.32 mm. Model deployment reduced the inter-observer variation of prostate contouring by 5.8 cm³ in volume discrepancy, 0.049 in DSC and 0.53 mm in MDA.
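One plausible reading of how the summary reductions follow from the per-physician values is sketched below. The abstract does not state the calculation, so this is an assumption: the inter-observer gap is taken as the difference between physicians B and A, and the reduction as the gap before adoption minus the gap after. The pairing of -2.0 and -2.1 cm³ with "before" and "after" is also assumed, and all variable names are hypothetical.

```python
# Prostate volume discrepancy (cm^3) per physician, as reported in the Results.
# Assumed ordering for physician A: -2.0 before, -2.1 after.
volume_before = {"A": -2.0, "B": 11.2}
volume_after = {"A": -2.1, "B": 5.3}

# Inter-observer gap = physician B minus physician A (assumed definition).
gap_before = volume_before["B"] - volume_before["A"]  # about 13.2 cm^3
gap_after = volume_after["B"] - volume_after["A"]     # about 7.4 cm^3
volume_reduction = gap_before - gap_after             # about 5.8 cm^3

# Physician A's reported advantages in DSC and MDA (mm), before and after.
dsc_reduction = 0.072 - 0.022            # about 0.050 from the rounded values
mda_reduction = abs(-0.85) - abs(-0.32)  # about 0.53 mm

print(f"volume: {volume_reduction:.1f} cm^3")
print(f"DSC:    {dsc_reduction:.3f}")
print(f"MDA:    {mda_reduction:.2f} mm")
```

Under these assumptions the volume and MDA figures match the reported 5.8 cm³ and 0.53 mm. The DSC figure from the rounded inputs is 0.050 rather than the reported 0.049, which is consistent with the authors having computed it from unrounded values.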

Conclusion: The geometric differences between PC and AC were smaller in the prospective cohort, indicating less manual editing than expected. Clinical deployment of the deep-learning model reduced inter-observer variation in prostate contouring.


Image Processing, Observer Performance, Prostate Therapy


IM/TH- Image Segmentation Techniques: Machine Learning
