Purpose: To comprehensively evaluate whether contours generated by a deep learning (DL) model can be safely used clinically in prostate radiotherapy.
Methods: A 3D U-Net-based model was trained on 84 patients, and an additional 23 patients with intact prostate were randomly selected for testing. Structures that were used in clinical treatment were further reviewed and modified by an experienced radiation oncologist (RO) to serve as ground truth (GT). Contours generated by the DL model (AI) were compared with GT contours using the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and mean surface distance (MSD). To quantify interobserver variability, the prostate was recontoured by two additional expert ROs in every case. GT and AI contours were evaluated by a fourth RO in a double-blind study. Finally, Varian RapidPlan was used to create plans using either AI or GT contours. Both plans were then evaluated on the GT contours to determine whether using AI contours would produce a comparable plan.
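As an illustration of the three geometric metrics named above, the following is a minimal sketch and not the study's implementation. It assumes each contour is available as a set of voxel indices (for DSC) and as a list of surface points in physical coordinates such as cm (for HD95 and MSD). The function names `dice`, `percentile`, and `surface_metrics` are hypothetical. HD95 is taken here as the larger of the two directed 95th-percentile distances and MSD as the symmetric mean of nearest-surface distances; other conventions exist, and the abstract does not state which was used.

```python
import math


def dice(a, b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty masks are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    s = sorted(values)
    k = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def _directed_distances(src, dst):
    """Distance from each point in src to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_metrics(surf_a, surf_b):
    """Return (HD95, MSD) for two non-empty lists of surface points.

    Brute-force O(N*M) search; fine for a sketch, slow for real contours.
    """
    d_ab = _directed_distances(surf_a, surf_b)
    d_ba = _directed_distances(surf_b, surf_a)
    # Assumed convention: max of the two directed 95th percentiles.
    hd95 = max(percentile(d_ab, 95), percentile(d_ba, 95))
    # Assumed convention: mean over both directions, pooled.
    msd = (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
    return hd95, msd


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    ai_voxels = {(0, 0, 0), (0, 0, 1)}
    gt_voxels = {(0, 0, 1), (0, 0, 2)}
    print("DSC:", dice(ai_voxels, gt_voxels))

    ai_surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt_surface = [(0.0, 0.0, 0.0)]
    print("HD95, MSD:", surface_metrics(ai_surface, gt_surface))
```

In practice the surface points would be extracted from the contour masks and scaled by the image voxel spacing before the distances are computed, so that HD95 and MSD come out in physical units.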
Results: AI demonstrated good geometric accuracy on the prostate, with mean DSC, HD95, and MSD of 0.83±0.05, 0.60±0.18 cm, and 0.21±0.07 cm, respectively. The AI contours showed no significant differences from the contours generated by the three ROs (p>0.05). Organs at risk (OARs) also showed good agreement with the GT. In the double-blind study, 95.7% of AI contours were scored as either “Great” (34.8%) or “Acceptable without changes” (60.9%). In total, 69.6% (16 of 23) of AI contours were considered equal to or better than their GT counterparts. No significant differences were found in OAR dose from the use of AI instead of GT contours, except for the bladder and seminal vesicles (p<0.05). However, all plans generated with AI contours satisfied the RTOG-0815 dose constraints even when evaluated using GT contours.
Conclusion: The investigated 3D U-Net model provided reasonable contours that can be used directly in clinical practice.
Funding Support, Disclosures, and Conflict of Interest: NIH R43-EB027523, R44-CA254844; Varian Research Grant