Purpose: To identify the optimal number of well-curated patient datasets required to train a deep learning auto-segmentation model for upper abdominal organs.
Methods: Seventy pancreatic cancer patients with contrast-enhanced breath-hold CT images were randomly selected. Eight organs (duodenum, small bowel, large bowel, stomach, liver, spleen, left kidney, right kidney) were recontoured or edited under physician supervision to ensure accuracy and consistency. Thirty patients were reserved as an independent test set, leaving up to 40 patients for training. The nnU-Net framework was selected to minimize variability in network design. Seven nnU-Net models were trained with incremental dataset sizes (10, 15, 20, 25, 30, 35, and 40 patients) and were quantitatively evaluated on the independent test set using the Dice similarity coefficient (DSC) and mean surface distance (MSD). Student's t-tests were conducted on the quantitative results between all pairs of nnU-Net models.
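For readers unfamiliar with the two evaluation metrics, the following is a minimal sketch of how DSC and MSD can be computed from binary masks. It is not the authors' evaluation code. The function names, the 6-connected definition of a surface voxel, the symmetric averaging used for MSD, and the toy masks are all assumptions made for illustration; the abstract does not specify these details. The brute-force distance computation is only practical for small masks.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def surface_voxels(mask):
    """Indices of mask voxels with at least one face neighbor outside the mask."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        # A voxel is interior only if both neighbors along every axis are set.
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return np.argwhere(mask & ~interior[core])


def mean_surface_distance(pred, truth, spacing):
    """Symmetric mean surface distance in physical units (assumed definition).

    Both masks must be non-empty; distances are computed by brute force.
    """
    spacing = np.asarray(spacing, dtype=float)
    a = surface_voxels(pred) * spacing
    b = surface_voxels(truth) * spacing
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(a) + len(b))


# Toy example (hypothetical data): a 4x4x4 cube and the same cube shifted by one voxel.
truth = np.zeros((10, 10, 10), dtype=bool)
truth[2:6, 2:6, 2:6] = True
pred = np.zeros_like(truth)
pred[3:7, 2:6, 2:6] = True

dsc_value = dice(pred, truth)  # 48 shared voxels out of 64 + 64, so 0.75
msd_value = mean_surface_distance(pred, truth, spacing=(1.0, 1.0, 1.0))
```

In a study like this one, such per-organ values would be computed for every test patient and every trained model, and the resulting distributions compared between models.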
Results: Mean DSC remained unchanged for duodenum, small bowel, large bowel, and stomach once the number of training patients reached 25, while the standard deviation decreased. Mean DSC for liver and spleen reached 0.96 when the model was trained with 15 patients. For the nnU-Net model trained with 25 patients, the mean DSC and MSD between the automatic segmentation and the ground truth on contrast-enhanced CT images were 0.89±0.05/1.90±1.79 mm for small bowel, 0.90±0.06/1.58±1.28 mm for large bowel, 0.80±0.08/1.78±0.93 mm for duodenum, 0.92±0.03/1.17±0.73 mm for stomach, 0.96±0.01/1.08±0.50 mm for liver, 0.96±0.02/0.71±0.66 mm for spleen, 0.96±0.01/0.61±0.18 mm for right kidney, and 0.96±0.02/0.65±0.26 mm for left kidney. No significant difference was observed between nnU-Net models trained with datasets larger than 25 patients (p > 0.05 for both DSC and MSD for all organs).
Conclusion: We examined the number of patients required to train a robust auto-segmentation model for organs in the upper abdomen. Our results showed that a small amount of well-curated data was sufficient for developing an upper-abdominal auto-segmentation model.