Session: AI Applications in Image Guided Adaptive Radiation Therapy

Small Convolutional Neural Networks for Efficient 3D Medical Image Segmentation

A Celaya1, J Actor1,2, R Muthusivarajan1, E Gates1*, C Chung1, D Schellingerhout1, B Riviere2, D Fuentes1, (1) University of Texas MD Anderson Cancer Center, Houston, TX, (2) Rice University, Houston, TX


TH-E-TRACK 4-5 (Thursday, 7/29/2021) 3:30 PM - 4:30 PM [Eastern Time (GMT-4)]

Purpose: To substantially reduce the computational overhead of convolutional neural networks (CNNs) to allow broader applicability in resource-constrained environments while maintaining the performance of full-sized networks.

Methods: We propose a novel architecture modification that keeps the number of feature maps constant at each level of a CNN, whereas conventional architectures double the number of feature maps after each downsampling step. This design strategy is motivated by the similarity between CNNs and multigrid methods for solving partial differential equations. We call a network that uses this strategy a “pocket network,” or PocketNet for short. We apply the PocketNet strategy to three popular segmentation architectures (U-Net, ResNet, and DenseNet) and compare each modified network's performance to that of its unmodified counterpart on the publicly available Neurofeedback Skull-stripped (NFBS) repository. The dataset consists of isotropic 3D T1-weighted MR images and ground-truth brain masks (i.e., segmentations) from 125 subjects. We used a five-fold cross-validation scheme to train and evaluate each architecture. Predictions from the full-sized and pocket-sized networks were evaluated using the Sørensen-Dice similarity coefficient (Dice) and the Hausdorff distance.
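The effect of holding the feature-map count constant can be illustrated with a parameter count. The following is a minimal sketch, not the authors' configuration: it counts weights and biases for a hypothetical 3D encoder with two 3x3x3 convolutions per level, and the base width (32), depth (5), and single input channel are assumptions chosen only for illustration.

```python
def conv3d_params(c_in, c_out, kernel=3):
    """Weights plus biases of one 3D convolution with a cubic kernel."""
    return kernel ** 3 * c_in * c_out + c_out


def encoder_params(in_channels, base, depth, double_per_level):
    """Count parameters of a toy encoder with two convolutions per level.

    double_per_level=True  -> conventional design (width doubles each level)
    double_per_level=False -> pocket-style design (width stays at `base`)
    """
    total = 0
    c_in = in_channels
    for level in range(depth):
        c_out = base * 2 ** level if double_per_level else base
        total += conv3d_params(c_in, c_out) + conv3d_params(c_out, c_out)
        c_in = c_out
    return total


# Assumed illustrative settings, not taken from the abstract.
full = encoder_params(in_channels=1, base=32, depth=5, double_per_level=True)
pocket = encoder_params(in_channels=1, base=32, depth=5, double_per_level=False)
reduction = 1.0 - pocket / full

print(f"full: {full:,}  pocket: {pocket:,}  reduction: {reduction:.1%}")
```

Because the cost of a convolution grows with the product of its input and output widths, the deepest levels dominate the conventional design, which is why a constant width removes most of the parameters in this toy setting. The exact reduction depends on the architecture and is not meant to reproduce the figures reported in the abstract.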

Results: The pocket networks averaged a 97% reduction in the number of parameters and achieved segmentation accuracy comparable to that of the full architectures. The difference in Dice score between the pocket and full architectures for NFBS segmentations was less than 0.003 for U-Net, ResNet, and DenseNet. The mean and maximum Hausdorff distances for the pocket and full architectures differed by less than 0.004 mm and 1.070 mm, respectively. Brain segmentations were visually indistinguishable.
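For readers unfamiliar with the two metrics, the sketch below shows how the Dice coefficient and the symmetric Hausdorff distance can be computed for binary masks represented as sets of voxel coordinates. It is a brute-force illustration under stated assumptions: the function names and the toy masks are hypothetical, distances are in voxel units (equal to millimetres only for 1 mm isotropic spacing), and the abstract does not specify the implementation the authors used.

```python
import math
from itertools import product


def dice(a, b):
    """Sorensen-Dice coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def directed_hausdorff(a, b):
    """Largest distance from a voxel in `a` to its nearest voxel in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy masks: a 4x4x4 cube and the same cube shifted by one voxel along x.
reference = set(product(range(0, 4), range(4), range(4)))
prediction = set(product(range(1, 5), range(4), range(4)))

print("Dice:", dice(reference, prediction))
print("Hausdorff (voxels):", hausdorff(reference, prediction))
```

The brute-force search is quadratic in the number of voxels, so it suits only small examples; practical pipelines typically restrict the computation to surface voxels or use a spatial index.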

Conclusion: PocketNets demonstrated shorter training and inference times and lower memory requirements while retaining segmentation performance. These savings in time and memory can make powerful segmentation architectures more accessible in resource-constrained environments that do not have access to specialized computing hardware.

Funding Support, Disclosures, and Conflict of Interest: Jonas Actor and Evan Gates are both supported by a training fellowship from the Gulf Coast Consortia, on the NLM Training Program in Biomedical Informatics & Data Science (T15LM007093).



    3D, Computer Vision, Segmentation


    IM/TH- image Segmentation: General (Most aspects)
