
Session: Radiomics

A Framework for Reproducible and Scalable Radiomics Pipelines for Radiation Oncology Using Self-Contained Containers and Workflow Manager

W Choi1*, H Nourzadeh1, S Lee2, Y Chen1, N Ghassemi1, Y Vinogradskiy1, A Dicker1, (1) Thomas Jefferson University Hospital, Philadelphia, PA, (2) Marshall University, Huntington, WV


SU-H330-IePD-F9-3 (Sunday, 7/10/2022) 3:30 PM - 4:00 PM [Eastern Time (GMT-4)]

Exhibit Hall | Forum 9

Purpose: Radiomics is a fast-growing research area in medical physics, but it still lacks standardized, reproducible pipelines. We designed a comprehensive, open-source, platform-independent framework to improve the reproducibility, speed, and clinical integration of radiomics research.

Methods: The proposed framework comprises processes for data import, transformation, segmentation, feature extraction, and other tasks. We implemented the processes using self-contained Docker containers. The pipeline processes are managed by Nextflow, a scalable workflow manager compatible with various computing platforms. With the workflow manager and containers, the radiomics pipelines can be launched on-premises or in a public or private cloud. We built a Docker container for our in-house radiomics tools and used other tools available on Docker Hub, such as Chest Imaging Platform, XNAT, and PyRadiomics. The framework was evaluated in terms of run time, CPU usage, and memory usage on public datasets (TCGA-LUAD and BRCA) and an internal dataset (Lung Cancer Clinical Trial). To evaluate scalability, speedups were calculated by dividing the sequential processing time by the parallel processing time. To streamline communication between Radiation Oncology clinical software and the radiomics pipeline, the pipeline was integrated with MIM software using DICOM communication.
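The speedup metric described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `speedup` and all timing values are hypothetical, and taking the sequential time as the sum of per-task durations is an assumption made only for this example.

```python
def speedup(sequential_seconds: float, parallel_seconds: float) -> float:
    """Return the speedup: sequential processing time / parallel processing time."""
    if sequential_seconds < 0:
        raise ValueError("sequential_seconds must be non-negative")
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    return sequential_seconds / parallel_seconds


# Hypothetical per-task durations in seconds (not data from the study).
task_durations = [900.0, 1200.0, 1500.0]

# Assumption: running the tasks one after another takes the sum of their durations.
sequential_time = sum(task_durations)

# Hypothetical wall-clock time of the same tasks run in parallel.
parallel_time = 400.0

example = speedup(sequential_time, parallel_time)
print(f"speedup = {example:.1f}")
```

A speedup close to the number of tasks that can run at once suggests the pipeline scales well; a value near 1 suggests the work is effectively serial.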

Results: We evaluated three pipelines, one each for the LUAD (N=53), BRCA (N=78), and LungTrial (N=48) datasets. The pipelines are built from DICOM-Retrieval, DICOM-to-NRRD, Segmentation, and Feature-Extraction processes, each used as needed. The speedups were 4.5, 24.3, and 18.6 for LUAD, BRCA, and LungTrial, respectively. Segmentation in the BRCA pipeline consumed the most CPU (1200%). The Feature-Extraction process in the LungTrial pipeline used the most memory (1600 MB) because of the large target volumes (average 116.2 cc).
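CPU usage above 100% is commonly reported when a process uses more than one core, with 100% corresponding to one fully used core. The abstract does not define its CPU measure, so the sketch below assumes that convention; the function name `core_equivalents` is hypothetical.

```python
def core_equivalents(cpu_percent: float) -> float:
    """Convert a CPU usage percentage to fully used core equivalents.

    Assumption: 100% corresponds to one fully used core.
    """
    if cpu_percent < 0:
        raise ValueError("cpu_percent must be non-negative")
    return cpu_percent / 100.0


# CPU usage reported for Segmentation in the BRCA pipeline.
brca_segmentation_cores = core_equivalents(1200.0)
print(brca_segmentation_cores)
```

Under this assumption, 1200% corresponds to roughly 12 cores in full use.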

Conclusion: The proposed framework provides a versatile and scalable platform for developing and analyzing radiomics across institutions. By integrating radiomics with multimodal clinical data, including histopathology images and genomic data, the framework may also inform the interpretation of clinical significance.


Feature Extraction, Segmentation, Software


IM/TH- Informatics: Informatics in Imaging (general)
