Purpose: The National Cancer Institute’s (NCI) Cancer Research Data Commons (CRDC) aims to establish a cloud-based data science infrastructure. The Imaging Data Commons (IDC) is the CRDC component that enables access to and exploration of imaging data, and supports integrated analyses with non-imaging data available in other CRDC components, such as genomics and proteomics repositories. IDC builds on the strengths of established projects such as The Cancer Imaging Archive (TCIA) to collect and share FAIR (Findable, Accessible, Interoperable, Reusable) imaging data.
Methods: IDC uses a combination of commercially available Google Cloud Platform (GCP) and open-source components. While the initial focus is to support clinical radiology and radiotherapy data, IDC aims to support brightfield microscopy, multi-channel immunofluorescence, and other imaging modalities. Equally important is the ability to support image-derived data, such as annotations of regions of interest. IDC relies on the DICOM standard to harmonize both imaging and image-derived data. IDC data is public and contains no Protected Health Information (PHI). As CRDC grows, imaging datasets will be cross-linked to genomic, proteomic, and clinical data about the subjects.
Results: The IDC pilot, released in October 2020, focused on radiology data from The Cancer Genome Atlas (TCGA) project. The IDC portal is available at https://portal.imagingdatacommons.cancer.gov, and integrates a customized web viewer for visualization of images and image annotations (specifically, DICOM Segmentation and Radiotherapy Structure Set, including multiplanar reformatting).
Conclusion: The IDC pilot, which is available to the cancer research community, explores the promise of cloud-hosted public imaging collections co-located with compute resources and a growing number of tools to support data analysis. The production release of IDC is planned for Fall 2021 and will include all of the public TCIA collections, in particular those shared by radiotherapy studies and clinical trials.
Funding Support, Disclosures, and Conflict of Interest: This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN2612015000031.