Purpose: Modeling acute and long-term treatment outcomes requires a well-labeled, organized database that draws on multiple data sources. For a comparatively rare pediatric patient population, making use of all available data is essential. We set out to design a database and application capable of handling both the modern and the historical data generated by our department.
Methods: A DICOM RT PACS was established using the big data toolbox provided by ProKnow DS. Dose accumulation for multi-phase and adapted treatments was established using a treatment planning system (TPS)-agnostic deformable image registration workflow that transferred and accumulated all doses on a single CT simulation image. Renaming rules were applied to structures to bring historical data in line with TG-263 recommendations. The ProKnow API was linked with SQL queries against the MOSAIQ database to update and organize patients by protocol and trial enrollment. The API was then used to link DICOM RT and dosimetry data with a custom in-house database developed in PostgreSQL. Import scripts were developed to pull in clinical data from active and legacy institutional research databases.
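The structure-renaming step can be illustrated with a minimal Python sketch. It assumes a small, hypothetical rule table: the target names follow TG-263 conventions, but the patterns, the `normalize` helper, and the `rename_structure` function are illustrative and are not the rule set used in this work.

```python
import re

# Hypothetical rename rules: a pattern over a normalized legacy label and the
# TG-263 style target name. The patterns are illustrative assumptions only.
RENAME_RULES = [
    (re.compile(r"(spinal)?cord"), "SpinalCord"),
    (re.compile(r"brainstem"), "Brainstem"),
    (re.compile(r"(l|lt|left)parotid|parotid(l|lt|left)"), "Parotid_L"),
    (re.compile(r"(r|rt|right)parotid|parotid(r|rt|right)"), "Parotid_R"),
    (re.compile(r"(optic)?chiasm"), "OpticChiasm"),
]


def normalize(name):
    """Lowercase the label and drop spaces, underscores, hyphens, and periods."""
    return re.sub(r"[\s_\-.]+", "", name.strip().lower())


def rename_structure(name):
    """Return the TG-263 style name for a legacy label, or None if no rule matches."""
    key = normalize(name)
    for pattern, target in RENAME_RULES:
        if pattern.fullmatch(key):
            return target
    return None  # unmatched labels are left for manual review


if __name__ == "__main__":
    for legacy in ["Lt Parotid", "SPINAL_CORD", "parotid_r", "PTV_5400"]:
        print(legacy, "->", rename_structure(legacy))
```

Returning `None` for unmatched labels, rather than guessing, keeps ambiguous historical names visible for manual curation.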
Results: Dosimetry data for a total of 918 patients and 958 courses of treatment were aggregated from three different TPSs. Clinical research outcomes were loaded from three unique databases. Data were organized at an institutional level and into 10 unique subgroups corresponding to institutional protocols and clinical trial enrollment. A responsive web application was built that includes dashboards for interacting with and easily exporting clinical data collected through trial and protocol databases.
Conclusion: Well-formatted data and insightful visualizations are essential in the move toward big-data-style analytics. Our early success has seeded several studies investigating clinical outcomes, dosimetric predictors, and physics-based planning.