Purpose: Although commercial treatment planning systems (TPSs) can automatically solve the optimization problem for treatment planning, human planners must still define and adjust the planning objectives and constraints to obtain clinically acceptable plans. This process is labor-intensive and time-consuming. In this work, we present an end-to-end study that trains a deep reinforcement learning (DRL) based virtual treatment planner (VTP) to operate a commercial TPS as a human planner would and produce high-quality plans.
Methods: We used prostate cancer intensity-modulated radiation therapy (IMRT) treatment planning as a testbed. The VTP takes the dose-volume histogram (DVH) of a plan as input and predicts the optimal adjustment action to improve plan quality. Training of the VTP followed the state-of-the-art Q-learning framework. Experience replay was implemented with epsilon-greedy search to explore the effects of different actions on a large number of automatically generated plans, from which an optimal policy can be learned. Because most of the computational effort in training was spent on repeated plan optimization, we implemented a GPU-based Eclipse-equivalent TPS using PyCUDA to improve optimization efficiency. After training, the VTP was deployed to plan a set of test patient cases. Like a human planner, the VTP keeps adjusting the planning objectives and constraints in the TPS to improve plan quality until the plan is acceptable or the maximum number of adjustment steps is reached. The generated plans were evaluated using the ProKnow scoring system.
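For readers unfamiliar with the training loop named above, the following is a minimal Python sketch of its three generic ingredients: an experience replay buffer, epsilon-greedy action selection, and the one-step Q-learning target. It is an illustration under stated assumptions, not the implementation used in this work. The names ReplayBuffer, epsilon_greedy, and q_target are hypothetical, the discount factor gamma = 0.9 is an arbitrary placeholder, and the Q-values are plain Python lists standing in for the outputs of a network that would take a DVH as input.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples.

    Hypothetical helper: in the setting above, a state would be a plan's DVH
    and an action would be one adjustment of the planning objectives.
    """

    def __init__(self, capacity):
        # The oldest transitions are dropped once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement, capped at the buffer size.
        items = list(self.buffer)
        return rng.sample(items, min(batch_size, len(items)))


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_target(reward, next_q_values, done, gamma=0.9):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a').

    gamma=0.9 is an assumed placeholder, not a value from the source.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop, each adjustment step would push one transition into the buffer, and sampled mini-batches would be used to regress the network's Q-values toward q_target.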
Results: The VTP was successfully trained, validated, and tested with 10, 2, and 50 cases, respectively. With PyCUDA, training efficiency improved threefold. The VTP raised the mean plan score (± std.) of the 50 test cases from 6.18 ± 1.75 to 8.11 ± 1.27, out of a maximum score of 9.
Conclusion: The proposed DRL-based VTP was able to operate the Eclipse-equivalent TPS to generate high-quality plans for prostate cancer IMRT.