Purpose: To propose a method that solves the beam orientation optimization (BOO) problem in a matter of seconds without requiring dose-influence matrices. We compare the proposed method with the state-of-the-art column generation (CG) method to demonstrate its capability in solving the BOO problem.
Methods: We propose a reinforcement learning (RL) method with an embedded tree search to solve the beam orientation optimization (BOO) problem in IMRT treatment planning for patients with prostate cancer. The method starts from a previously introduced pre-trained deep neural network (DNN) model and improves on it by adding a tree-search algorithm during training. In clinical use, once trained, the model can suggest a set of beams for treatment planning in less than one second, with only the patient's anatomical information as input. The same set of patients previously used to train and validate the DNN model is used here, and a separate, independent data set is used for testing.
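The general idea of letting a policy guide a tree search over beam sets can be illustrated with a minimal sketch. It is not the training algorithm used in this work: `toy_policy` stands in for the pre-trained DNN, `toy_fitness` stands in for a plan-quality score, and the 72 candidate beams and `top_k` branching factor are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Tuple

BEAMS_PER_PLAN = 5  # five beams per patient, as stated in the abstract

BeamSet = Tuple[int, ...]


def circular_gap(a: int, b: int, n_candidates: int) -> int:
    """Distance between two beam indices on a ring of n_candidates angles."""
    d = abs(a - b)
    return min(d, n_candidates - d)


def toy_policy(selected: BeamSet, n_candidates: int) -> List[float]:
    """Hypothetical stand-in for the DNN policy (assumption, not the real model).

    Gives higher probability to beams far from those already selected.
    """
    scores = []
    for b in range(n_candidates):
        if b in selected:
            scores.append(0.0)  # never propose a beam twice
            continue
        nearest = min(
            (circular_gap(b, s, n_candidates) for s in selected),
            default=n_candidates,
        )
        scores.append(float(nearest))
    total = sum(scores)
    return [s / total for s in scores]


def toy_fitness(selected: BeamSet, n_candidates: int) -> float:
    """Hypothetical plan-quality score (higher is better): smallest pairwise gap."""
    gaps = [
        circular_gap(a, b, n_candidates)
        for i, a in enumerate(selected)
        for b in selected[i + 1:]
    ]
    return float(min(gaps))


def tree_search(
    policy: Callable[[BeamSet, int], List[float]],
    fitness: Callable[[BeamSet, int], float],
    n_candidates: int,
    top_k: int = 3,
) -> Tuple[BeamSet, float]:
    """Depth-first search that expands only the top_k beams ranked by the policy."""
    best_set: BeamSet = ()
    best_value = float("-inf")
    stack: List[BeamSet] = [()]
    while stack:
        selected = stack.pop()
        if len(selected) == BEAMS_PER_PLAN:
            value = fitness(selected, n_candidates)
            if value > best_value:
                best_set, best_value = selected, value
            continue
        probs = policy(selected, n_candidates)
        ranked = sorted(range(n_candidates), key=lambda b: probs[b], reverse=True)
        for b in ranked[:top_k]:
            if probs[b] > 0.0:
                stack.append(selected + (b,))
    return tuple(sorted(best_set)), best_value
```

With `top_k=1` the search reduces to following the policy greedily, so a wider search can only match or improve on the policy's own choice under the same fitness function. This mirrors, in toy form, the claim that the tree-search-enhanced model does at least as well as the DNN model alone.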
Results: Because the training steps are complex and memory-intensive, the model is still in the training stage, but initial results are promising. In about 50% of the cases, RL outperformed CG; in the remaining cases, although it may not outperform CG, the model is able to find a solution as good as or better than that of the DNN model. In all cases, the beam selection process takes 1 second to propose five beams for each patient.
Conclusion: The proposed method is a fast approach to the BOO problem that uses only the patient's anatomical features and can perform better than both the DNN model and CG.
Funding Support, Disclosures, and Conflict of Interest: This work was sponsored by NIH grant No. R01CA237269 and by the Cancer Prevention and Research Institute of Texas (CPRIT) (IIRA RP150485, MIRA RP160661).
Treatment Planning, Prostate Therapy, Optimization