Purpose: Artificial intelligence (AI) is key to improving radiotherapy treatment. Training an AI dose prediction model requires a large number of planned doses for specific cancer locations. Generating these labels represents an exorbitant effort that needs to be addressed. Instead of randomly drawing data for experts to label, active learning (AL) selects data more efficiently and hence reduces the labeling workload. AL prioritizes the annotation of a subset of informative unlabeled samples. Publications in the field have shown impressive results for classification tasks such as segmentation or detection. Our contribution is an adapted approach for selecting informative data for regression tasks such as dose prediction. In the clinic, the informativeness metric can also be used for quality assurance, by identifying patients for whom the model will not perform as expected.
Methods: The OpenKBP AAPM Grand Challenge 2020 database, composed of 200 head and neck patients, was used to train a 3D U-Net model. To estimate the informativeness of unlabeled data, we used a representativeness metric and an uncertainty metric. Representativeness estimates how well the training set represents the population. Uncertainty indicates the confidence of the model in its prediction. For representativeness, we used the structural similarity index, which has three terms: luminance, contrast, and structure. For uncertainty, we used Monte Carlo sampling followed by pixel-wise variance. Accuracy was measured with a metric that combines differences in dose-volume histogram (DVH) metrics.
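As an illustration of the two metrics named in Methods, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes a single-window (global) form of the structural similarity index, in which the luminance, contrast, and structure terms are combined into one expression, and it assumes that the Monte Carlo samples come from repeated stochastic forward passes of the model. The names `global_ssim`, `mc_uncertainty`, and `stochastic_predict` are hypothetical, as are the constants `k1`, `k2`, and `n_samples`.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two arrays of the same shape.

    Assumption: statistics are computed over the whole array rather than
    over local sliding windows, and the three SSIM terms are merged using
    the common simplification C3 = C2 / 2.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mc_uncertainty(stochastic_predict, volume, n_samples=20):
    """Voxel-wise variance over repeated stochastic predictions.

    Assumption: `stochastic_predict` is a hypothetical callable that returns
    a different dose array on each call, for example a network evaluated
    with dropout left active.
    """
    samples = np.stack([stochastic_predict(volume) for _ in range(n_samples)])
    return samples.var(axis=0)
```

How the two scores are combined into a single informativeness value is not specified here, so the sketch leaves them separate.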
Results: Experiments on head and neck patients showed that our proposed workflow can achieve dose prediction with similar accuracy using only 51% of the training data. Moreover, it allowed us to identify 5% of the dataset that is detrimental to model accuracy.
Conclusion: This work demonstrated that an AL workflow is feasible and can drastically improve the efficiency of dose labeling.