Session: Science Council Session: Advancing Science to Expand Access to State-of-the-Art Applications in Medical Physics

Augmented Colorectal Cancer Detection Using Self-Attention-Incorporated Deep Learning

X Jia1, S Sang1, Y Zhou2, H Ren1, M Laurie1, M Islam1, O Eminaga1, J Liao1, L Xing1*, (1) Stanford University School of Medicine, Stanford, CA, (2) University of California Santa Cruz, Santa Cruz, CA


TU-GH-BRB-3 (Tuesday, 7/12/2022) 1:45 PM - 3:45 PM [Eastern Time (GMT-4)]

Ballroom B

Purpose: To develop a deep learning detector that incorporates self-attention for accurate detection of colorectal cancer (CRC) in colonoscopy images, and to characterize its performance on archived patient data.

Methods: “TransRCNN”, a deep learning-based CRC detector, was developed by incorporating the Transformer’s self-attention mechanism to support medical decision-making during CRC screening. TransRCNN uses a feature pyramid network (FPN) as the backbone to extract deeper and richer representations of the input. We built four self-attention pathways by attaching transformer encoder blocks to the pyramid outputs P2-P5, obtaining multi-scale attention maps. The attention-augmented feature maps were concatenated and fed into a region proposal network (RPN) to produce the CRC detection predictions. With self-attention embedded, the model computes attention scores over multi-scale features and all pixels of the input, thereby emphasizing informative features and improving learning. The training set consisted of 300 colonoscopy frames obtained from 13 colonoscopy video sequences acquired from 13 patients. The test set consisted of 612 colonoscopy frames obtained from 31 sequences from 23 patients. Both datasets were annotated by expert video-endoscopists, and a rectangular bounding box was generated to mark the actual CRC area within each image.
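
The per-level attention step described above can be illustrated with a minimal, dependency-free sketch. It is not the TransRCNN implementation: it assumes a single attention head with identity query/key/value projections, treats each pyramid level as a short list of feature vectors, and concatenates the attended levels along the token axis, which is only one possible reading of "concatenated". The names `self_attention`, `augment_pyramid`, and the toy feature values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention (single head, identity projections).

    tokens: list of equal-length feature vectors, one per spatial position.
    Returns (attended_tokens, attention_weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    attended = [
        [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return attended, weights


def augment_pyramid(pyramid):
    """Run one attention pathway per pyramid level, then concatenate the results.

    Assumption: levels are joined along the token axis in P2..P5 order.
    """
    combined = []
    for level in ("P2", "P3", "P4", "P5"):
        attended, _ = self_attention(pyramid[level])
        combined.extend(attended)
    return combined


# Toy pyramid: coarser levels hold fewer positions (hypothetical values).
toy_pyramid = {
    "P2": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
    "P3": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "P4": [[1.0, 0.0], [0.0, 1.0]],
    "P5": [[1.0, 1.0]],
}
features_for_rpn = augment_pyramid(toy_pyramid)
```

In a real detector the projections would be learned and the levels would be 2D feature maps; the sketch only shows how every position in a level attends to every other position in that level before the results are combined.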

Results: On the test set, TransRCNN achieved 86.0% recall and 73.5% mean average precision (mAP). This exceeded the benchmark Faster-RCNN detector (82.8% recall), yielding an additional 21 true-positive CRCs. The improvement can be attributed to the embedded transformer attention, which enables better feature learning throughout training and, in particular, highlights the foreground proposal regions so that more true positives are detected while false alarms are avoided.
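
For readers unfamiliar with how recall is computed for bounding-box detectors, the following is a minimal sketch. The abstract does not state the matching rule, so the intersection-over-union (IoU) threshold of 0.5 and the greedy one-to-one matching are assumptions, and the function names and example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth, predictions, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a prediction.

    Each prediction may match at most one ground-truth box (greedy matching).
    The 0.5 threshold is an assumed convention, not taken from the abstract.
    """
    if not ground_truth:
        return 0.0
    used = set()
    true_positives = 0
    for gt_box in ground_truth:
        best_index, best_score = None, iou_threshold
        for index, pred_box in enumerate(predictions):
            if index in used:
                continue
            score = iou(gt_box, pred_box)
            if score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            used.add(best_index)
            true_positives += 1
    return true_positives / len(ground_truth)


# Hypothetical frame: two annotated lesions, one found and one missed.
annotated = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(1, 1, 10, 10), (50, 50, 60, 60)]
frame_recall = recall(annotated, detected)
```

Under this definition, each additional true positive raises recall by one over the number of annotated lesions, which is how a gain in true-positive count translates into a gain in recall percentage.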

Conclusion: We have developed a deep learning algorithm that accurately detects CRC in colonoscopy. An attention-augmented AI platform may aid diagnostic decision-making and improve the diagnostic yield of cancer screening.


Image Analysis, Image Processing, Lesion Detectability


IM/TH- Image Analysis (Single Modality or Multi-Modality): Computer-aided decision support systems (detection, diagnosis, risk prediction, staging, treatment response assessment/monitoring, prognosis prediction)
