Erica Duh, MD (1); Robert Nisbet, PhD (2); Amelie Tiritilli, MD (1); David L. Cheung, MD (1); Christopher Rombaoa, MD (3); Mary Kate Roccato, MD (4); William Karnes, MD, FACG (1)
Affiliations: (1) UC Irvine, Orange, CA; (2) UC Irvine, Irvine, CA; (3) Sutter West Bay Medical Group, San Francisco, CA; (4) Cooper Health Gastroenterology, Camden, NJ
Introduction: Timing the first colonoscopy for average-risk individuals by age alone has resulted in a 30% reduction in colorectal cancer (CRC) mortality among those above the screening age, but CRC is rising alarmingly among those below it. As long as risk factors other than age are ignored, the timing of cost-effective preventive screening colonoscopy will remain flawed. We hypothesized that decision tree-based machine learning could be used to develop a personalized risk model for precancerous colorectal lesions, with the potential to optimize the timing of first colonoscopy.
Methods: Outcome data from first colonoscopies in the University of California, Irvine Colonoscopy Quality Database (UCICQD) between 2012 and 2023 were combined with pre-colonoscopy data from the electronic health record (EHR). Colonoscopy outcome was defined as the presence or absence of one or more neoplastic (Type 1) lesions (adenoma, sessile serrated polyp, hyperplastic polyp >1 cm, or carcinoma). Patients were excluded if EHR or UCICQD data indicated inflammatory bowel disease, a positive FIT or DNA fecal test, a family history of colorectal cancer, a genetic syndrome, or a prior history of colonoscopy, bowel surgery, or colonic neoplasm. After these exclusions, and with the analysis restricted to first-time colonoscopies and EHR data that predated the colonoscopy, 3,994 patients/colonoscopies were available for analysis. Ten-fold cross-validation was applied to three models (Tree Ensemble, Gradient Boosted Tree, and XGBoosted Tree) using the open-source Konstanz Information Miner (KNIME).
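As a rough illustration of the ten-fold cross-validation step, the sketch below partitions 3,994 records into ten folds in plain Python. It is not the KNIME workflow used in the study: the function name `k_fold_indices`, the random seed, and the unstratified shuffle are assumptions made for illustration, and the abstract does not state how the folds were actually drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper: a plain shuffled split, with no stratification
    by outcome.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


# 3,994 is the cohort size reported above; everything else is illustrative.
folds = list(k_fold_indices(3994, k=10))
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

Each record is held out exactly once, so every model is scored on data it did not see during training in that fold.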
Results: An ensemble of the three models using just 137 modeling variables predicted the presence of Type 1 colonic lesions with an overall accuracy of 67% (sensitivity = 79%; specificity = 51%) and an AUC of 0.7. The model increased the average number of projected Type 1 lesions per colonoscopy from 1.39 to 1.787 when including positive predictors, a 27.7% increase.
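For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a confusion matrix, and one common way to combine three model outputs. The counts are hypothetical and are not study data. Averaging predicted probabilities with a 0.5 threshold is an assumption, since the abstract does not say how the three models were combined.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # all correct calls / all patients
        "sensitivity": tp / (tp + fn),     # lesions correctly flagged
        "specificity": tn / (tn + fp),     # lesion-free correctly cleared
    }


def average_vote(probabilities, threshold=0.5):
    """Assumed combination rule: mean of model probabilities vs. a threshold."""
    mean_probability = sum(probabilities) / len(probabilities)
    return mean_probability >= threshold


# Hypothetical counts for 100 patients (NOT taken from the study).
example_metrics = classification_metrics(tp=40, fp=20, tn=30, fn=10)

# Hypothetical per-model probabilities of a Type 1 lesion for one patient.
example_call = average_vote([0.7, 0.4, 0.6])
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always falls between the two, as the reported 67% does between 79% and 51%.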
Discussion: The resulting model performs well enough in this population to begin prospective validation as a screening tool to determine the optimal starting time for average-risk screening colonoscopy by incorporating variables in addition to age.
Figure: Flow chart of the variables fed into our final ensemble of algorithms to create the polyp prediction model.
Disclosures:
Erica Duh indicated no relevant financial relationships.
Robert Nisbet indicated no relevant financial relationships.
Amelie Tiritilli indicated no relevant financial relationships.
David Cheung indicated no relevant financial relationships.
Christopher Rombaoa indicated no relevant financial relationships.
Mary Kate Roccato indicated no relevant financial relationships.
William Karnes: Docbot – Consultant, Owner/Ownership Interest.
Erica Duh, MD, Robert Nisbet, PhD, Amelie Tiritilli, MD, David L. Cheung, MD, Christopher Rombaoa, MD, Mary Kate Roccato, MD, William Karnes, MD, FACG. P3177 - A Prediction Algorithm for Neoplasia in First Time Screening Colonoscopy, ACG 2023 Annual Scientific Meeting Abstracts. Vancouver, BC, Canada: American College of Gastroenterology.