Raseen Tariq, MBBS, Ankita Sethi, BA, Shivaram Poigai Arunachalam, PhD, DBA, Darrell S. Pardi, MD, MS, William A. Faubion, MD, Sahil Khanna, MBBS, MS; Mayo Clinic, Rochester, MN
Introduction: Recurrent Clostridioides difficile infection (rCDI) is common in patients with inflammatory bowel disease (IBD), but robust prediction models are lacking. We report a model that predicts the risk of rCDI in IBD patients using supervised machine learning (ML) techniques.
Methods: We conducted an ML study using electronic health record (EHR) data from adult patients with IBD and primary CDI from 2012 to 2021. Clinical parameters (features), including demographics, comorbidities, care setting, procedures, medications, and laboratory tests in the 3 months prior to first CDI, were used for model development and validation. CDI recurrence was defined as a positive CDI stool test and an antibiotic prescription within 8 weeks. The dataset was split 80-20 into training and validation sets. The model was an XGBoost classifier trained on the top 25 features selected by Recursive Feature Elimination (RFE). Model performance was evaluated using accuracy, the area under the receiver operating characteristic curve (AuROC), precision, sensitivity (recall), and specificity.
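As an aside on the AuROC metric named above: it can be read as the probability that a randomly chosen patient with recurrence receives a higher predicted score than a randomly chosen patient without recurrence. The following is a minimal sketch of that definition in plain Python. The function name `auroc` and the toy labels and scores are hypothetical illustrations; they are not the study's code or data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not study data): 1 = recurrence, 0 = no recurrence.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(f"AuROC = {auroc(toy_labels, toy_scores):.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for illustration; library implementations typically use a rank-based computation instead.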
Results: Our cohort included 2573 patients with IBD and CDI; 54.4% were female, median age was 48 years, median BMI was 25.3, and 93% were white. Of those, 72.4% had ulcerative colitis and 27.1% had Crohn’s disease. At initial CDI diagnosis, 29.1% were hospitalized. Recurrent CDI within 8 weeks was identified in 655 patients (25.4%).
The top 25 features chosen by RFE to predict rCDI are illustrated in the figure. The XGBoost classifier trained on these features showed an overall accuracy of 80.05% and an AuROC of 0.80, with 76% sensitivity (recall), 84% specificity, and 83% precision. By class, the model demonstrated 78% precision and 84% recall for non-recurrence, and 83% precision and 76% recall for recurrence.
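The per-class figures above are all functions of a single 2x2 confusion matrix: recall for non-recurrence is the same quantity as specificity, and precision for non-recurrence is the negative predictive value. The sketch below illustrates this in plain Python. The counts are hypothetical, chosen only so that the arithmetic lands near the reported percentages under an assumed equal number of recurrence and non-recurrence cases; they are not the study's validation counts, and `binary_metrics` is an illustrative name.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (recurrence = positive)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # recall for recurrence
        "specificity": tn / (tn + fp),  # recall for non-recurrence
        "precision": tp / (tp + fp),    # precision for recurrence
        "npv": tn / (tn + fn),          # precision for non-recurrence
    }


# Hypothetical counts (assumption: 100 recurrence and 100 non-recurrence cases).
m = binary_metrics(tp=76, fp=16, tn=84, fn=24)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

Precision and negative predictive value depend on the proportion of recurrence cases in the evaluated set, so the same sensitivity and specificity would give different values for them at a different class balance.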
Discussion: We report a machine learning model that predicts rCDI in IBD patients from EHR data with an accuracy of approximately 80% and an AuROC of 0.80. Such a model could be used to stratify patients by risk of rCDI and to guide recurrence prevention strategies.
Figure: Top 25 features selected by Recursive Feature Elimination for predicting recurrent C. difficile infection in IBD patients
Disclosures:
Raseen Tariq indicated no relevant financial relationships.
Ankita Sethi indicated no relevant financial relationships.
Shivaram Poigai Arunachalam indicated no relevant financial relationships.