P2151 - Algorithmic Identification of Treatment-Emergent Adverse Events From Clinical Notes Using Large Language Models: A Pilot Study in Inflammatory Bowel Disease
Anna L. Silverman, MD1, Madhumita Sushil, PhD2, Balu Bhasuran, PhD2, Dana Ludwig, MD2, James Buchanan, PharmD2, Rebecca Racz, PharmD3, Mahalakshmi Parakala, BS4, Samer El-Kamary, MD, MPH3, Ohenewaa Ahima, MD3, Artur Belov, PhD3, Lauren Choi, PharmD3, Monisha Billings, DDS, MPH, PhD3, Yan Li, PhD3, Nadia Habal, MD3, Qi Liu, PhD3, Jawahar Tiwari, PhD3, Atul Butte, MD, PhD2, Vivek Rudrapatna, MD, PhD2 1Mayo Clinic, Scottsdale, AZ; 2UCSF, San Francisco, CA; 3FDA, Silver Spring, MD; 4University of California Berkeley, Berkeley, CA
Introduction: Outpatient clinical notes are a rich source of information regarding drug safety. However, data in these notes are currently underutilized for pharmacovigilance due to methodological limitations in text mining. Large language models like BERT have shown progress in a range of natural language processing tasks but have not yet been evaluated on adverse event detection.
Methods: We adapted a new clinical language model, UCSF BERT, to identify serious adverse events (SAEs) occurring after treatment with a steroid-sparing immunosuppressant for inflammatory bowel disease (IBD). We compared this model to other language models that have previously been applied to AE detection.
Results: We annotated 928 outpatient IBD notes corresponding to 928 individual IBD patients for all SAE-associated hospitalizations occurring after treatment with a steroid-sparing immunosuppressant (Table 1). These notes contained 703 SAEs in total, the most common of which was failure of intended efficacy (Figure 1). Out of 8 candidate models, UCSF BERT achieved the highest performance on identifying drug-SAE pairs from this corpus (accuracy 88-92%, macro F1 61-68%), with 5-10% greater accuracy than previously published models.
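The gap between the reported accuracy and macro F1 is typical of imbalanced classification, where most candidate drug-SAE pairs are negative. The following is a minimal sketch of how the two metrics are computed, using invented toy labels rather than data from this study; the label names `sae` and `no_sae` and the helper functions are hypothetical and are not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count equally."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy, invented labels (NOT study data): 8 negative pairs, 2 positive pairs.
y_true = ["no_sae"] * 8 + ["sae"] * 2
# A hypothetical classifier that misses one of the two positive pairs.
y_pred = ["no_sae"] * 9 + ["sae"]

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
print(f"accuracy={toy_accuracy:.3f}, macro F1={toy_macro_f1:.3f}")
```

In this toy case a single missed positive barely moves accuracy but pulls the F1 of the rare class, and therefore the macro average, down noticeably, which is why both metrics are usually reported together.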
Discussion: Large language models like UCSF BERT achieve superior accuracy on the challenging task of SAE detection from clinical notes compared to prior methods. Future work is needed to adapt this methodology to a wider range of clinical contexts, as well as improve model performance and evaluation using multi-center data. If successful, these models could complement existing pharmacovigilance methods that rely on spontaneous reporting and structured data in support of more accurate assessments of safety signals.
Figure 1. Network graph of SAEs by medication class. Line width indicates the strength of association by frequency. Node size is proportional to the number of exposures to each medication in our corpus. SAE colors indicate which medication(s) they were associated with. An interactive version of this figure can be found at https://ibd-ade.streamlit.app/
Disclosures:
Anna Silverman indicated no relevant financial relationships.
Madhumita Sushil indicated no relevant financial relationships.