Poster #91 - Cong Liu(1)

vitod24
Oct 20, 2025

Prioritizing NICU Patients for Genetics Referral and Rapid Exome/Genome Sequencing via EHR-Based Machine Learning Models

Cong Liu, PhD; Department of Pediatrics; Division of Genetics and Genomics; Boston Children's Hospital

Rapid exome/genome sequencing (rES/rGS) can critically inform care in the neonatal intensive care unit (NICU) but is resource-intensive and underutilized. We investigated whether machine learning (ML) models applied to electronic health records (EHRs) can (A) triage NICU patients for genetics referral and (B) prioritize candidates for rapid sequencing. We used the Boston Children's Hospital EHR (2014-2022) to curate concept tiers indicative of genetic evaluation/sequencing and assembled NICU visit-level cohorts. Structured and unstructured data sources included demographics, diagnoses, procedures, medications, vitals, clinical notes, labs, and services. Univariate filtering identified proximal features, which were manually reviewed. Random forest classifiers with class weighting and training-set feature selection were trained for three prediction tasks: (1) genetics referral vs. controls, (2) rES/rGS cohort vs. controls, and (3) rES/rGS vs. genetics referral cohort. Ablation studies assessed index time, feature set composition, and generalizability. Following exclusions, ~3,000 unique patients were analyzed. Given class imbalance, we report precision-recall AUC (PR-AUC). Performance was high for genetics referral triage (PR-AUC 0.944) and moderate for direct rES/rGS triage (PR-AUC 0.800), while discriminating rES/rGS from genetics referrals was more challenging (PR-AUC 0.374). Top-ranking features included congenital malformation codes, ICU/discharge documentation, subspecialty consultations, and select labs/procedures. Temporal ablation showed that predictive signal saturated at ~48 h post-admission and that the models could anticipate test ordering ~48 h in advance. Restricting features to ICD-derived Phecodes modestly reduced performance but preserved discriminative ability.
Our results demonstrate that ML models leveraging routinely collected EHR data can identify NICU patients likely to benefit from genetics referral or rapid sequencing, achieving strong discrimination in early hospitalization windows. The findings also highlight the sensitivity of model performance to proximal feature inclusion and index-time definition, underscoring the importance of rigorous study design and cautious interpretation of ML-driven clinical prioritization.
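Because the abstract reports PR-AUC rather than ROC-AUC, a small worked example may help show how that metric behaves under class imbalance. The sketch below computes average precision, one common step-wise estimator of the area under the precision-recall curve. The abstract does not say which estimator was used, so this choice is an assumption, as are the function name `average_precision` and the toy labels and scores, which are invented for illustration and are not study data. Scores are assumed to be distinct (no tie handling).

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    Hypothetical helper for illustration; assumes binary labels (0/1)
    and distinct scores.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            # Precision at this rank, weighted by the recall gained (1 / n_pos).
            ap += (true_pos / rank) / n_pos
    return ap


# Toy, made-up data: 2 positives among 10 patients (20% prevalence).
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)

print(f"average precision: {ap:.3f}")              # (1/1 + 2/3) / 2
print(f"chance-level baseline: {prevalence:.3f}")  # positive-class prevalence
```

A useful property when reading PR-AUC values is that a random ranking scores roughly the positive-class prevalence, not 0.5 as with ROC-AUC. A value therefore has to be judged against the prevalence in the specific cohort comparison it comes from.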

MidAtlantic Bioinformatics Conference

Friday, November 7, 2025
