
Poster #2 - Haedong Kim

  • vitod24
  • Oct 20
  • 2 min read

Exome-Wide Copy Number Variation in 142,357 Individuals from Autism Spectrum Families in the Simons SPARK Cohort


Haedong Kim,1,2 Grace Tzun-Wen Shaw,1 Jeffrey K. Ng,3 Timothy L. Mosbruger,1 Ramakrishnan Rajagopalan,1,2 Tychele N. Turner*,3,4 Tristan J. Hayeck*,1,2 (*Co-Senior Authors. 1. Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA. 2. Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA. 3. Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, US. 4. Intellectual and Developmental Disabilities Research Center, Washington University School of Medicine, St. Louis, MO.)


Rare and de novo copy number variants (CNVs) play a critical role in neurodevelopmental disorders such as autism spectrum disorder (ASD). Although whole-exome sequencing (WES) has become more accessible in clinical settings, CNV detection from short-read WES data faces challenges, including inherent biases, inconsistent results across callers, and substantial false-positive and false-negative rates, which necessitate labor-intensive manual curation. We developed a scalable CNV calling and scoring pipeline to detect CNVs and distinguish valid calls in large datasets. To provide both tools and resources, we applied our approach to the SFARI SPARK cohort, analyzing 142,357 individuals. First, we developed a novel CNV caller based on a fast kernel change point detection method. Then we employed a hybrid, partially Bayesian machine learning framework to train scoring models. We curated training datasets and generated modeling features from high-quality cross-platform data and from CNVs manually labeled by experts. Features were derived from multiple complementary sources to provide a comprehensive view of CNVs and to account for systematic effects on CNV detection reliability, including primary statistics from various read-depth signals, genomic context properties, and individual-level demographic and ancestral factors. We also provide additional CNV call sets from other established methods (i.e., XHMM, CoNIFER, cn.MOPS), with features generated and scored by our model, yielding a comprehensive CNV map of the SPARK cohort. Our results and models demonstrate strong performance (AUC ≈ 0.9995; F1 ≈ 0.9989) and improved CNV detection accuracy, providing a resource for ASD research and enabling seamless integration with existing clinical and research pipelines.
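To make the idea of kernel change point detection on read-depth data concrete, here is a minimal sketch. It is not the caller described in the abstract: it finds a single breakpoint by brute force, and the Gaussian kernel, the bandwidth `sigma`, the `min_size` limit, the function names, and the toy log2 depth ratios are all assumptions chosen for illustration.

```python
import math


def gaussian_gram(x, sigma):
    """Gram matrix of a Gaussian kernel over a 1-D signal (assumed kernel choice)."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in x] for a in x]


def segment_cost(gram, start, end):
    """Kernel within-segment scatter for the points in [start, end).

    cost = sum_i k(x_i, x_i) - (1 / n) * sum_{i,j} k(x_i, x_j)
    It is small when all points in the segment look alike under the kernel.
    """
    n = end - start
    diag = sum(gram[i][i] for i in range(start, end))
    total = sum(gram[i][j] for i in range(start, end) for j in range(start, end))
    return diag - total / n


def best_single_change_point(x, sigma=0.5, min_size=3):
    """Return the split index t that minimizes cost(0, t) + cost(t, n)."""
    gram = gaussian_gram(x, sigma)
    n = len(x)
    best_t, best_cost = None, float("inf")
    for t in range(min_size, n - min_size + 1):
        cost = segment_cost(gram, 0, t) + segment_cost(gram, t, n)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


# Toy per-exon log2 depth ratios (made-up values, not SPARK data):
# 12 exons near 0 (two copies) followed by 8 exons near -1 (one copy lost).
signal = [0.03, -0.05, 0.02, 0.00, -0.04, 0.06, -0.01, 0.04, -0.03, 0.01, 0.05, -0.02,
          -0.97, -1.04, -0.99, -1.02, -0.95, -1.03, -1.00, -0.98]

print(best_single_change_point(signal))
```

On this toy signal the selected split is index 12, the first exon of the simulated deletion. A practical caller would need to handle multiple breakpoints and far longer signals, which is where a fast formulation matters; this brute-force version recomputes each segment cost from the full Gram matrix and is only meant to show the objective being minimized.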

 
 
 

