
Poster #49 - Busra Coskun

  • vitod24
  • Oct 20

Benchmarking Statistical Dimension Reduction Frameworks for Integrative Multi-Omics Analysis with Missing Modalities


Busra Coskun; University of Pennsylvania
Qi Long, PhD; University of Pennsylvania
Konstantinos Tsingas, MS; University of Pennsylvania


Multi-omics studies, which integrate data from distinct biological scales such as transcriptomics, proteomics, and metabolomics, have the potential to reveal mechanisms that underlie disease. However, these studies often contain missing information: one or more molecular layers may not be collected for a given patient or sample, which limits both integration and inference. Standard complete-case analyses reduce sample size and statistical power, while naive imputation methods can produce spurious associations that do not reflect true biology. We compare a broad class of statistical dimension reduction frameworks that estimate shared low-dimensional representations of patients across heterogeneous data types, including generalized factor analysis, joint matrix factorization, and canonical correlation analysis models. We consider three strategies for handling incomplete data: (1) complete-case analysis, which discards patients with any missing values; (2) ad-hoc imputation before applying standard integrative models; and (3) methods designed to tolerate missingness, which implicitly leverage the available data. We benchmark these approaches systematically on incomplete multi-omics data from the Alzheimer's Disease Neuroimaging Initiative. We first use a bootstrap analysis to assess whether the latent factors learned under each strategy align with clinical diagnosis and staging. To evaluate the robustness and interpretability of the latent structures, we then test whether inferred latent clusters are associated with missingness indicators, which reveals whether a model captures biological or clinical patterns or is instead driven by the pattern of missing data. Overall, this benchmarking study clarifies the trade-offs among integrative methods for multi-omics studies with missingness and demonstrates how different strategies for handling incomplete multi-omics data can influence patient stratification and disease prediction.
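
To make the comparison concrete, here is a minimal sketch of two of the three strategies (complete-case analysis and ad-hoc mean imputation), followed by a check of whether latent clusters track the missingness indicator. It is an illustration under stated assumptions, not the authors' pipeline: the data are simulated, a plain SVD stands in for the factor models named in the abstract, clusters come from a median split of the first latent score, and all function names (`simulate`, `first_factor`, `complete_case_scores`, `mean_imputed_scores`, `missingness_association`) are hypothetical. The third strategy, methods designed to tolerate missingness, is not implemented here.

```python
import math
import numpy as np

def simulate(n=200, p1=30, p2=20, missing_rate=0.3, seed=0):
    """Simulate two modalities driven by one shared latent factor (synthetic data)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 1))                              # shared latent factor
    x1 = z @ rng.normal(size=(1, p1)) + rng.normal(size=(n, p1))
    x2 = z @ rng.normal(size=(1, p2)) + rng.normal(size=(n, p2))
    missing = rng.random(n) < missing_rate                   # modality 2 not collected
    x2[missing] = np.nan
    return x1, x2, missing

def first_factor(x):
    """First latent score per sample from an SVD of the column-centered matrix."""
    xc = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(xc, full_matrices=False)
    return u[:, 0] * s[0]

def complete_case_scores(x1, x2, missing):
    """Strategy 1: drop every sample that lacks modality 2."""
    keep = ~missing
    return first_factor(np.hstack([x1[keep], x2[keep]])), keep

def mean_imputed_scores(x1, x2):
    """Strategy 2: fill the missing modality with column means, then factorize."""
    col_means = np.nanmean(x2, axis=0)
    x2_filled = np.where(np.isnan(x2), col_means, x2)
    return first_factor(np.hstack([x1, x2_filled]))

def missingness_association(scores, missing):
    """Chi-square test (1 df) of a median-split cluster against the missingness flag."""
    cluster = scores > np.median(scores)
    table = np.array(
        [[np.sum(cluster & missing), np.sum(cluster & ~missing)],
         [np.sum(~cluster & missing), np.sum(~cluster & ~missing)]],
        dtype=float,
    )
    expected = (table.sum(axis=1, keepdims=True)
                * table.sum(axis=0, keepdims=True) / table.sum())
    chi2 = float(((table - expected) ** 2 / expected).sum())
    p_value = math.erfc(math.sqrt(chi2 / 2.0))               # upper tail of chi-square, 1 df
    return chi2, p_value

if __name__ == "__main__":
    x1, x2, missing = simulate()
    cc_scores, keep = complete_case_scores(x1, x2, missing)
    imp_scores = mean_imputed_scores(x1, x2)
    print("samples kept by complete-case analysis:", int(keep.sum()), "of", len(missing))
    print("cluster vs. missingness (chi2, p):", missingness_association(imp_scores, missing))
```

In this sketch a small p-value would suggest that the latent clusters partly reflect which samples lack a modality rather than the underlying signal, which is the failure mode the benchmark is designed to detect.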

 
 
 
