Poster #34 - Brydon Wall(1)
- vitod24
- Oct 20
Ancestry-Specific Pangenome Graphs Enable the Study of Population-Specific Genomic Variation and Structural Complexity
Brydon P. G. Wall, MS, Department of Biostatistics, SOPH, VCU
Bei Zhang, PhD, Department of Biostatistics, SOPH, VCU
My Nguyen, BS, Department of Biostatistics, SOPH, VCU
Stella Castro, School of Life Sciences and Sustainability, CHS, VCU; The Honors College, VCU
Shasmeen Azhar, School of Life Sciences and Sustainability, CHS, VCU; The Honors College, VCU
Jinze Liu, PhD, Department of Biostatistics, SOPH, VCU; Massey BISR
Mikhail G. Dozmorov, PhD, Department of Biostatistics, SOPH, VCU
Jinchuan Xing, PhD, Department of Genetics, Rutgers University; Human Genetic Institute of New Jersey, Rutgers University
Katarzyna M. Tyc, PhD, Department of Biostatistics, SOPH, VCU; Massey BISR
Using a single human reference genome in genomics studies overlooks ancestry-specific sequences and biases variant discovery. It limits our ability to study complex regions and unrepresented human sequences, and it ignores genetic variation that may be clinically relevant. Pangenome approaches narrow this gap by integrating diverse human sequences into a graph-based reference that better represents human diversity. We constructed ancestry-specific pangenome graphs for African (AFR) and Admixed American (AMR) populations using Human Pangenome Reference Consortium haplotypes from these populations together with GRCh38. While total sequence content remained comparable between the two graphs, the AFR graph captured substantially greater structural complexity than the AMR graph. This is consistent with the higher genetic diversity of AFR populations and their larger number of unique variants, in contrast to the distinct rearrangements shaped by admixture in AMR populations. Applying these ancestry-specific graphs to public whole-exome sequencing (WES) and whole-genome sequencing (WGS) data, we examine read mapping against graph-based references relative to a traditional linear reference genome. For each data type, we selected representative samples of European (EUR) and AFR ancestry and systematically compared mapping outcomes across linear and graph-based references. We find that the "recovered" reads, those that map only to the graphs, are enriched for repetitive DNA (rDNA arrays, centromeric satellites, young LINE/SVA retrotransposons, and microsatellites). While many of these reads correspond to gaps in the linear reference genome, we are now examining these features in the context of population specificity to reveal ancestry-associated genome rearrangements, sequence insertions, and gene gains/losses that could contribute to differential disease susceptibility and outcomes.
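To make the notion of "recovered" reads concrete, the comparison can be sketched as a set difference over read identifiers. This is a minimal illustration only, not the pipeline used in this work: the function names (`recovered_reads`, `summarize_mapping`) and the toy read IDs are hypothetical, and it assumes the IDs of reads mapped to each reference have already been extracted from the aligner output.

```python
def recovered_reads(linear_mapped, graph_mapped):
    """Return IDs of reads that map to the graph but not to the linear reference."""
    return set(graph_mapped) - set(linear_mapped)


def summarize_mapping(all_reads, linear_mapped, graph_mapped):
    """Compute mapping rates for both references and list the recovered reads."""
    total = len(set(all_reads))
    return {
        "linear_rate": len(set(linear_mapped)) / total,
        "graph_rate": len(set(graph_mapped)) / total,
        "recovered": sorted(recovered_reads(linear_mapped, graph_mapped)),
    }


# Toy data (hypothetical read IDs, for illustration only).
all_reads = ["r1", "r2", "r3", "r4", "r5"]
linear_mapped = ["r1", "r2", "r3"]
graph_mapped = ["r1", "r2", "r3", "r4"]

summary = summarize_mapping(all_reads, linear_mapped, graph_mapped)
print(summary)
```

In this toy case only `r4` counts as recovered, because it maps to the graph but not to the linear reference. In a real analysis, the recovered set would then be annotated against repeat classes to test for enrichment.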
In summary, our work uses ancestry-specific pangenomes as a powerful framework to uncover overlooked genomic variation, reduce systematic biases in downstream analyses including variant calling, and enhance the resolution of human genome analyses. This work lays the foundation for advancing the precision medicine paradigm by ensuring that the genomic tools guiding clinical insights are inclusive of global human diversity.

