Poster #98 Beoung Hun Lee

vitod24
Oct 20, 2025

PipeVar: Reproducible pipeline to prioritize pathogenic variants in undiagnosed, rare disease patients for short-read sequencing and long-read sequencing.

Beoung Hun Lee1,2, Elizabeth Bhoj1,3, Rebecca Ahrens-Nicklas1,3, Kai Wang1,2

1. University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
2. Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
3. Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA

Accurate genetic diagnosis of rare disease depends on detecting and prioritizing the full spectrum of variant classes, from single-nucleotide variants (SNVs) to structural variants (SVs) and repeat expansions. Recent advances in long-read sequencing (LRS) have improved SV calling and the detection of short tandem repeat (STR) expansions that were previously missed by short-read sequencing (SRS). LRS also enables phasing of SNVs and SVs, improving resolution of compound heterozygosity and aiding gene discovery. In practice, laboratories now use SRS to capture SNVs and indels and, when no causal variant is found, turn to LRS to call SVs, quantify STRs, and resolve compound heterozygous alleles. An average human genome carries 3-4 million SNVs and thousands of SVs, which complicates causal variant identification, particularly without phenotype-guided prioritization. LRS pipelines are less well developed than SRS pipelines, and existing tools analyze only subsets of these variant classes and lack phenotype-driven ranking, leaving a gap between raw variant calls and clinical diagnosis. Here, we present PipeVar, a pipeline that prioritizes potentially disease-causing variants from either long-read or short-read whole-genome sequencing (WGS) data together with phenotype information. PipeVar addresses this gap by integrating multi-class variant calling with automated, Human Phenotype Ontology (HPO)-driven prioritization in a unified, portable workflow. Built with Nextflow and containerized via Singularity, PipeVar ensures reproducibility and easy deployment. It combines multiple tools to analyze SNVs, indels, SVs, and repeat expansions, and accepts either BAM or VCF files along with HPO IDs as input. PipeVar facilitates genetic diagnosis by providing lists of SNVs, indels, and SVs ranked by pathogenicity and by the relatedness of the affected gene to the patient's phenotype, and by quantifying STRs for known STR diseases. Using PipeVar, we identified a pathogenic SV and a pathogenic SNV in trans in IARS2 in a previously undiagnosed patient.
PipeVar unifies SNV, indel, SV, and STR prioritization in a single, portable workflow for both LRS and SRS data, greatly lowering the barrier to entry for long-read analysis and accelerating rare-disease diagnosis.
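
To make the idea of ranking variants by both pathogenicity and phenotype relatedness concrete, here is a minimal Python sketch. It is not PipeVar's implementation: the abstract does not state how the two signals are combined, so the weighted mean, the `Candidate` fields, the `rank_candidates` function, the score ranges, and the placeholder gene names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate variant (hypothetical structure, not PipeVar's)."""
    variant_id: str       # illustrative label only
    gene: str             # gene the variant affects
    pathogenicity: float  # assumed pathogenicity score scaled to [0, 1]


def rank_candidates(candidates, gene_phenotype_score, weight=0.5):
    """Sort candidates by a weighted mean of two scores, best first.

    gene_phenotype_score maps a gene to an assumed [0, 1] score for how
    well it matches the patient's HPO terms; genes absent from the map
    are treated as unrelated (score 0.0).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")

    def combined(candidate):
        phenotype = gene_phenotype_score.get(candidate.gene, 0.0)
        return weight * candidate.pathogenicity + (1.0 - weight) * phenotype

    return sorted(candidates, key=combined, reverse=True)


# Toy input with placeholder genes and made-up scores.
candidates = [
    Candidate("v1", "GENE_A", 0.9),
    Candidate("v2", "GENE_B", 0.6),
    Candidate("v3", "GENE_C", 0.8),
]
phenotype_scores = {"GENE_A": 0.1, "GENE_B": 0.9}

ranked = rank_candidates(candidates, phenotype_scores)
ranked_ids = [c.variant_id for c in ranked]
print(ranked_ids)
```

In this toy input, the variant with the highest pathogenicity score (`v1`) does not rank first, because its gene matches the phenotype poorly. This is the behavior that phenotype-guided prioritization is meant to provide.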

MidAtlantic Bioinformatics Conference

Friday November 7, 2025
