Poster #41 - Shupeng Luxu
- vitod24
- Oct 20
GeneGrad: Geometric Gradient-Based Biomarker Discovery in Single-Cell Transcriptomics
Shupeng Luxu, MS candidate, Department of Biostatistics, Harvard T.H. Chan School of Public Health; Rong Ma, PhD, Department of Biostatistics, Harvard T.H. Chan School of Public Health
Single-cell technologies have enabled the characterization of gene expression at cellular resolution, providing critical insights into both static cellular heterogeneity and dynamic biological processes such as cell differentiation and disease progression. Current approaches to cell type annotation rely heavily on clustering resolution and known reference marker genes, and may therefore overlook biomarkers that capture subtler differences, in particular those that distinguish transient cell states within existing cell types and are associated with developmental or disease processes. Although several "cluster-free" biomarker identification methods have been proposed, they do not fully exploit the information encoded in the smooth local geometry of the cell-state manifold. Here, we introduce GeneGrad, a novel computational framework that identifies key genes by estimating gene-specific geometric expression gradients: the directions of maximal expression increase within each cell's local neighborhood, defined with respect to the intrinsic geometry of the high-dimensional cell-state manifold. The resulting gene-gradient vector fields can be projected into two-dimensional space alongside the cells, enabling the discovery of important geometric patterns such as flow divergence, local heterogeneity, and alignment with developmental trajectories. This analysis facilitates the identification of important genetic programs driving cell-state transitions. We first validate our methodology on synthetic data, demonstrating high accuracy in both gradient estimation and direction recovery. We then apply GeneGrad to developmental single-cell datasets of hematopoietic progenitor cells (HPCs), identifying markers predictive of their subsequent divergence into ten lineages. We further extend the framework beyond single-cell transcriptomics to additional modalities, including ATAC-seq, highlighting its utility for uncovering biologically meaningful signals in complex cellular processes.
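To make the idea of a per-cell expression gradient concrete, here is a minimal sketch. It is not the GeneGrad estimator, which the abstract does not specify. It assumes cells are already embedded in a low-dimensional space (for example PCA scores) and uses plain Euclidean nearest neighbors there as a crude stand-in for the manifold's intrinsic geometry. The gradient of one gene is then fitted by local least squares. The function name `local_expression_gradients` and the parameter `k` are hypothetical.

```python
import numpy as np


def local_expression_gradients(coords, expr, k=15):
    """Illustrative per-cell gradient of one gene via local least squares.

    coords: (n_cells, d) embedding of cells (assumed, e.g. PCA scores).
    expr:   (n_cells,) expression values of a single gene.
    Returns an (n_cells, d) array; row i approximates the direction of
    steepest expression increase around cell i.
    """
    n, d = coords.shape
    # Pairwise squared Euclidean distances (fine for small n only).
    sq = np.sum(coords ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * coords @ coords.T
    # Exclude each cell from its own neighbor list.
    np.fill_diagonal(dist2, np.inf)
    neighbors = np.argsort(dist2, axis=1)[:, :k]

    grads = np.empty((n, d))
    for i in range(n):
        nbrs = neighbors[i]
        # Displacements to neighbors and the matching expression changes.
        dX = coords[nbrs] - coords[i]
        dy = expr[nbrs] - expr[i]
        # First-order model: dy is approximately dX @ gradient.
        g, *_ = np.linalg.lstsq(dX, dy, rcond=None)
        grads[i] = g
    return grads


# Synthetic check: expression is exactly linear in the embedding,
# so every local gradient should match the known direction.
rng = np.random.default_rng(0)
coords = rng.normal(size=(200, 2))
true_direction = np.array([1.5, -0.5])
expr = coords @ true_direction
grads = local_expression_gradients(coords, expr, k=15)
```

On real data the neighborhoods would more plausibly come from a graph that respects the manifold, and the fit would need to handle noise and sparsity. The sketch only shows how a gradient vector field over cells can arise from local regression.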

