Poster #72 - Ziyan Zhang
- vitod24
- Oct 20
Scalable Computational Approaches for Genome-Wide Epistatic Variance Component Approximation
Ziyan Zhang, BS, Carnegie Mellon University; Martin Zhang, PhD, Carnegie Mellon University; Richard Border, PhD, Carnegie Mellon University
Epistatic variance, which arises from interactions between genes across loci, is a poorly characterized component of broad-sense heritability for many complex human traits. Accurate estimation of epistatic variance can help address missing heritability and uncover higher-order mechanisms underlying complex traits. However, exact genome-wide estimation of epistatic variance remains computationally intractable, because the number of pairwise interactions (and corresponding latent variables) grows quadratically with the number of markers examined. Following previous efforts, we begin by formulating the pairwise epistatic covariance structure in terms of W := (XXᵀ) ∘ (XXᵀ), where X denotes the standardized n-by-m matrix of n individuals' m genotypes and ∘ denotes the Hadamard product. Computing W exactly has cubic time complexity when n ~ m. Further, unlike the standard genetic relationship matrix in additive models, the action of W as a linear operator is not directly available because of the Hadamard product, which prevents the application of existing randomized or iterative methods. Here, we develop quadratic-time epistatic variance estimation via a stochastic linear operator u ↦ Wu. We first rewrite Wu as diag(XXᵀ D_u XXᵀ), where D_u is the diagonal matrix with u on its diagonal, and then approximate this diagonal using the Hutchinson diagonal estimator. This allows us to avoid forming W explicitly and to achieve quadratic time complexity with a small number of random draws. Using this approach, we implement quadratic-time approximations of the method-of-moments and maximum likelihood estimators. We demonstrate via numerical experiments that these estimators are unbiased, and we describe their asymptotic behavior. Finally, we discuss our results in the context of outstanding challenges to scalable epistasis quantification.
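The stochastic operator described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It relies on the identity Wu = diag(K D_u K) with K = XXᵀ, and on the Hutchinson diagonal estimate diag(A) ≈ (1/k) Σ zᵢ ∘ (A zᵢ). The function names, the `num_probes` parameter, and the choice of Rademacher (±1) probe vectors are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def apply_W_exact(X, u):
    """Reference computation: form W = (X X^T) ∘ (X X^T) explicitly, then multiply.

    Costs O(n^2 m) time and O(n^2) memory, so it is only suitable for small checks.
    """
    K = X @ X.T
    return (K * K) @ u


def apply_W_hutchinson(X, u, num_probes=64, rng=None):
    """Stochastic approximation of W u that never forms W or X X^T.

    Uses W u = diag(K D_u K) with K = X X^T, and estimates the diagonal as the
    average of z ∘ (A z) over random probes z, where A = K D_u K.
    Rademacher probes are an assumption of this sketch.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # One Rademacher probe per column.
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))
    # K Z, computed with two thin products instead of forming K.
    KZ = X @ (X.T @ Z)
    # A Z = K (u ∘ (K Z)); broadcasting scales row i by u[i], i.e. applies D_u.
    AZ = X @ (X.T @ (u[:, None] * KZ))
    # Average z ∘ (A z) over the probes to estimate diag(A) = W u.
    return (Z * AZ).mean(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 300))  # stand-in for standardized genotypes
    u = rng.standard_normal(200)
    approx = apply_W_hutchinson(X, u, num_probes=256, rng=1)
    exact = apply_W_exact(X, u)
    print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

Each call costs O(nmk) for k probes, which is quadratic when n ~ m and k is small. The estimate is unbiased for Wu, and its variance shrinks as `num_probes` grows.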

