Rare variants enhance the ability to identify associated phenotypes in disease networks
Rasika Venkatesh*1,2, Jakob Woerner*1,2, Vivek Sriram1,2, Dokyoon Kim2,3
1. Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
2. Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
3. Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
Poster # 14
Complex diseases exhibit genetic associations with multiple phenotypes, but the role of rare variants in these pleiotropic connections is not well understood. Disease-disease networks (DDNs) offer insight into the relationships between complex disorders through graphical representations where nodes represent disease phenotypes and edges represent phenotype-associated genes or variants that are shared across nodes. We hypothesize that integrating rare variant information in DDNs improves our understanding of cross-disease relationships compared to using common variants alone. We constructed single nucleotide polymorphism (SNP)-based common variant and gene-based rare variant DDNs for 93 diseases using phenome-wide association study (PheWAS) summary statistics from the UK Biobank. Common variants were defined as having a minor allele frequency (MAF) > 0.01, while rare variants had an MAF < 0.001. The common variant network was then supplemented with edges exclusively derived from the rare variant network, resulting in an augmented DDN. The intersection between common and rare variant edges made up 2.09% of the total edges of the augmented DDN, indicating that rare variants uncover cross-phenotype associations not captured by common variants. The augmentation of the common variant DDN led to a 22.7% increase in information, identifying 280 new associations. Egocentric networks focusing on circulatory, endocrine, and neoplasm phenotypes were derived from the augmented DDN. We identified rare-variant edges linking myeloproliferative disease with hypervolemia, as well as type 1 diabetes with mitral valve disease. These genetic connections, well supported in the literature, were not identifiable using data from common variants alone. These findings underscore the importance of including rare variants in addition to common variants in network analyses of polygenic diseases.
Future directions involve integrating additional phenotypes into the rare variant augmented DDN and employing a graph-based semi-supervised learning approach to evaluate its translational utility.
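The network construction and augmentation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the disease labels, SNP IDs, and gene names are made-up toy values, the function names `build_ddn` and `augment` are hypothetical, and the PheWAS significance filtering and MAF thresholds are assumed to have been applied upstream. An edge joins two diseases whenever they share at least one associated SNP (common variant network) or gene (rare variant network), and each edge of the combined network is labeled by the variant class that supports it.

```python
from itertools import combinations


def build_ddn(associations):
    """Map each disease pair to the features (SNPs or genes) the two diseases share."""
    edges = {}
    for a, b in combinations(sorted(associations), 2):
        shared = associations[a] & associations[b]
        if shared:  # draw an edge only when at least one feature is shared
            edges[(a, b)] = shared
    return edges


def augment(common_edges, rare_edges):
    """Label every edge of the union network as 'common', 'rare', or 'both'."""
    labels = {}
    for pair in set(common_edges) | set(rare_edges):
        if pair in common_edges and pair in rare_edges:
            labels[pair] = "both"
        elif pair in common_edges:
            labels[pair] = "common"
        else:
            labels[pair] = "rare"
    return labels


# Toy inputs (hypothetical): disease -> associated SNPs or genes.
common_assoc = {"D1": {"rs1", "rs2"}, "D2": {"rs2"}, "D3": {"rs3"}, "D4": {"rs3"}}
rare_assoc = {"D1": {"GENE_A"}, "D2": {"GENE_A", "GENE_B"}, "D3": {"GENE_B"}, "D4": set()}

common_ddn = build_ddn(common_assoc)
rare_ddn = build_ddn(rare_assoc)
augmented = augment(common_ddn, rare_ddn)

# Edges contributed only by rare variants, and the share of edges found by both.
rare_only = sorted(pair for pair, label in augmented.items() if label == "rare")
overlap_fraction = sum(label == "both" for label in augmented.values()) / len(augmented)
```

In this toy example the rare variant network contributes one edge (D2 to D3) that the common variant network lacks, and one of the three augmented edges is supported by both variant classes. The same bookkeeping, applied to real summary statistics, would yield the kind of overlap and new-association counts reported above.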