Genetic Heterogeneity of Four Deep Learning-derived MCI/AD Dimensions via Genome-wide Tiling ...
Updated: Sep 29
Jiong Chen1,2,3, Junhao Wen, PhD1,2, Zhijian Yang1,2, Yuhan Cui1,2, Jingxuan Bao4, Brian N Lee2, Guray Erus, PhD1,2, Sarah Wait Zaranek, PhD5, Alexander Wait Zaranek, PhD5, Yong Fan, PhD1,2, Andrew J. Saykin, MS, PsyD6, Paul M. Thompson, PhD7, Li Shen, PhD4, Haochang Shou, PhD1,8, Ilya M. Nasrallah, MD, PhD1,2, Christos Davatzikos, PhD1,2,9,10
1 Center for Biomedical Image Computing and Analytics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
2 Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
3 Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA
4 Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, USA
5 Curii Corporation, Somerville, MA, USA
6 Center for Neuroimaging, Department of Radiology and Imaging Sciences, and the Indiana Alzheimer's Disease Research Center, Indiana University School of Medicine, Indianapolis, USA
7 Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
8 Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
9 For the Alzheimer's Disease Neuroimaging Initiative
10 For the AI4AD consortium
The heterogeneity of Alzheimer's disease (AD) manifests in many facets, including clinical symptoms, neuroanatomy, brain function and networks, and genetic underpinnings. We previously dissected the neuroanatomical heterogeneity of mild cognitive impairment (MCI)/AD patients into four dimensions using a deep learning semi-supervised clustering method termed Smile-GAN: the P1-dimension captures normal brain anatomy, the P2-dimension captures medial temporal-spared mild diffuse atrophy, the P3-dimension captures focal medial temporal lobe atrophy, and the P4-dimension captures advanced atrophy. In this study, we investigated which genetic factors might influence the four neuroanatomical dimensions using genome-wide tiling data, a novel representation that partitions the genome into shorter sequences (tiles) of 250 base pairs. Genome-wide tiling data were analyzed for 1,481 individuals in the Alzheimer's Disease Neuroimaging Initiative. A linear regression model was fit for each tile variant using the Smile-GAN dimensional scores as phenotypes. Age, sex, DLICV, and the first five genetic principal components (PCs) were included as covariates to adjust for potential confounding. A likelihood-ratio test comparing models with and without the tile variant was used to derive p-values. We applied the genome-wide significance threshold of -log10(P-value) > 7.3 (equivalent to P < 5 × 10^-8) to identify significant associations between tile variants and Smile-GAN dimensions. Seven unique tiles [tile number(mapped gene)] were associated with the Smile-GAN dimensions: 9553646(APOE), 9553678(APOC), 9553680(APOC downstream variant), 9553662(APOC upstream variant), 5461366(CPQ), 1543336(STAT4), and 8703531(AC109462.1). Tiles 5461366(CPQ) and 8703531(AC109462.1) were novel, with no previously reported association with AD-related traits; CPQ has previously been associated with other clinical traits such as type II diabetes. Tile 1543336(STAT4) was associated with PHF-tau measurement.
The other tiles mapped to the well-established AD-related genes APOC and APOE. Our findings demonstrate the applicability of genome-wide tiling data in discovering novel genetic underpinnings of the neuroanatomical heterogeneity of AD. Further research avenues include understanding variant interactions within tiles and applying machine learning predictive models for disease diagnosis and prognosis.
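
The per-tile association test described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline. It assumes a tile variant is coded as a 0/1/2 count per individual (a hypothetical encoding), uses simulated data, and includes only age and sex as covariates for brevity. All function and variable names are illustrative. For a Gaussian linear model, the likelihood-ratio statistic for adding one variant term is n·ln(RSS_null / RSS_full), which is compared against a chi-square distribution with 1 degree of freedom.

```python
import math
import random

GENOME_WIDE = 7.3  # threshold on -log10(p) quoted in the abstract


def ols_rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    n, k = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    beta = [A[c][k] / A[c][c] for c in range(k)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def neglog10_p(y, covariates, variant):
    """-log10 p-value of the likelihood-ratio test for adding one variant column."""
    n = len(y)
    null_X = [[1.0] + list(row) for row in covariates]       # intercept + covariates
    full_X = [row + [g] for row, g in zip(null_X, variant)]  # ... + tile variant
    rss_null, rss_full = ols_rss(null_X, y), ols_rss(full_X, y)
    stat = max(0.0, n * math.log(rss_null / rss_full))  # Gaussian LRT statistic
    p = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function, 1 df
    return -math.log10(p) if p > 0 else float("inf")


# Simulated cohort (hypothetical values, for illustration only).
random.seed(0)
n = 500
age = [random.uniform(55, 90) for _ in range(n)]
sex = [float(random.random() < 0.5) for _ in range(n)]
covariates = list(zip(age, sex))
causal = [float(sum(random.random() < 0.3 for _ in range(2))) for _ in range(n)]
unrelated = [float(sum(random.random() < 0.3 for _ in range(2))) for _ in range(n)]
# Dimension score driven by the causal variant and age, plus Gaussian noise.
score = [1.0 * g - 0.02 * a + random.gauss(0, 1) for g, a in zip(causal, age)]

hit = neglog10_p(score, covariates, causal)
miss = neglog10_p(score, covariates, unrelated)
print(f"associated variant: {hit:.1f}  unrelated variant: {miss:.1f}")
print("significant:", hit > GENOME_WIDE, miss > GENOME_WIDE)
```

In a real analysis the same test would be repeated for every tile variant and each of the four dimension scores, with the full covariate set (age, sex, DLICV, and the first five genetic PCs) in both the null and the full model. A statistics library would normally replace the hand-written least-squares solver.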