Chong Li1, Marc Jan Bonder2, Sabriya Syed3, Matthew Jensen4, Human Genome Structural Variation Consortium (HGSVC), HGSVC Functional Analysis Working Group, Mark Gerstein4, Michael C. Zody5, Mark J.P. Chaisson6, Michael E Talkowski7,8,9,10, Tobias Marschall11,12, Jan O Korbel13, Evan E Eichler14,15, Charles Lee3, and Xinghua Shi1
1Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
2German Cancer Research Center, Division of Computational Genomics and Systems Genetics, Heidelberg, Germany
3The Jackson Laboratory for Genomics Medicine, Farmington, CT, USA
4Dept of Molecular Biochemistry and Biophysics, Yale University, New Haven, CT, USA
5New York Genome Center, New York, NY, USA
6Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
7Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
8Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
9Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
10Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
11Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
12Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
13European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
14Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
15Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Poster # 59
Decoding the relationship between variants in the human genome and the resulting phenotype is central to functional genome annotation. This includes understanding how the spatial arrangement of DNA in the cell nucleus affects gene function and regulation. Hi-C makes it possible to explore 3D chromatin structure and to pinpoint topologically associating domains (TADs) and TAD boundaries. TADs are stable genomic regions, insulated by proteins such as CCCTC-binding factor (CTCF), that constrain chromatin interactions between genes and their regulatory elements. Previous research indicates that structural variants (SVs) can alter TADs, subsequently affecting gene expression and disease onset. This study outlines a Hi-C analysis pipeline that produces a comprehensive catalog of TADs and TAD boundaries across 44 human genomes from five superpopulations in the 1000 Genomes Project. The Hi-C dataset comprises 38 billion sequenced read pairs and 23 billion contacts between genomic regions. From this, we built a high-resolution contact map and a catalog of 14,612 TAD boundaries, 18,972 TADs, and 6,819 sub-TADs. Of these, 2,121 TADs and 172 sub-TADs have not been reported previously. To investigate the impact of SVs on TADs, we identified 430 SVs that significantly impact TAD boundaries (TAD-SVs). Of these, 39 intersect with previously identified SV-eQTLs (i.e., SVs that are significantly associated with changes in gene expression profiles), and 19 coincide with known SV-sQTLs (i.e., SVs that are significantly associated with changes in gene splicing profiles). Using the ENCODE candidate cis-Regulatory Elements (cCREs) database, we found that 71 of the 430 TAD-SVs overlap with cis-regulatory markers, including promoter and enhancer signatures (e.g., DNase, H3K4me3, and CTCF).
In conclusion, our research provides an in-depth view of the 3D genome structure of 44 human genomes and offers a resource for understanding the impact of SVs on 3D chromatin structure and gene regulation mechanisms.
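As a rough illustration of the kind of interval comparison involved in relating SVs to TAD boundaries, the following Python sketch flags SVs whose coordinates overlap a boundary region. It is not the pipeline used in this study: the abstract does not specify how a significant impact on a boundary was determined, and the function names, the `flank` parameter, the half-open 0-based coordinate convention, and all coordinates below are hypothetical.

```python
from collections import defaultdict


def intervals_overlap(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def svs_at_boundaries(svs, boundaries, flank=0):
    """Return IDs of SVs that overlap any TAD boundary on the same chromosome.

    svs        -- iterable of (sv_id, chrom, start, end) tuples (hypothetical layout)
    boundaries -- iterable of (chrom, start, end) tuples (hypothetical layout)
    flank      -- assumed padding, in bp, added to both sides of each boundary
    """
    # Group boundaries by chromosome so each SV is only compared
    # against boundaries on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in boundaries:
        by_chrom[chrom].append((start - flank, end + flank))

    hits = []
    for sv_id, chrom, start, end in svs:
        for b_start, b_end in by_chrom.get(chrom, []):
            if intervals_overlap(start, end, b_start, b_end):
                hits.append(sv_id)
                break  # one overlapping boundary is enough to flag the SV
    return hits


# Toy data only; these are not coordinates from the study.
toy_boundaries = [("chr1", 100_000, 110_000), ("chr2", 500_000, 510_000)]
toy_svs = [
    ("SV1", "chr1", 105_000, 106_000),  # inside a chr1 boundary
    ("SV2", "chr1", 200_000, 201_000),  # far from any boundary
    ("SV3", "chr2", 490_000, 500_000),  # ends exactly where a boundary starts
    ("SV4", "chr3", 100_000, 110_000),  # chromosome with no boundaries
]

print(svs_at_boundaries(toy_svs, toy_boundaries))
print(svs_at_boundaries(toy_svs, toy_boundaries, flank=10_000))
```

The same overlap test could in principle be reused to intersect a set of flagged SVs with other interval annotations, such as regulatory element coordinates. For catalogs of the size reported here, a sorted or tree-based interval index would likely be preferable to the linear scan shown.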