Structuring information via an immune-focused ontology enables the construction of a high-quality kn
Updated: Sep 29, 2022
Van Q. Truong1-5, Joseph D. Romano2, Allison R. Greenplate4-6, Scott M. Dudek2,3, E. John Wherry4-6, Marylyn D. Ritchie2,3,7
1 Graduate Group in Genomics & Computational Biology, Perelman School of Medicine, University of Pennsylvania
2 Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania
3 Biomedical and Translational Informatics Laboratory, Perelman School of Medicine, University of Pennsylvania
4 Immune Health Project, Perelman School of Medicine, University of Pennsylvania
5 Institute for Immunology, Perelman School of Medicine, University of Pennsylvania
6 Department of Pharmacology & Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania
7 Department of Genetics, Perelman School of Medicine, University of Pennsylvania
Autoimmune diseases are a highly heterogeneous family of diseases that arise when immune dysfunction causes systemic attacks against the body's own tissues. Approximately 4.5% of people worldwide are affected by at least one autoimmune disorder, and the global incidence of autoimmune disease is rising while our comprehensive understanding remains static. Recent discoveries of linked genes provide a clue to commonalities among co-occurring disorders, but tremendous work remains to uncover new links between disorders with suspected associations. The study of autoimmunity is further complicated by the complexity of the immune system, a hierarchical network of interacting biomolecular components spanning multiple levels of biology (genes, proteins, pathways, immune cells, and more). The disorganization and separation of immunological knowledge across disparate repositories is a major hurdle to the discovery of possible biomarkers and disease mechanisms. Biomedical informatics is therefore well-suited to unify heterogeneous data into an information network known as a knowledge graph (KG), in which entities are stored as nodes and edges represent the relationships connecting them. KGs offer a powerful and efficient solution for connecting heterogeneous knowledge, but recent biomedical implementations were constructed through non-standardized processes, contained outdated information, lacked structure, and/or did not capture autoimmunity well. These challenges led us to i) integrate and structure immune-related information into a queryable knowledge graph, and ii) validate an ontology-based model that turns implicit understanding into explicit reasoning. We integrated several sources of curated immunological and biological knowledge encompassing genes, proteins, pathways, cell types, and diseases. The quality of the KG was assessed according to established metrics: accuracy, trustworthiness, timeliness, availability, completeness, and consistency.
Together, our immune-focused ontology and knowledge graph enable the discovery of links between nodes, which represent pre-existing, unfilled, or new knowledge. Our work provides a path forward to explore data beyond single data types and embrace a meta-dimensional framework for modeling strategies and applications in autoimmunity.
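
To make the node-and-edge representation concrete, the following is a minimal sketch of a typed knowledge graph and a link-discovery query, written in plain Python. It is an illustration only and not the implementation described in this work: the `KnowledgeGraph` class, the `associated_with` relation, and all node names (`disease_X`, `GENE_A`, and so on) are hypothetical placeholders and do not correspond to real data.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy knowledge graph: typed nodes joined by labeled, directed edges."""

    def __init__(self):
        self.node_types = {}            # node name -> type (e.g. "gene", "disease")
        self.edges = defaultdict(set)   # (source, relation) -> set of target nodes

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, source, relation, target):
        # Reject edges that mention nodes the graph does not know about.
        for node in (source, target):
            if node not in self.node_types:
                raise KeyError(f"unknown node: {node}")
        self.edges[(source, relation)].add(target)

    def neighbors(self, source, relation):
        # Return a copy so callers cannot mutate the stored edge set.
        return set(self.edges.get((source, relation), set()))


def shared_genes(kg, disease_a, disease_b, relation="associated_with"):
    """Genes linked to both diseases: one simple way a KG can surface a link."""
    return kg.neighbors(disease_a, relation) & kg.neighbors(disease_b, relation)


# Hypothetical placeholder data, for illustration only.
kg = KnowledgeGraph()
for disease in ("disease_X", "disease_Y", "disease_Z"):
    kg.add_node(disease, "disease")
for gene in ("GENE_A", "GENE_B", "GENE_C"):
    kg.add_node(gene, "gene")

kg.add_edge("disease_X", "associated_with", "GENE_A")
kg.add_edge("disease_X", "associated_with", "GENE_B")
kg.add_edge("disease_Y", "associated_with", "GENE_B")
kg.add_edge("disease_Y", "associated_with", "GENE_C")

print(shared_genes(kg, "disease_X", "disease_Y"))
```

In this sketch, a gene shared by two disease nodes stands in for the kind of link between disorders that a queryable KG can expose. A production KG would more likely use a graph database and an ontology to constrain which node and edge types are valid.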