Translational Bioinformatics for Heterogenous Longitudinal Data in Pre-Clinical Models of Neuro...
Updated: Sep 29, 2022
Hunter A. Gaudio¹,², Viveknarayanan Padmanabhan³, Gerard Laurent⁴, Ryan W. Morgan¹,², Julia Slovis¹,², Frank Mi³, Helen Shi³, Luiz Eduardo Silva³, Wesley B. Baker²,⁴, Fuchiang Tsui³, Todd J. Kilbaugh¹,², Tiffany S. Ko¹,²
¹ Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, PA
² The Resuscitation Science Center, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA
³ Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA
⁴ Division of Neurology, Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA
The CHOP Resuscitation Science Center's Large Animal Laboratory has established numerous high-fidelity pediatric models of neurological injury (e.g., cardiac arrest, traumatic brain injury). These models enable controlled and reproducible characterization, but present a secondary challenge: optimizing the processing, visualization, and sharing of multi-modal physiologic datasets.
File System: Working with the Arcus Library Science Team, we developed a file architecture with standardized directory structures, cohort and subject ID syntax, and data type storage across experimental models.
Data Collection: In the last 5 years, measurements of high-resolution physiologic waveforms, imaging, biological samples, and associated genomic, metabolomic, proteomic, and histologic data were acquired in 600+ animals. Physiologic time-series data is collected via ADInstruments PowerLab acquisition hardware and recorded with LabChart software. Demographic and episodic data is collected using metadata-driven EDC software (REDCap). Stored biological samples are categorized using Freezerworks. Experimentally generated values (e.g., mitochondrial respiration) are saved to the endpoints directory.
Data Processing and Quality Review: A modular MATLAB program performs raw data processing and time-synchronization and exports a subsampled spreadsheet. Monitoring manifests, generated from REDCap user input, aid in data quality review, which is required before data can be pushed to the interim and endpoints directories.
Data Warehousing & Visualization: A Python ETL pipeline pulls data from REDCap and the interim data directory and pushes it to a PostgreSQL database. This code is packaged into a deployable Docker container and version-controlled using GitHub. A container registry, Red Hat Quay, builds and deploys the Docker container on a Kubernetes cluster and schedules daily deployment, scaling, and management of the container. A custom Qlik Sense dashboard queries the database for interactive data visualization and download.
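The extract, transform, and load steps of such a pipeline can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the column names (`subject_id`, `time_s`, `map_mmhg`), the table name `physio`, the subject IDs, and the sample values are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so that the example runs without a server. It is not the pipeline described above, only one way a subsampled spreadsheet export might be loaded idempotently on a daily schedule.

```python
import csv
import io
import sqlite3

# Hypothetical subsampled export; the real column layout is not given in the abstract.
SAMPLE_CSV = """subject_id,time_s,map_mmhg
SUBJ001,0,62.5
SUBJ001,10,61.0
SUBJ002,0,58.2
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Cast fields to typed tuples and skip rows with missing values."""
    records = []
    for row in rows:
        if not all(row.values()):
            continue  # incomplete row: leave it for manual quality review
        records.append(
            (row["subject_id"], float(row["time_s"]), float(row["map_mmhg"]))
        )
    return records


def load(conn, records):
    """Upsert records so that repeated (e.g., daily) runs do not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS physio ("
        "subject_id TEXT, time_s REAL, map_mmhg REAL, "
        "PRIMARY KEY (subject_id, time_s))"
    )
    # SQLite syntax; PostgreSQL would use INSERT ... ON CONFLICT instead.
    conn.executemany("INSERT OR REPLACE INTO physio VALUES (?, ?, ?)", records)
    conn.commit()


# SQLite in-memory database used as a stand-in for the PostgreSQL warehouse.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SAMPLE_CSV)))
n_rows = conn.execute("SELECT COUNT(*) FROM physio").fetchone()[0]
```

Keying the table on subject and timestamp means a scheduled re-run over the same interim files overwrites existing rows rather than appending duplicates, which is one reasonable design choice for a job that is redeployed daily.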
Advanced statistical tools are being applied to this data to identify better predictive markers of neurological injury. Development of support for emerging diagnostic (sequencing, metabolomics, proteomics) and imaging (advanced imaging, histopathology, echocardiography) data products is ongoing.