A Python program for automated census of protein domains for comparative proteomics
Santosh Kumar H. S., Chandrashekhar C. R., Lavan S. Patil, and Sriraksha Prakash
Poster Not on Display
Analysis of protein sequences from different organisms often reveals cases of exon/ORF shuffling in a genome. This shuffling results in the fusion of domains within proteins, either in the same genome or in the genome of another organism; such fusions are termed Rosetta Stone sequences or events. These sequences help to link disparate proteins, describing local and global relationships among proteomes. The functional role of a protein is determined mainly by domain-domain interactions. Automated comparison of the proteomes of different organisms can be done in Python, which is flexible and robust. Given its wide use in machine learning and other AI-based bioinformatics initiatives, we developed Python code for an automated census of domains in pneumonia-causing bacteria. Pneumonia, an inflammatory pulmonary condition, is caused by microbes belonging to different classes, and it is well established that the etiology of the disease can be attributed to the protein complement and the protein interaction network. The present study compared the genomes of Mycoplasma pneumoniae M129, Streptococcus pneumoniae, Klebsiella pneumoniae KPNH11, Legionella pneumophila, and Chlamydia pneumoniae. The code successfully identified the domains and the domain association (tethering) phenomenon, which allowed us to observe the versatility and abundance of protein domains across the sample genomes. Further, the most abundant genome-exclusive domains were used to construct a protein interaction network. Network analysis yielded hub proteins that were found to be potential drug targets when scanned against the Therapeutic Target Database, which attests to the effectiveness of the program.
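The census described above can be illustrated with a minimal sketch. It is not the authors' program: the input layout, the function names, and the toy domain labels are all hypothetical. Versatility is assumed here to mean the number of distinct partner domains a domain is tethered to within the same protein, and hubs are assumed to be the highest-degree nodes of the network; the abstract does not state either definition.

```python
from collections import Counter, defaultdict

# Hypothetical input: genome -> protein ID -> list of domain names.
# Toy values only; real input would come from a domain-annotation table.
annotations = {
    "genome_A": {"p1": ["D1", "D2"], "p2": ["D1", "D3"], "p3": ["D4"]},
    "genome_B": {"q1": ["D1", "D2"], "q2": ["D5"]},
}


def domain_census(proteins):
    """Return (abundance, versatility) for one genome.

    abundance: number of occurrences of each domain.
    versatility: number of distinct partner domains found in the same
    protein (an assumed working definition, not taken from the abstract).
    """
    abundance = Counter()
    partners = defaultdict(set)
    for domains in proteins.values():
        abundance.update(domains)
        unique = set(domains)
        for d in unique:
            partners[d] |= unique - {d}
    versatility = {d: len(partners[d]) for d in abundance}
    return abundance, versatility


def exclusive_domains(annotations):
    """Map each genome to the domains that occur in no other genome."""
    seen = {
        g: {d for doms in prots.values() for d in doms}
        for g, prots in annotations.items()
    }
    result = {}
    for g, doms in seen.items():
        others = set().union(*(s for h, s in seen.items() if h != g))
        result[g] = doms - others
    return result


def hub_nodes(edges, top=1):
    """Rank nodes of an undirected interaction network by degree.

    Degree is an assumed hub criterion; other centrality measures exist.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return [node for node, _ in degree.most_common(top)]
```

Calling `domain_census(annotations["genome_A"])` gives the per-genome counts, `exclusive_domains(annotations)` gives the genome-exclusive domains, and `hub_nodes` accepts any list of interacting protein pairs.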
The pipeline was further applied to additional genomes and was found to be effective in domain-based comparative genomics and rapid identification of potential drug targets.
Keywords: Python, Comparative Genomics, Versatility, Abundance, Protein Interaction Network, Drug Targets