Sen Yang, UT Southwestern Medical Center/Southern Methodist University; Shidan Wang, University of Texas Southwestern Medical Center; Yiqing Wang, Southern Methodist University; Dajiang J. Liu, Pennsylvania State University; Xiaowei Zhan, University of Texas Southwestern Medical Center
Supervised contrastive learning methods have proven effective at improving microbiome-based prediction models by integrating other omics data, and they have outperformed self-supervised contrastive learning in multi-omics integration. However, the current model handles only categorical covariates. To overcome this limitation and extend supervised contrastive learning to continuous covariates, we introduce a novel framework, GCL-Omics. GCL-Omics consists of two components: a supervised contrastive learning model and a prediction head. By introducing a unified contrastive loss defined on similar and dissimilar data pairs, it combines self-supervised and supervised contrastive learning in a single objective, which makes it applicable to both categorical and continuous covariates. Through experiments on simulated and real data, we demonstrate that our model achieves superior performance in predicting multiple covariates. The GCL-Omics framework is also robust to the choice of prediction head, which keeps it flexible. Furthermore, the embedding learned in the representation domain forms distinct clusters for different covariate groups, which aids visualization. Together, these features make GCL-Omics a more versatile and effective approach to multi-omics integration.
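To make the idea of a contrastive loss over similar and dissimilar pairs concrete, the following is a minimal NumPy sketch. It is not the GCL-Omics loss, which the abstract does not specify; it is a generic supervised-contrastive-style objective written under stated assumptions. The function names `similar_pairs` and `pair_contrastive_loss`, the tolerance `tol` used to call two continuous covariate values "similar", and the `temperature` value are all hypothetical choices made for illustration.

```python
import numpy as np

def similar_pairs(y, categorical, tol=0.5):
    """Boolean matrix: True where samples i and j are treated as a similar pair.

    Assumption: categorical covariates are similar when equal; continuous
    covariates are similar when they differ by at most `tol`.
    """
    y = np.asarray(y)
    if categorical:
        sim = y[:, None] == y[None, :]
    else:
        sim = np.abs(y[:, None] - y[None, :]) <= tol
    np.fill_diagonal(sim, False)  # a sample is never paired with itself
    return sim

def pair_contrastive_loss(z, sim, temperature=0.5):
    """Supervised-contrastive-style loss driven only by the pair matrix `sim`."""
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # compare by cosine similarity
    n = len(z)
    logits = z @ z.T / temperature
    logits = np.where(np.eye(n, dtype=bool), -np.inf, logits)  # mask self-similarity
    # Row-wise log-softmax over all other samples, computed stably.
    m = logits.max(axis=1, keepdims=True)
    log_denom = m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    log_prob = logits - log_denom
    counts = sim.sum(axis=1)
    has_pos = counts > 0
    if not has_pos.any():
        raise ValueError("no similar pairs in this batch")
    # Average log-probability of the similar partners, per anchor that has any.
    pos_sum = np.where(sim, log_prob, 0.0).sum(axis=1)
    return float(-(pos_sum[has_pos] / counts[has_pos]).mean())

# Toy embeddings: samples 0 and 1 point one way, samples 2 and 3 another.
z_demo = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
loss_categorical = pair_contrastive_loss(z_demo, similar_pairs([0, 0, 1, 1], categorical=True))
loss_continuous = pair_contrastive_loss(z_demo, similar_pairs([0.1, 0.2, 5.0, 5.1], categorical=False))
print(loss_categorical, loss_continuous)
```

Because the loss depends on the covariate only through the pair matrix, the same objective serves both covariate types; only the rule for declaring a pair similar changes. How GCL-Omics itself defines similar and dissimilar pairs may differ from the threshold rule assumed here.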