Quantification of 3′ UTR isoform expression from scRNA-seq reveals substantial changes in differenti
Updated: Sep 29, 2022
Mervin M. Fansler (1,2), Gang Zhen, PhD (2), and Christine Mayr, MD, PhD (1,2). (1) Tri-Institutional Training Program in Computational Biology and Medicine, Weill-Cornell Graduate College, New York, NY 10021, USA (2) Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY
Most human and mouse genes have multiple 3′-end cleavage sites available during mRNA processing, resulting in alternative 3′ UTR isoforms. Many single-cell RNA sequencing (scRNA-seq) libraries are compatible with quantifying 3′ UTR isoform expression, but most processing pipelines discard this information when summarizing to gene-level counts. We developed a reusable pipeline, called scUTRquant (https://github.com/Mayrlab/scUTRquant), that measures gene and 3′ UTR isoform expression from scRNA-seq data. scUTRquant-derived gene and 3′ UTR isoform counts were validated against standard methods, demonstrating their accuracy. 3′ UTR isoform quantification was substantially more reproducible than with previous methods. scUTRquant incorporates human and mouse atlases of high-confidence 3′-end cleavage sites at single-nucleotide resolution to simplify expression comparisons across datasets. Analysis of 120 mouse cell types revealed that, during differentiation, genes either change their expression or change their 3′ UTR isoform usage. Applying the scUTRquant pipeline to a human cortical organoid model of myotonic dystrophy revealed hundreds of significant cell-type-specific changes in isoform usage when comparing the disease model to control samples. As with our observations in normal differentiation, these differences had little overlap with differentially expressed genes, indicating an additional layer of dysregulation that goes uncharacterized when quantifying only gene-level counts.
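To illustrate why gene-level summaries can hide changes in 3′ UTR isoform usage, the following minimal Python sketch sums per-isoform counts to gene-level counts and computes each isoform's fraction of its gene's total. It is not scUTRquant's implementation: the function name `isoform_usage`, the isoform and gene identifiers, and all counts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def isoform_usage(counts, isoform_to_gene):
    """Summarize per-isoform counts to gene-level counts and usage fractions.

    counts: dict mapping isoform ID -> count (hypothetical toy data).
    isoform_to_gene: dict mapping isoform ID -> gene ID.
    Returns (gene_counts, usage), where usage[isoform] is the isoform's
    share of its gene's total, or None if the gene has no counts.
    """
    gene_counts = defaultdict(int)
    for isoform, n in counts.items():
        gene_counts[isoform_to_gene[isoform]] += n

    usage = {}
    for isoform, n in counts.items():
        total = gene_counts[isoform_to_gene[isoform]]
        usage[isoform] = n / total if total > 0 else None
    return dict(gene_counts), usage


# Hypothetical isoform-to-gene mapping and counts for two cell types.
mapping = {
    "GeneX_short": "GeneX",
    "GeneX_long": "GeneX",
    "GeneY_single": "GeneY",
}
counts_a = {"GeneX_short": 30, "GeneX_long": 10, "GeneY_single": 5}
counts_b = {"GeneX_short": 10, "GeneX_long": 30, "GeneY_single": 5}

genes_a, usage_a = isoform_usage(counts_a, mapping)
genes_b, usage_b = isoform_usage(counts_b, mapping)

# Gene-level counts for GeneX are identical in both cell types,
# while the long 3' UTR isoform's share of those counts differs.
print(genes_a["GeneX"], genes_b["GeneX"])
print(usage_a["GeneX_long"], usage_b["GeneX_long"])
```

In this toy example the gene-level count for GeneX is the same in both cell types, so a gene-level comparison would report no difference, while the usage fractions show a shift from the short to the long isoform.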