
Repairing the neutral set in codon evolutionary models

Updated: Sep 29, 2022

Hannah Verdonk (1), Sergei L. Kosakovsky Pond, PhD (1), and Jody Hey, PhD (2)

1. Institute for Genomics and Evolutionary Medicine, Department of Biology, Temple University, Philadelphia, Pennsylvania, USA
2. Center for Computational Genetics and Genomics, Department of Biology, Temple University, Philadelphia, Pennsylvania, USA

Most models of codon evolution rely on the ratio of the nonsynonymous substitution rate (dN) to the (putatively neutral) synonymous substitution rate (dS). Neutral substitutions provide a convenient control in studies of evolutionary adaptation and enable statistically valid hypothesis testing. However, synonymous substitutions are frequently under selective pressure to streamline gene translation and expression. We develop a Multiclass Synonymous Substitution (MSS) model, which partitions synonymous substitutions into neutral (Sn) and selected (Ss) classes inferred from sequence data. We apply our MSS model to 12,061 Drosophila gene alignments from 12 species, and compare our model's estimate of dN/dSn to the Muse-Gaut 94 (MG94) model's estimate of dN/dS. We find that the ratio of the dN/dSn estimate to the dN/dS estimate has a mean value of 0.896 (95% CI 0.888 to 0.903). A mean ratio less than 1 is consistent with dS < dSn, because dS is estimated from both neutral and selected synonymous substitutions. We also identify a putatively neutral codon set via factor analysis and compare it to the neutral set identified with MSS. We expect synonymous selection to alter codon frequency, leading to covariation with gene expression. In the factor analysis, codons whose frequencies covary less with gene expression are assigned to the neutral set. We find that 7,252 genes (60%) are better fit by the MSS neutral set than by the factor analysis set (mean AIC benefit 1.18, IQR 3.25). We explore the relationship between model preference and gene expression (FPKM) in our dataset, which spans five orders of magnitude (FPKM minimum 0, maximum 118,053, mean 1,161). We reasoned that our model would outperform alternative models on highly expressed genes under synonymous selection for translational efficiency. Unexpectedly, model preference is not strongly correlated with gene expression, indicating that synonymous selection may act on a complex set of traits. Our results suggest that we can repair the neutral set in codon evolutionary models.
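
The two comparisons in the abstract can be illustrated with a small Python sketch. It is not the authors' code: the function names, the rate values, the log-likelihoods, and the parameter counts are hypothetical. It also assumes that both models share the same dN estimate (so the ratio reduces to dS/dSn) and that "AIC benefit" means the alternative model's AIC minus the MSS model's AIC.

```python
def omega_ratio(dn, ds_neutral, ds_pooled):
    """Ratio of dN/dSn (MSS-style) to dN/dS (MG94-style).

    Assumes the same dN under both models, so the result
    simplifies algebraically to ds_pooled / ds_neutral.
    """
    return (dn / ds_neutral) / (dn / ds_pooled)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical rates: the pooled dS is lower than the neutral-only dSn
# because selected synonymous substitutions pull the pooled rate down.
ratio = omega_ratio(dn=0.05, ds_neutral=0.50, ds_pooled=0.45)

# Hypothetical fits for one gene, with equal parameter counts.
aic_mss = aic(log_likelihood=-1000.0, n_params=10)
aic_alt = aic(log_likelihood=-1001.0, n_params=10)

# Assumed sign convention: positive benefit favors the MSS neutral set.
aic_benefit = aic_alt - aic_mss

print(f"ratio of dN/dSn to dN/dS: {ratio:.3f}")
print(f"AIC benefit for MSS: {aic_benefit:.2f}")
```

With these made-up inputs the ratio falls below 1 whenever the pooled dS is smaller than dSn, which is the pattern the abstract reports on average across genes.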
