4. Multi-scale integration.
CAS provides tools to link molecular interactions (RNA--protein binding) with population-level dynamics (mutation, selection, drift) and ecological constraints (resource availability, temperature), generating a coherent picture across scales.
5. Predictive formalism.
By embedding these dynamics into replicator--mutator equations and agent-based models, CAS yields quantitative, testable predictions about co-selection signatures, covariance patterns, and evolutionary stability conditions.
Thus, CAS transforms the RNA--protein question from a paradox of historical priority into a problem of self-organizing synchronization. Within this framework, genetic codes and proteins are not sequential inventions but coevolving components of a complex system that stabilizes through emergent attractors. This reframing lays the foundation for a mathematical and simulation-based approach to the origin of molecular codes.
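As a minimal sketch of the replicator-mutator formalism named in point 5, the code below integrates dx_i/dt = sum_j x_j f_j Q_ji - phi x_i, where phi = sum_j f_j x_j is the mean fitness. The function names, the constant fitness values, and the two-type mutation matrix are illustrative assumptions, not parameters taken from this framework; fitness is held constant here for simplicity, whereas a fuller model would likely make it depend on the frequencies.

```python
def replicator_mutator_step(x, f, Q, dt):
    """One forward-Euler step of dx_i/dt = sum_j x_j f_j Q[j][i] - phi * x_i.

    x: type frequencies (sum to 1)
    f: fitness of each type (assumed constant in this sketch)
    Q: row-stochastic mutation matrix, Q[j][i] = P(offspring of j is type i)
    """
    n = len(x)
    phi = sum(f[j] * x[j] for j in range(n))  # mean population fitness
    new_x = []
    for i in range(n):
        inflow = sum(x[j] * f[j] * Q[j][i] for j in range(n))
        new_x.append(x[i] + dt * (inflow - phi * x[i]))
    total = sum(new_x)  # renormalize to guard against floating-point drift
    return [v / total for v in new_x]


def simulate(x0, f, Q, dt=0.01, steps=5000):
    """Iterate the Euler step from initial frequencies x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_mutator_step(x, f, Q, dt)
    return x


# Hypothetical two-type system: type 1 is fitter, mutation rate 5% each way.
f = [1.0, 1.5]
Q = [[0.95, 0.05],
     [0.05, 0.95]]
x_final = simulate([0.9, 0.1], f, Q)
print(x_final)
```

In this toy setting the fitter type comes to dominate while mutation keeps the other type present at a nonzero frequency, which is the kind of stable mixed state that an attractor-based account would look for.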
II. Background: Empirical and Theoretical Context
A. Evidence from proteomic dipeptide analyses (JMB study)
A recent study published in the Journal of Molecular Biology, "Tracing the Origin of the Genetic Code and Thermostability to Dipeptide Sequence in Proteomes," provides critical empirical evidence relevant to the coevolution of genetic codes and proteins. The study analyzed large-scale proteomic data to examine correlations between dipeptide frequencies, codon assignments, and protein thermostability across diverse organisms.
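To make the underlying quantity concrete, the following sketch computes relative dipeptide frequencies from overlapping residue pairs in a protein sequence. It is an illustration under stated assumptions, not the study's actual pipeline: the function name and the toy sequence are hypothetical, and a real analysis would pool counts across whole proteomes and handle ambiguous residues.

```python
from collections import Counter


def dipeptide_frequencies(sequence):
    """Return relative frequencies of overlapping dipeptides in one sequence.

    A sequence of length n contributes n - 1 overlapping pairs.
    Returns an empty dict for sequences shorter than two residues.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in counts.items()}


# Toy sequence (hypothetical, not drawn from any proteome).
freqs = dipeptide_frequencies("MKKLA")
for pair, freq in sorted(freqs.items()):
    print(pair, freq)
```

Frequency vectors of this kind, computed per organism, are the sort of input one could then compare against codon assignments or a thermostability measure.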
Several findings stand out:
1. Dipeptide patterns as molecular fossils.
The study identified consistent patterns in dipeptide usage that reflect deep evolutionary constraints. These patterns appear to encode signals of early co-selection between codon usage and amino acid pairing, leaving a lasting imprint in modern proteomes.
2. Thermostability correlations.
According to the study, dipeptide frequencies correlate strongly with protein thermostability, which suggests that codon assignments were optimized in tandem with stability requirements early in evolution. This links the origin of the genetic code not merely to informational capacity but also to structural robustness in fluctuating environments.