Mathematical Framework for RNA--Protein Coevolution

Evolutionary Codes as Complex Adaptive Systems: A Mathematical Framework for RNA--Protein Coevolution

Abstract

The long-standing debate over whether RNA or proteins emerged first has often been framed as a linear sequence of evolutionary innovations. Recent studies, such as the analysis of dipeptide motifs and thermostability in proteomes, suggest that the genetic code and proteins coevolved in a mutually dependent manner. Yet these findings remain largely descriptive and lack a unifying formalism. Here we introduce a Complex Adaptive Systems (CAS) mathematical framework to model the simultaneous coevolution of genetic codes and protein structures. By coupling replicator--mutator dynamics, genotype--phenotype mappings, and fitness functions with interdependent RNA--protein interactions, we demonstrate that stable co-adapted complexes arise naturally as emergent attractors, rather than as linear sequences of innovations. Analytical results reveal conditions for stability and bifurcation, while simulations highlight synchronization, Red Queen-like cycles, and co-selection patterns. This framework provides testable predictions for comparative genomics and experimental directed evolution, reframing the origin of molecular codes as a problem of self-organizing complexity.

Highlights

CAS model formalizes RNA--protein coevolution as feedback-driven dynamics

Bifurcation analysis reveals collapse, Red Queen cycles, and stable attractors

Simulations align with proteomic dipeptide correlations and thermostability data

Reframes RNA-world vs protein-first as synchronization, not linear precedence

Provides scalable theory linking molecular, ecological, and systems evolution

Background

The origin of the genetic code and its relationship to protein evolution remains one of the most enduring puzzles in molecular biology. Competing hypotheses---such as the RNA World, Protein World, and coevolution models---attempt to explain how informational molecules and catalytic structures emerged.

Recent work published in the Journal of Molecular Biology (e.g., "Tracing the Origin of the Genetic Code and Thermostability to Dipeptide Sequence in Proteomes") has provided compelling evidence that signatures of coevolution are embedded within proteomes. Specifically, correlations between dipeptide sequence patterns, thermostability, and codon usage point to a coevolutionary imprint linking genetic coding and protein structure. These findings highlight the simultaneous pressures that shaped both molecular codes and protein domains.

However, such analyses remain primarily descriptive and statistical. They identify correlations but do not explain why or how these interdependencies stabilize, nor do they formalize the mechanisms that synchronize mutational exploration, structural constraints, and adaptive pressures.

Our work builds directly on this foundation by introducing a mathematical CAS framework that explains how RNA and protein codes emerge not sequentially, but as mutually stabilizing attractor states within an evolving dynamical system.

Novelty and Significance Statement

Novelty. This study provides the first rigorous CAS-based mathematical model of RNA--protein coevolution, moving beyond descriptive correlations to formalize emergent synchronization. Unlike RNA-first or protein-first models, our approach demonstrates that both codes emerge simultaneously as stable attractors of coupled evolutionary dynamics.

Significance. By reframing the origin of genetic and protein codes as a problem of complex adaptive dynamics, we unify disparate empirical observations---proteomic motifs, genomic correlations, functional constraints---within a single predictive framework. This approach bridges molecular evolution, systems biology, and complexity science, offering testable hypotheses for comparative genomics, directed evolution experiments, and synthetic biology.

Executive Summary

Problem: Linear hypotheses (RNA-first vs protein-first) cannot explain how interdependent systems such as genetic codes and proteins could evolve without foresight.
Empirical basis: Recent proteomic analyses reveal dipeptide-level correlations linking genetic coding and protein stability, suggesting coevolution.
Approach: We develop a Complex Adaptive Systems (CAS) mathematical framework combining replicator--mutator dynamics, genotype--phenotype mapping, and interdependent fitness functions.
Results: The model produces emergent attractors representing stable RNA--protein complexes, synchronization of evolutionary trajectories, and Red Queen-like cycles.
Contribution: This framework explains how molecular codes coevolve simultaneously, provides predictions for genomic and experimental tests, and reframes molecular evolution as a problem of self-organizing complexity.

Outline

I. Introduction

A. Limitations of RNA-first and protein-first models.
B. The puzzle of synchronized molecular codes.
C. The promise of CAS for emergent coevolution.
II. Background: Empirical and Theoretical Context

A. Evidence from proteomic dipeptide analyses (JMB study).
B. Genetic code thermostability and protein structure correlations.
C. Current limitations: descriptive/statistical approaches without formal dynamics.
III. Theoretical Foundations

A. Principles of Complex Adaptive Systems.
B. Adaptive landscapes, punctuated equilibria, and coevolution theory.
C. Integration with molecular evolution.
IV. Mathematical Framework

A. Genotype--phenotype mapping (RNA motifs to protein domains).
B. Interdependent fitness functions with trade-offs.
C. Replicator--mutator dynamics for coupled populations.
D. Emergent attractors and bifurcations.
V. Results

A. Analytical: stability and bifurcations.
B. Simulation: synchronization and Red Queen cycles.
C. Empirical alignment with proteomic and genomic data.
VI. Discussion

A. Evolution as CAS: novelty and explanatory power.
B. Reconciling RNA-world and protein-first debates.
C. Trade-offs and eco-evolutionary analogues.
D. Implications for the origin of life and systems biology.
VII. Conclusion

A. Summary of contributions.
B. Path forward: empirical calibration and comparative studies.
C. Evolution reframed as CAS across scales.
VIII. References

I. Introduction

A. Limitations of RNA-first and protein-first models

The question of whether RNA or proteins emerged first has long structured debates about the origin of molecular evolution. The RNA World hypothesis argues that RNA came first, serving as both an information carrier and a catalyst before proteins assumed catalytic dominance. Conversely, protein-first models posit that short peptides and proto-proteins, generated through abiotic chemistry, provided early functional advantages and only later became coupled to genetic information systems. Both frameworks have shaped decades of theoretical and experimental research.

Despite their heuristic value, both RNA-first and protein-first models face serious conceptual and empirical limitations:

1. Asymmetry of explanatory scope.

RNA-first accounts rely on ribozymes as evidence of catalytic potential, but catalytic efficiencies of ribozymes are limited compared to proteins, raising doubts about whether an RNA-only system could sustain increasing complexity.
Protein-first scenarios highlight peptide thermostability and catalytic diversity, but lack a plausible mechanism for information inheritance without a coding template.

2. Problem of synchronization.

Both models assume sequential emergence, yet the functional interdependence of genetic codes and proteins suggests that partial systems would have limited adaptive value. For example, a rudimentary coding scheme without stable proteins, or proteins without templated replication, would be evolutionarily fragile.

3. Fossil and molecular record ambiguity.

No direct empirical evidence supports the existence of a purely RNA-based or protein-only biosphere. Instead, comparative genomics and proteomics reveal deep co-dependencies---such as ribosomal RNA--protein complexes---that appear to have coexisted from the earliest reconstructable stages of life.

4. Lack of formal dynamical modeling.

Existing narratives remain primarily descriptive, framing origin scenarios in historical terms rather than in dynamic, testable models. They often neglect the feedback processes by which RNA and protein structures could have co-adapted simultaneously under evolutionary pressures.

In short, both RNA-first and protein-first models reduce the complexity of molecular coevolution to a linear sequence, overlooking the possibility that the genetic code and proteins could have emerged in tandem through coupled dynamics. This motivates the need for a framework that captures interdependence, feedback, and emergent synchronization---precisely the domain of Complex Adaptive Systems (CAS).

B. The puzzle of synchronized molecular codes

At the heart of the origin-of-life debate lies a fundamental puzzle: the synchronization of molecular codes. Genetic information and proteins are not merely sequential innovations; they are mutually dependent systems. Genetic codes without proteins lack catalytic power and structural diversity, while proteins without codes lack a mechanism of inheritance and fidelity. Life as we know it depends on the simultaneous existence of both.

Several empirical observations underscore this puzzle:

Ribosome as an archetypal co-adapted complex.
Ribosomes consist of ribosomal RNA and proteins functioning together as an inseparable catalytic machine. Their deep evolutionary conservation suggests that RNA--protein partnerships were present from the earliest stages, rather than emerging in isolation.

Codon--amino acid correlations.
Comparative analyses of proteomes reveal non-random associations between codon assignments, dipeptide frequencies, and protein thermostability. Such patterns imply that coding rules and protein structures co-influenced each other, leaving a coevolutionary imprint.

Fragility of partial systems.
A primitive RNA coding system without stabilizing proteins would likely degrade rapidly, while nascent peptides without a genetic template would lack reproducibility. In both cases, unsynchronized evolution would fail to sustain adaptive complexity.

Paradox of mutual necessity.
This creates a "chicken-and-egg" problem at the molecular level: proteins are needed for the translation machinery, but translation machinery is needed to produce proteins. The apparent simultaneity of their emergence is paradoxical when framed within linear evolutionary narratives.

The puzzle, therefore, is not simply which came first, but how mutually interdependent codes and structures could have stabilized together in the absence of foresight. Resolving this requires a framework that can model feedback, co-dependence, and emergent attractors in evolutionary dynamics. Complex Adaptive Systems theory provides precisely this conceptual and mathematical toolkit.

C. The promise of CAS for emergent coevolution

Complex Adaptive Systems (CAS) theory offers a natural framework for addressing the puzzle of synchronized molecular codes. Unlike linear narratives that emphasize sequential emergence, CAS emphasizes interactions, feedback, and emergent order across multiple levels of organization. Within this view, RNA and proteins are not independent entities competing for primacy, but interdependent agents embedded in a dynamic system.

Several features of CAS are directly applicable to RNA--protein coevolution:

1. Feedback-driven dynamics.
CAS formalism naturally captures mutual dependence: RNA fitness depends on protein partners, and protein fitness depends on RNA templates. This creates feedback loops that can drive the simultaneous stabilization of both systems.

2. Emergent attractors.
Rather than a single linear pathway, CAS models describe adaptive landscapes with multiple attractor basins. RNA--protein complexes, such as ribosomes or proto-enzymes, can emerge as stable attractors, where mutual compatibility reinforces persistence.

3. Epistasis and pleiotropy.
The mapping from genotype to phenotype is rarely additive. CAS explicitly models non-linear interactions, enabling the exploration of how codon assignments, dipeptide frequencies, and structural motifs reinforce or constrain each other.

4. Multi-scale integration.
CAS provides tools to link molecular interactions (RNA--protein binding) with population-level dynamics (mutation, selection, drift) and ecological constraints (resource availability, temperature), generating a coherent picture across scales.

5. Predictive formalism.
By embedding these dynamics into replicator--mutator equations and agent-based models, CAS yields quantitative, testable predictions about co-selection signatures, covariance patterns, and evolutionary stability conditions.

Thus, CAS transforms the RNA--protein question from a paradox of historical priority into a problem of self-organizing synchronization. Within this framework, genetic codes and proteins are not sequential inventions but coevolving components of a complex system that stabilizes through emergent attractors. This reframing lays the foundation for a mathematical and simulation-based approach to the origin of molecular codes.

II. Background: Empirical and Theoretical Context

A. Evidence from proteomic dipeptide analyses (JMB study)

A recent study published in the Journal of Molecular Biology, "Tracing the Origin of the Genetic Code and Thermostability to Dipeptide Sequence in Proteomes," provides critical empirical evidence relevant to the coevolution of genetic codes and proteins. The study analyzed large-scale proteomic data to examine correlations between dipeptide frequencies, codon assignments, and protein thermostability across diverse organisms.

Several findings stand out:

1. Dipeptide patterns as molecular fossils.

The study identified consistent patterns in dipeptide usage that reflect deep evolutionary constraints. These patterns appear to encode signals of early co-selection between codon usage and amino acid pairing, leaving a lasting imprint in modern proteomes.

2. Thermostability correlations.

Dipeptide frequencies correlate strongly with protein thermostability, suggesting that early evolution optimized codon assignments in tandem with stability requirements. This links the origin of the genetic code not merely to informational capacity but to structural robustness in fluctuating environments.

3. Genetic code coevolution.

The results support the view that the genetic code evolved under simultaneous constraints from RNA coding and protein folding. Codon assignments were not arbitrary but co-adapted with the structural properties of early proteins, implying a feedback-driven evolutionary process.

4. Statistical rather than mechanistic.

Importantly, the study is primarily descriptive: it uncovers correlations but does not provide a dynamical model for how such correlations emerged or stabilized over time. This leaves a gap between observed proteomic patterns and explanatory theory.

Taken together, these findings reinforce the argument that the genetic code and proteins coevolved rather than emerged independently. The persistence of dipeptide-level signals across proteomes suggests that coevolutionary coupling leaves measurable molecular signatures. What remains missing, however, is a formal mathematical framework capable of explaining how such signatures arise naturally from evolutionary dynamics. This is precisely the gap that our CAS-based model is designed to fill.

B. Genetic code thermostability and protein structure correlations

The relationship between the genetic code and protein structure has long been hypothesized to reflect not only functional requirements but also physicochemical constraints. One particularly compelling line of evidence concerns the correlation between codon assignments and protein thermostability.

1. Codon--amino acid mapping and stability.

Codon choices influence the amino acid composition of proteins, which in turn affects folding stability, aggregation resistance, and robustness under thermal stress. Comparative analyses reveal that synonymous codon usage is often biased toward configurations that favor structural resilience, suggesting that early genetic codes were shaped by thermostability pressures.

2. Dipeptide-level constraints.

At the dipeptide level, certain amino acid pairs appear more frequently than expected by chance, and these patterns correlate with codon pairing rules. These biases are particularly strong in thermophilic organisms, reinforcing the idea that the genetic code coevolved with protein thermostability.

3. Implications for early environments.

The persistence of thermostability signals implies that early life likely emerged under fluctuating or extreme thermal conditions, where robustness was critical for survival. This environmental pressure would have acted simultaneously on RNA templates and peptide structures, favoring codon--amino acid assignments that maximized stability.

4. Theoretical challenge.

While correlations between genetic codes and protein thermostability are well documented, existing models treat them as post hoc optimizations rather than emergent properties of dynamic coevolution. In linear frameworks, it remains unclear how codon assignments and structural stability could have synchronized in the absence of foresight.

Thus, the empirical record suggests that the genetic code and protein thermostability are deeply intertwined. Yet the mechanism by which such correlations emerged remains obscure. To resolve this, we require a formalism that captures feedback, trade-offs, and attractor dynamics, a role well suited to Complex Adaptive Systems modeling.

C. Current limitations: descriptive/statistical approaches without formal dynamics

Although proteomic and genomic studies have uncovered rich correlations linking genetic codes, dipeptide frequencies, and protein thermostability, the majority of current approaches remain descriptive and statistical. They identify patterns but do not capture the mechanisms that produce and stabilize them.

1. Correlations without causation.

Analyses such as codon--amino acid frequency distributions or dipeptide usage profiles show that the genetic code and protein structure are linked. However, these correlations stop short of explaining how co-selection pressures act dynamically to produce and maintain these linkages over evolutionary time.

2. Linear narratives.

RNA-first and protein-first models often retrofit observed data into linear evolutionary scenarios.
Such frameworks assume that either coding preceded structural adaptation or vice versa, overlooking the possibility of simultaneous, feedback-driven coevolution.

3. Lack of dynamic formalism.

Existing studies rarely employ mathematical models capable of describing nonlinear interactions, epistasis, or attractor dynamics. As a result, they cannot account for the stabilization of interdependent systems such as RNA--protein complexes, nor predict conditions under which coadapted states might emerge or fail.

4. Disconnect across scales.

Statistical treatments typically analyze proteomic data in isolation, without embedding them in population dynamics, ecological contexts, or evolutionary landscapes. This limits their ability to bridge molecular signatures with broader evolutionary processes.

The consequence is a persistent explanatory gap: we know that genetic codes and proteins are correlated, but we lack a formal, reproducible framework for understanding why such correlations arise and how they stabilize. Addressing this gap requires a paradigm shift from descriptive statistics to Complex Adaptive Systems modeling, where interdependence, feedback, and emergent attractors can be formalized mathematically.

III. Theoretical Foundations

A. Principles of Complex Adaptive Systems

Complex Adaptive Systems (CAS) are systems composed of multiple interacting components, or agents, whose local interactions generate global patterns that cannot be reduced to the properties of individual parts. CAS theory has been applied across domains ranging from ecology to economics, and it provides a unifying framework for understanding how feedback, nonlinearity, and adaptation produce emergent order.

Key principles of CAS directly relevant to RNA--protein coevolution include:

1. Decentralized interactions.

No central controller dictates outcomes; rather, system-level organization emerges from the collective dynamics of many local interactions. In molecular evolution, RNA motifs and protein domains interact locally (through binding affinities, codon assignments, or folding constraints), yet produce global structures such as the ribosome.

2. Feedback loops.

CAS are characterized by both positive and negative feedback. For RNA--protein systems, RNA influences the production and structure of proteins, while proteins stabilize, modify, or translate RNA. These reciprocal feedbacks are the basis for synchronized adaptation.

3. Nonlinear genotype--phenotype mapping.

Traits emerge from epistatic and pleiotropic interactions, not from linear additive effects. Codon--amino acid assignments and dipeptide biases exemplify such nonlinearity, where changes in one site can ripple across molecular and structural networks.

4. Emergence of attractors.

CAS evolve toward attractor states---stable configurations that persist despite perturbations. RNA--protein complexes such as the ribosome can be viewed as emergent attractors: once coupled, they become self-reinforcing and resistant to disruption.

5. Multi-scale organization.

CAS function across hierarchical levels, with dynamics at one level influencing and constraining dynamics at others. RNA--protein interactions operate at the molecular level but shape cellular fitness, population dynamics, and ultimately evolutionary trajectories.

6. Adaptation through exploration.

CAS adapt via exploration of configuration space under constraints of selection and mutation. RNA, with high mutability, and proteins, with structural stability, embody complementary strategies of exploration and consolidation.

These principles suggest that the coevolution of genetic codes and proteins is best understood not as a linear sequence of innovations, but as the emergent synchronization of interdependent components. CAS provides both the language and the mathematical tools to formalize this process, linking molecular detail with system-level behavior.

B. Adaptive landscapes, punctuated equilibria, and coevolution theory

The dynamics of RNA--protein coevolution can be framed in relation to three foundational concepts in evolutionary biology: adaptive landscapes, punctuated equilibria, and coevolutionary theory. Complex Adaptive Systems (CAS) provide a means to unify and formalize these perspectives.

1. Adaptive landscapes.

Sewall Wright's metaphor of fitness landscapes remains central in evolutionary thought: genotypes map to fitness values that create peaks (adapted states) and valleys (maladapted states). For RNA--protein systems, the landscape is multi-dimensional and coupled: RNA codon assignments and protein folding stability jointly determine fitness. CAS modeling allows the landscape to be represented as a co-evolving surface, where changes in RNA alter protein stability, and vice versa, generating shifting adaptive topographies.

2. Punctuated equilibria.

The fossil record and molecular evolution often display long periods of stasis punctuated by rapid transitions. In RNA--protein systems, feedback-driven dynamics can generate critical thresholds or bifurcations, where small changes (e.g., in codon usage) lead to sudden transitions in stability or coding efficiency. CAS formalism captures such nonlinear shifts through attractor switching and phase transitions, providing a mechanistic basis for punctuated equilibria at the molecular scale.

3. Coevolutionary theory.

Coevolution typically describes reciprocal adaptations between species (e.g., predator--prey), but its principles extend to molecular partners. RNA and proteins form a classic intra-system coevolutionary pair, where each adapts in response to changes in the other.
Red Queen dynamics (continuous adaptation required to maintain stability) are inherent to this relationship: RNA evolves coding efficiency, while proteins evolve folding robustness, each chasing the other's adaptive shifts.

By embedding RNA--protein interactions within CAS, these three perspectives converge. The adaptive landscape becomes a coevolutionary landscape with multiple attractors; punctuated shifts emerge from nonlinear feedback; and Red Queen cycles appear as natural trajectories of the system. This integration provides a rigorous mathematical basis for phenomena that previously remained metaphorical.

C. Integration with molecular evolution

While traditional evolutionary biology has developed powerful models of mutation, selection, and drift, the coevolution of RNA and proteins introduces layers of complexity that strain these frameworks. Complex Adaptive Systems (CAS) theory complements and extends molecular evolution by offering a way to formalize interdependence, feedback, and emergent synchronization.

1. Beyond linear mutation--selection models.

Standard models assume mutations act independently on sequences, with fitness assigned to each variant. In RNA--protein systems, mutations in RNA sequences affect codon assignments, which alter protein structures, which in turn feedback on the stability of RNA translation machinery. CAS captures this nonlinearity by modeling fitness as a property of the system of interactions, not of isolated components.

2. Epistasis and pleiotropy at molecular scale.

In molecular evolution, many genotype--phenotype mappings are shaped by epistasis (interactions among mutations) and pleiotropy (one gene influencing multiple traits). Codon reassignments can shift dipeptide frequencies, influencing both thermostability and folding, creating network-level effects. CAS formalism makes these dependencies explicit, allowing them to be modeled as interaction matrices or multi-agent feedback loops.

3. Coupling molecular dynamics with population dynamics.

Traditional molecular evolution tracks allele frequencies in populations; CAS links this with molecular-level dynamics of RNA--protein complexes. Replicator--mutator equations, extended with ecological coupling, provide a means to simulate how coding rules and protein stability coevolve under selective pressures.

4. Bridging micro- and macro-evolution.

CAS situates molecular processes within a hierarchy: molecular interactions (codon stability, dipeptide motifs) → organismal fitness (robustness under thermal stress) → population trajectories. This multi-scale approach provides a mechanistic bridge between molecular evolution and larger evolutionary patterns such as punctuated equilibria or Red Queen cycles.

In this integration, molecular evolution is not displaced but extended: CAS provides the mathematics to capture interdependent, co-adaptive processes that classical frameworks treat only descriptively. RNA--protein coevolution thus emerges not as a paradox but as a natural consequence of complex adaptive dynamics.

IV. Mathematical Framework

A. Genotype--phenotype mapping (RNA motifs to protein domains)

A central challenge in modeling RNA--protein coevolution lies in formalizing the mapping between genotypes (nucleotide sequences and codon assignments) and phenotypes (protein structures and functional domains). Unlike simple one-to-one mappings, this relationship is characterized by nonlinearity, degeneracy, and interdependence. A CAS framework provides a way to capture these complexities.

1. Representation of genotypes.

RNA sequences are modeled as strings $G = (g_1, g_2, \ldots, g_n)$, where each $g_i$ is a codon drawn from the set of 64 possible triplets. Mutations act at the level of substitution, insertion, or deletion, generating a mutational neighborhood $\mathcal{N}(G)$.

2. Representation of phenotypes.

Proteins are modeled as structured sequences of amino acids $P = (p_1, p_2, \ldots, p_m)$, where $p_j$ corresponds to codon translation via a genetic code mapping $\phi: g_i \mapsto p_j$. Phenotypic traits are extracted from sequence and structure, such as thermostability $T(P)$, folding energy $E(P)$, and domain functionality $F(P)$.

3. Nonlinear mapping.

The genotype--phenotype mapping is not additive but shaped by epistasis (interaction between codons) and pleiotropy (one codon affecting multiple traits). Mathematically, the phenotype can be represented as:

$$\mathbf{y} = f(G; \Theta) + \epsilon$$

where $\mathbf{y} = (T, E, F, \ldots)$ is the phenotype vector, $f$ is a nonlinear function parameterized by the interaction matrix $\Theta$, and $\epsilon$ is stochastic noise.

4. Coupling between RNA motifs and protein domains.

Specific RNA motifs (e.g., stem-loops, codon clusters) correlate with conserved protein domains. This can be formalized as a bipartite graph $\mathcal{B} = (M, D, E)$, where $M$ is the set of RNA motifs, $D$ is the set of protein domains, and $E \subset M \times D$ encodes functional interactions.
The stability of the mapping is given by the weight function:

$$w(m_i, d_j) = \Pr(d_j \mid m_i)$$

representing the conditional probability that motif $m_i$ reliably produces domain $d_j$.

5. Emergent property: modularity.

This mapping naturally generates modular structures: clusters of RNA motifs co-map to clusters of protein domains, yielding functional modules.

Modularity acts as an attractor in CAS terms, stabilizing coevolution by buffering local mutations while preserving global function.

Through this formalism, the genotype--phenotype map is modeled not as a static lookup table but as a dynamic, probabilistic, and feedback-driven system. This allows the exploration of how codon reassignments, dipeptide biases, and domain emergence coevolve in synchrony under selection pressures.
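The representation above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the genetic code `PHI` is a toy mapping invented here (not the standard code), the neighborhood covers single-base substitutions only (insertions and deletions are left out), and the motif and domain names are hypothetical placeholders. The sketch builds the 64 codons, translates a genotype $G$ into a phenotype string $P$, enumerates part of $\mathcal{N}(G)$, and estimates $w(m_i, d_j) = \Pr(d_j \mid m_i)$ from co-occurrence counts.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"
CODONS = ["".join(t) for t in product(BASES, repeat=3)]  # the 64 triplets

# Toy genetic code phi (an assumption for illustration, NOT the standard code):
# the amino-acid label depends only on the first two bases, which gives
# 16 labels with four-fold degeneracy at the third position.
LABELS = "ABCDEFGHIJKLMNOP"
PHI = {c: LABELS[4 * BASES.index(c[0]) + BASES.index(c[1])] for c in CODONS}


def translate(genotype):
    """Apply phi codon by codon: G = (g_1, ..., g_n) -> P = (p_1, ..., p_n)."""
    return "".join(PHI[g] for g in genotype)


def substitution_neighborhood(genotype):
    """Single-base substitution part of N(G); indels are omitted in this sketch."""
    neighbors = []
    for i, codon in enumerate(genotype):
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutated = codon[:pos] + base + codon[pos + 1:]
                    neighbors.append(genotype[:i] + (mutated,) + genotype[i + 1:])
    return neighbors


def motif_domain_weights(observations):
    """Estimate w(m, d) = Pr(d | m) from observed (motif, domain) pairs."""
    pair_counts = Counter(observations)
    motif_counts = Counter(m for m, _ in observations)
    return {(m, d): n / motif_counts[m] for (m, d), n in pair_counts.items()}


G = ("AUG", "GCU", "GCC")
P = translate(G)
neighbors = substitution_neighborhood(G)
# Neighbors that leave the phenotype unchanged illustrate degeneracy of the map.
synonymous = sum(translate(n) == P for n in neighbors)

# Hypothetical co-occurrence data: motif and domain names are placeholders.
pairs = [("stem_loop", "d1"), ("stem_loop", "d1"),
         ("stem_loop", "d2"), ("codon_cluster", "d2")]
weights = motif_domain_weights(pairs)
```

In this toy code a third of the substitution neighbors are synonymous, which is one simple way degeneracy can buffer local mutations while the phenotype is preserved.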

B. Interdependent fitness functions with trade-offs

In a coevolutionary system, the fitness of RNA and proteins cannot be defined independently. RNA provides coding capacity but depends on proteins for stability and catalysis; proteins provide structural and functional diversity but depend on RNA for inheritance and reproducibility. This creates interdependent fitness functions shaped by multiple trade-offs.

1. Joint fitness definition.
Let $R$ denote the RNA component (e.g., codon usage profile, structural motifs) and $P$ denote the protein component (e.g., folding stability, enzymatic efficiency). The joint fitness of the RNA--protein system can be formalized as:

$$W(R, P) = \alpha \cdot f_R(R, P) + \beta \cdot f_P(R, P) - \gamma \cdot C(R, P)$$

where:

$f_R(R, P)$: RNA functionality conditional on protein support (e.g., translation fidelity, ribozyme--protein stability).
$f_P(R, P)$: protein functionality conditional on RNA coding accuracy (e.g., folding efficiency, catalytic robustness).
$C(R, P)$: coupling cost due to mismatch or inefficiency (e.g., unstable codon--amino acid assignment, misfolded peptides).
$\alpha, \beta, \gamma$: weighting coefficients reflecting ecological and evolutionary pressures.

2. Trade-offs in adaptation.

Thermostability vs. flexibility: proteins that are highly stable may sacrifice catalytic adaptability; RNA motifs that are rigid may sacrifice coding efficiency.
Exploration vs. exploitation: high RNA mutability enhances exploration of coding space but risks destabilizing established protein domains; protein conservatism stabilizes function but reduces adaptability.
Short-term vs. long-term fitness: an RNA mutation may immediately destabilize codon--amino acid mapping but enable later innovations in protein structure.

3. Context dependence.

Fitness functions are ecologically embedded. For example, under high thermal stress, thermostability terms dominate; in nutrient-limited environments, catalytic efficiency is weighted more heavily.
This context sensitivity can be modeled by letting the coefficients $\alpha, \beta, \gamma$ vary dynamically with environmental parameters $E(t)$:

$$W(R, P; E(t)) = \alpha(E(t)) f_R + \beta(E(t)) f_P - \gamma(E(t)) C$$

4. Stability conditions.

The RNA--protein system stabilizes when the joint fitness $W(R, P)$ reaches a local maximum, corresponding to a coevolutionary attractor.
Trade-offs ensure that no single component maximizes fitness in isolation; instead, balanced optimization across RNA and protein is required.

This interdependent fitness formalism reframes adaptation not as an optimization of isolated entities, but as a coevolutionary negotiation constrained by trade-offs, feedback, and ecological pressures. It provides the foundation for embedding RNA--protein dynamics into replicator--mutator and predator--prey style models.
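The joint fitness with environment-dependent weights can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the logistic "thermal stress" index, its midpoint and width, the particular way $\alpha$ and $\beta$ respond to it, and the fixed $\gamma$ are hypothetical choices, not values given in the text. The component values $f_R$, $f_P$ and $C$ are passed in already evaluated.

```python
import math

def joint_fitness(f_R, f_P, cost, alpha, beta, gamma):
    """W = alpha*f_R + beta*f_P - gamma*C for already-evaluated components."""
    return alpha * f_R + beta * f_P - gamma * cost

def weights(temperature):
    """Hypothetical E(t)-dependence: thermal stress shifts weight toward f_P."""
    # Logistic stress index in (0, 1); midpoint 60 and width 10 are illustrative.
    stress = 1.0 / (1.0 + math.exp(-(temperature - 60.0) / 10.0))
    alpha = 1.0 - 0.5 * stress   # weight on RNA functionality
    beta = 0.5 + 0.5 * stress    # weight on protein functionality (stability)
    gamma = 0.2                  # coupling-cost weight, held fixed here
    return alpha, beta, gamma

def joint_fitness_env(f_R, f_P, cost, temperature):
    """W(R, P; E) with the environment entering only through the weights."""
    alpha, beta, gamma = weights(temperature)
    return joint_fitness(f_R, f_P, cost, alpha, beta, gamma)

if __name__ == "__main__":
    for temp in (20.0, 60.0, 100.0):
        print(temp, joint_fitness_env(f_R=0.8, f_P=0.6, cost=0.3, temperature=temp))
```

Keeping the weights in a separate function makes it easy to swap in a different environmental driver without touching the fitness expression itself.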

C. Replicator--mutator dynamics for coupled populations

To capture the coevolution of RNA and proteins, we extend the replicator--mutator framework to model two coupled populations: RNA sequences ($R$) and protein structures ($P$). Each population evolves under mutation, selection, and feedback from the other.

1. Classical replicator--mutator equation.
For a population of types $i = 1, \ldots, n$, the replicator--mutator equation is:
$$\dot{x}_i = \sum_{j=1}^n x_j f_j Q_{ji} - \phi x_i$$
where:

$x_i$: frequency of type $i$,
$f_j$: fitness of type $j$,
$Q_{ji}$: mutation probability from type $j$ to type $i$,
$\phi = \sum_{k} x_k f_k$: average fitness.
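As a minimal sketch, the right-hand side of the replicator--mutator equation can be evaluated directly from its definition. The function names, the constant fitness vector, and the forward-Euler step are assumptions for illustration; the only structural requirement is that each row of `Q` sums to one, so that total frequency is conserved.

```python
def replicator_mutator_rhs(x, f, Q):
    """dx_i/dt = sum_j x_j f_j Q[j][i] - phi * x_i, with phi = sum_k x_k f_k."""
    n = len(x)
    phi = sum(x[k] * f[k] for k in range(n))  # average fitness
    return [
        sum(x[j] * f[j] * Q[j][i] for j in range(n)) - phi * x[i]
        for i in range(n)
    ]

def euler_step(x, f, Q, dt):
    """One forward-Euler step (illustrative; a higher-order scheme may be preferable)."""
    dx = replicator_mutator_rhs(x, f, Q)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]

if __name__ == "__main__":
    x = [0.5, 0.3, 0.2]                      # type frequencies, sum to 1
    f = [1.0, 1.5, 2.0]                      # hypothetical constant fitnesses
    Q = [[0.90, 0.05, 0.05],                 # row j: mutation from type j
         [0.05, 0.90, 0.05],
         [0.05, 0.05, 0.90]]
    for _ in range(1000):
        x = euler_step(x, f, Q, dt=0.01)
    print(x, sum(x))
```

Because the rows of `Q` sum to one, the components of the right-hand side sum to zero, so the frequencies stay normalized up to floating-point error.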

2. Coupled RNA--protein dynamics.
We define two coupled populations:

$x_i^R$: frequency of RNA sequence type $i$,
$y_j^P$: frequency of protein structure type $j$.

3. Their dynamics are:
$$\dot{x}_i^R = \sum_{k} x_k^R f_k^R(R, P) Q^R_{ki} - \phi^R x_i^R, \qquad \dot{y}_j^P = \sum_{l} y_l^P f_l^P(R, P) Q^P_{lj} - \phi^P y_j^P$$
where $f^R$ and $f^P$ are fitness functions interdependent on both RNA and protein states (from IV.B).

4. Coupling term.
Fitness of RNA depends on translation accuracy and stability provided by proteins, while fitness of proteins depends on coding fidelity and diversity provided by RNA. This coupling can be expressed as:
$$f_i^R = f_i^R(R) + \lambda \cdot g^R(P), \qquad f_j^P = f_j^P(P) + \mu \cdot g^P(R)$$
where $\lambda$ and $\mu$ are coupling coefficients controlling the strength of RNA--protein interdependence.
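The coupled equations and the coupling terms can be combined in a short sketch. The text does not specify $g^R(P)$ or $g^P(R)$, so the choice below is an assumption: each is taken to be a population-weighted mean of a per-type trait (`support` for proteins, `coding` for RNA), and both trait vectors are hypothetical.

```python
def replicator_mutator(freqs, fitness, Q):
    """Replicator-mutator right-hand side for one population."""
    n = len(freqs)
    phi = sum(freqs[k] * fitness[k] for k in range(n))
    return [
        sum(freqs[k] * fitness[k] * Q[k][i] for k in range(n)) - phi * freqs[i]
        for i in range(n)
    ]

def coupled_rhs(x, y, base_fR, base_fP, QR, QP, lam, mu, coding, support):
    """Return (dx/dt, dy/dt) for RNA frequencies x and protein frequencies y."""
    # Assumed coupling functions: population means of hypothetical traits.
    g_R = sum(yj * sj for yj, sj in zip(y, support))  # g^R(P): protein support
    g_P = sum(xi * ci for xi, ci in zip(x, coding))   # g^P(R): RNA coding quality
    fR = [f + lam * g_R for f in base_fR]             # f_i^R = f_i^R(R) + lambda * g^R(P)
    fP = [f + mu * g_P for f in base_fP]              # f_j^P = f_j^P(P) + mu * g^P(R)
    return replicator_mutator(x, fR, QR), replicator_mutator(y, fP, QP)

if __name__ == "__main__":
    x, y = [0.6, 0.4], [0.7, 0.3]
    Q = [[0.95, 0.05], [0.05, 0.95]]
    dx, dy = coupled_rhs(x, y, [1.0, 1.2], [1.0, 0.8], Q, Q,
                         lam=0.5, mu=0.5, coding=[0.2, 0.9], support=[0.3, 0.8])
    print(dx, dy)
```

Setting `lam = mu = 0` recovers two independent replicator--mutator populations, which gives a simple baseline for judging the effect of the coupling.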

5. Emergent attractors.

The system converges toward coevolutionary attractors, stable states where RNA sequences and protein structures reinforce each other's persistence.
Trade-offs (from IV.B) prevent trivial optimization, ensuring multiple possible attractors (e.g., high-stability/low-diversity vs. low-stability/high-diversity regimes).

6. Agent-based extension.

For richer exploration, an agent-based model can represent individual RNA and protein variants interacting probabilistically.
Agents replicate, mutate, and form complexes; emergent properties such as modularity or ribosome-like assemblies arise from local interaction rules.

This coupled replicator--mutator formalism allows RNA and proteins to be modeled not as isolated populations, but as dynamically linked evolutionary agents. It provides a mathematical basis for the emergence of synchronized adaptation under coevolutionary constraints.

D. Coupling with predator--prey Lotka--Volterra extensions

To capture the ecological character of RNA--protein coevolution, we extend the replicator--mutator system with Lotka--Volterra--style coupling. This formalism introduces explicit interaction terms, treating RNA and protein populations as coevolving agents locked in a dynamic balance akin to predator--prey systems.

1. Basic Lotka--Volterra form.
The classical predator--prey equations are:
$$\dot{X} = rX - aXY, \qquad \dot{Y} = -dY + bXY$$
where $X$ is prey, $Y$ is predator, $r$ is prey growth rate, $d$ is predator death rate, and $a, b$ are interaction coefficients.

2. RNA--protein analogy.

RNA ($R$) supplies coding capacity, analogous to "prey."
Proteins ($P$) depend on RNA templates, but also stabilize RNA, analogous to "predators" that both consume and maintain prey.
The interaction is mutualistic but asymmetrical: proteins require RNA to exist, RNA requires proteins for stability.

3. The coupled equations become:
$$\dot{R} = rR - \alpha RP + \eta RP, \qquad \dot{P} = -dP + \beta RP$$
where:

$\alpha RP$: cost of mismatch (translation errors, instability).
$\eta RP$: stabilizing effect of proteins on RNA (error correction, structural support).
$\beta RP$: benefit of coding templates for proteins.
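A small sketch of these coupled equations makes the role of the net interaction coefficient $\alpha - \eta$ explicit. Setting both right-hand sides to zero with $R, P > 0$ gives the coexistence point $R^* = d/\beta$, $P^* = r/(\alpha - \eta)$, which is positive only when $\alpha > \eta$. The parameter values in the usage lines are hypothetical.

```python
def lv_rhs(R, P, r, d, alpha, eta, beta):
    """Right-hand side of the RNA-protein Lotka-Volterra-style system."""
    dR = r * R - alpha * R * P + eta * R * P
    dP = -d * P + beta * R * P
    return dR, dP

def coexistence_point(r, d, alpha, eta, beta):
    """Interior fixed point (R*, P*) = (d/beta, r/(alpha - eta)).

    Returns None when alpha <= eta, because the RNA equation then has no
    positive equilibrium level of P.
    """
    if alpha <= eta:
        return None
    return d / beta, r / (alpha - eta)

if __name__ == "__main__":
    params = dict(r=1.0, d=0.5, alpha=0.6, eta=0.2, beta=0.25)  # illustrative
    point = coexistence_point(**params)
    print(point, lv_rhs(*point, **params))
```

When the stabilizing term dominates ($\eta \ge \alpha$), the function returns `None`: proteins then only ever raise the RNA growth rate, and no positive coexistence point exists in this form of the model.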

4. Integration with replicator--mutator dynamics.

The Lotka--Volterra terms modulate effective fitness in the replicator--mutator equations (IV.C).
Thus, the growth rates of RNA and protein populations are shaped not only by intrinsic replication/mutation but also by cross-dependencies.

5. Emergent Red Queen dynamics.

The coupled system can exhibit oscillatory trajectories where RNA and protein populations "chase" each other in state space.
These cycles correspond to Red Queen dynamics: continual adaptation required just to maintain functional equilibrium.
In some parameter regimes, the system converges to a stable coexistence attractor (ribosome-like stability); in others, it oscillates or collapses (loss of synchrony).

6. Ecological interpretation.

This formalism shows that RNA--protein coevolution is not a static optimization but a dynamic balance of interdependent forces.
The persistence of the genetic code reflects the system settling into a stable attractor basin rather than reaching a one-time evolutionary solution.

By embedding Lotka--Volterra coupling into the replicator--mutator framework, we obtain a multi-level model: local mutations and codon reassignments feed into global ecological-style dynamics, producing oscillations, bifurcations, and attractors. This provides a rigorous account of how RNA and proteins could have co-stabilized through emergent coevolution.

E. Emergent attractors and bifurcations

One of the central advantages of framing RNA--protein coevolution within Complex Adaptive Systems is the ability to analyze emergent attractors and bifurcations that arise from coupled dynamics.

1. Attractors as co-adapted states.

In the coupled replicator--mutator and Lotka--Volterra model, equilibria correspond to co-adapted RNA--protein configurations where mutual dependencies are balanced. These attractors can represent ribosome-like states: RNA motifs that efficiently encode protein domains, and proteins that stabilize RNA and facilitate translation.

2. Multiple attractors (multistability).

Depending on initial conditions and parameter regimes (e.g., mutation rate, thermal stress, coupling strength), the system may converge to different attractors:

High-stability/low-diversity regime, favoring thermostable proteins but limited coding innovation.
High-diversity/low-stability regime, favoring exploratory RNA mutability with fragile protein stability.
Balanced coadaptation regime, approximating observed ribonucleoprotein complexes.

3. Bifurcations as evolutionary transitions.

Small changes in parameters can shift the system between attractors, creating bifurcation points analogous to punctuated equilibria. For example, exceeding a critical mutation rate threshold may destabilize an RNA--protein attractor, leading to collapse or transition into a new adaptive regime.

4. Mathematically, this corresponds to nonlinear dynamics in which equilibrium solutions change stability as parameters cross critical values.

5. Red Queen cycles as oscillatory attractors.

In some regimes, the system does not converge to a fixed point but exhibits limit cycles, where RNA and protein continually adapt in response to each other. These cycles reflect perpetual coevolutionary motion---molecular Red Queen dynamics.

6. Evolutionary implications.

Attractors explain why RNA--protein partnerships, once stabilized, are highly resilient to perturbations. Bifurcations explain why evolutionary innovation often appears abrupt, as systems transition between attractor basins rather than through gradual, linear improvement.

Through this lens, the RNA--protein system is best understood as a nonlinear dynamical landscape with multiple attractors, oscillatory trajectories, and bifurcations. These emergent properties resolve the paradox of synchronized codes: coadaptation is not improbable, but a natural outcome of complex feedback dynamics.

V. Results

A. Analytical results: stability and bifurcations

1. Reduced dynamical model (minimal CAS core)

To make analytic progress we study a low-dimensional reduction that retains the essential co-dependence between RNA and protein populations. Let $R(t)$ denote the effective abundance (or normalized mean frequency) of RNA genotypes that contribute to coding capacity, and let $P(t)$ denote the effective abundance of protein genotypes that provide structural/catalytic support. A compact phenomenological CAS model combining replicator-type growth, mutational loss, and mutual coupling is

$$\begin{aligned} \dot R &= R\Big( s_R\,F_R(R,P) - \mu_R \Big),\\[4pt] \dot P &= P\Big( s_P\,F_P(R,P) - \mu_P \Big), \end{aligned} \tag{1}$$

where

$s_R,s_P>0$ scale selection strength for RNA and protein respectively;
$\mu_R,\mu_P\ge 0$ represent net mutational/turnover loss rates;
$F_R$, $F_P$ are normalized effective fitness contributions that encode interdependence.

A simple and biologically motivated choice for the coupling that retains nonlinearity is

$$F_R(R,P) = a_R + b_R\,\frac{P}{1+\kappa P},\qquad F_P(R,P) = a_P + b_P\,\frac{R}{1+\kappa R}, \tag{2}$$

with constants $a_{R,P}\ge0$ (intrinsic baseline function), $b_{R,P}\ge0$ (coupling gains), and $\kappa>0$ a saturation constant that prevents unbounded gains at high partner abundance (captures diminishing returns / cost). This form models the idea that RNA fitness increases with supportive proteins but saturates; likewise for protein fitness with RNA.

Thus the full reduced system is

$$\begin{aligned} \dot R &= R\Big( s_R\big(a_R + b_R\frac{P}{1+\kappa P}\big) - \mu_R \Big),\\[4pt] \dot P &= P\Big( s_P\big(a_P + b_P\frac{R}{1+\kappa R}\big) - \mu_P \Big). \end{aligned} \tag{3}$$

This two-variable model is sufficient to analyze fixed points, linear stability, and the primary bifurcations that generate attractors and oscillations in the CAS.
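For readers who want to experiment with the reduced system (3), a minimal sketch of its right-hand side and a classical fourth-order Runge-Kutta step follows. The dictionary keys and the parameter values in the usage lines are assumptions for illustration.

```python
def reduced_rhs(R, P, p):
    """Right-hand side of system (3) with saturating coupling (2)."""
    F_R = p["a_R"] + p["b_R"] * P / (1.0 + p["kappa"] * P)
    F_P = p["a_P"] + p["b_P"] * R / (1.0 + p["kappa"] * R)
    dR = R * (p["s_R"] * F_R - p["mu_R"])
    dP = P * (p["s_P"] * F_P - p["mu_P"])
    return dR, dP

def rk4_step(R, P, p, dt):
    """One classical Runge-Kutta (RK4) step for the state (R, P)."""
    k1 = reduced_rhs(R, P, p)
    k2 = reduced_rhs(R + 0.5 * dt * k1[0], P + 0.5 * dt * k1[1], p)
    k3 = reduced_rhs(R + 0.5 * dt * k2[0], P + 0.5 * dt * k2[1], p)
    k4 = reduced_rhs(R + dt * k3[0], P + dt * k3[1], p)
    R_new = R + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
    P_new = P + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return R_new, P_new

if __name__ == "__main__":
    params = {"a_R": 0.01, "a_P": 0.01, "b_R": 1.0, "b_P": 1.0, "kappa": 1.0,
              "s_R": 1.0, "s_P": 1.0, "mu_R": 0.02, "mu_P": 0.02}
    R, P = 0.001, 0.001
    for _ in range(1000):
        R, P = rk4_step(R, P, params, dt=0.1)
    print(R, P)
```

Because both equations carry an overall factor of $R$ or $P$, trajectories started at positive abundances remain positive in the exact dynamics; a small enough `dt` keeps the numerical solution in that region as well.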

2. Fixed points

Fixed points $(R^*,P^*)$ satisfy $\dot R=0,\ \dot P=0$. Trivial solutions include $(0,0)$ and boundary solutions where one species is extinct. Nontrivial interior equilibria satisfy

$$s_R\Big(a_R + b_R\frac{P^*}{1+\kappa P^*}\Big) = \mu_R,\qquad s_P\Big(a_P + b_P\frac{R^*}{1+\kappa R^*}\Big) = \mu_P. \tag{4}$$

Each equation can be inverted to yield $P^*(\mu_R)$ and $R^*(\mu_P)$. A biologically relevant positive interior fixed point exists only if each right-hand side lies in the range attainable by the corresponding left-hand side. Since $P/(1+\kappa P)\in(0,1/\kappa)$ for $P>0$, this requires $s_R a_R < \mu_R < s_R(a_R + b_R/\kappa)$, which reduces to $\mu_R < s_R(a_R + b_R)$ as the upper bound when $\kappa=1$; the analogous condition holds for $\mu_P$.

For the form (2), each condition in (4) involves a single unknown and is monotone in it, so (4) by itself admits at most one interior solution. Multiple interior equilibria (multistability) can arise once the fixed-point conditions become nonlinear in both variables, as with the self-feedback terms introduced in (8); intersections of the two implicit curves may then produce 0, 1 or several positive solutions.
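For the saturating form (2), each condition in (4) can be solved for the partner abundance in closed form: writing $x = (\mu/s - a)/b$ for the required value of $X/(1+\kappa X)$ gives $X = x/(1-\kappa x)$, valid when $0 < x < 1/\kappa$. The helper below is a sketch of that inversion; the function name and the test values are assumptions.

```python
def partner_level(mu, s, a, b, kappa):
    """Solve s * (a + b * X / (1 + kappa * X)) = mu for X > 0.

    Returns None when no positive solution exists, i.e. when mu lies outside
    the open interval (s*a, s*(a + b/kappa)).
    """
    x = (mu / s - a) / b              # required value of X / (1 + kappa * X)
    if x <= 0.0 or kappa * x >= 1.0:
        return None
    return x / (1.0 - kappa * x)

if __name__ == "__main__":
    # P* from the RNA condition and R* from the protein condition in (4).
    P_star = partner_level(mu=0.02, s=1.0, a=0.01, b=1.0, kappa=1.0)
    R_star = partner_level(mu=0.02, s=1.0, a=0.01, b=1.0, kappa=1.0)
    print(R_star, P_star)
    print(partner_level(mu=2.0, s=1.0, a=0.01, b=1.0, kappa=1.0))  # out of range
```

The `None` branch encodes the existence condition directly, so the same helper can be used to scan parameter ranges for which an interior equilibrium of (3) exists.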

3. Linear stability --- Jacobian and eigenvalues

Linearize (3) about an equilibrium $(R^*,P^*)$. Compute partial derivatives:

$$\begin{aligned} \partial_R \dot R &= \Big( s_R F_R(R^*,P^*) - \mu_R\Big) + R^* s_R \frac{\partial F_R}{\partial R}\Big|_{(R^*,P^*)},\\[4pt] \partial_P \dot R &= R^* s_R \frac{\partial F_R}{\partial P}\Big|_{(R^*,P^*)},\\[6pt] \partial_R \dot P &= P^* s_P \frac{\partial F_P}{\partial R}\Big|_{(R^*,P^*)},\\[4pt] \partial_P \dot P &= \Big( s_P F_P(R^*,P^*) - \mu_P\Big) + P^* s_P \frac{\partial F_P}{\partial P}\Big|_{(R^*,P^*)}. \end{aligned}$$

At an interior fixed point the first parentheses vanish by (4), simplifying the Jacobian $J$ to

$$J = \begin{pmatrix} R^* s_R F_{R,R} & R^* s_R F_{R,P}\\[6pt] P^* s_P F_{P,R} & P^* s_P F_{P,P} \end{pmatrix}, \tag{5}$$

where $F_{R,R}=\partial_R F_R$, $F_{R,P}=\partial_P F_R$, etc., evaluated at $(R^*,P^*)$.

For our choice (2) the derivatives are

$$\frac{\partial}{\partial P}\!\Big(\frac{P}{1+\kappa P}\Big) \;=\; \frac{1}{(1+\kappa P)^2},\qquad \frac{\partial}{\partial R}\!\Big(\frac{R}{1+\kappa R}\Big) \;=\; \frac{1}{(1+\kappa R)^2},$$

and, because $F_R$ in (2) does not depend on $R$ and $F_P$ does not depend on $P$, the self-derivatives vanish. Hence

$$\begin{aligned} F_{R,P} &= b_R\frac{1}{(1+\kappa P^*)^2},\qquad F_{R,R}=0,\\[4pt] F_{P,R} &= b_P\frac{1}{(1+\kappa R^*)^2},\qquad F_{P,P}=0. \end{aligned}$$

Thus the Jacobian simplifies to

$$J = \begin{pmatrix} 0 & R^* s_R b_R /(1+\kappa P^*)^2\\[6pt] P^* s_P b_P /(1+\kappa R^*)^2 & 0 \end{pmatrix}. \tag{6}$$

This purely off-diagonal form is convenient: the trace is zero and the characteristic equation is $\lambda^2 + \Delta = 0$ with determinant

$$\Delta = \det J = -\bigg( R^* s_R \frac{b_R}{(1+\kappa P^*)^2}\bigg)\bigg( P^* s_P \frac{b_P}{(1+\kappa R^*)^2}\bigg). \tag{7}$$

Eigenvalues are $\lambda = \pm\sqrt{-\Delta}$. With positive parameters and $R^*,P^*>0$ we have $\Delta<0$, so the eigenvalues are real with opposite signs (one positive, one negative) and the interior equilibrium of (3) is a saddle; purely imaginary eigenvalues would require $\Delta>0$, which is not possible here. However, if we include additional negative self-derivative terms (costs leading to negative $F_{R,R}$ or $F_{P,P}$), the trace will no longer be zero and the determinant can become positive, which opens the way to stable equilibria and, where the eigenvalues become complex, to oscillatory behavior.

To account for realistic self-regulation (for example, costs that make $F_{R,R}<0$ or include logistic saturation in growth), augment $F_R$ and $F_P$ with explicit negative self-feedback terms:

$$F_R = a_R + b_R\frac{P}{1+\kappa P} - c_R R,\qquad F_P = a_P + b_P\frac{R}{1+\kappa R} - c_P P, \tag{8}$$

with $c_R,c_P>0$. Then

$$F_{R,R} = -c_R,\quad F_{P,P}=-c_P,$$

and the Jacobian becomes

$$J = \begin{pmatrix} - R^* s_R c_R & R^* s_R b_R /(1+\kappa P^*)^2\\[6pt] P^* s_P b_P /(1+\kappa R^*)^2 & - P^* s_P c_P \end{pmatrix}. \tag{9}$$

Now the trace is $T = - R^* s_R c_R - P^* s_P c_P <0$ and the determinant is $\Delta = \det J = R^* P^* s_R s_P c_R c_P - J_{12}J_{21}$, where $J_{12}$ and $J_{21}$ are the off-diagonal entries of (9), unchanged from (6). The linear stability is determined by the signs of $T$ and $\Delta$:

Stable node/focus if $T<0$ and $\Delta>0$, with discriminant $T^2 - 4\Delta > 0$ (real eigenvalues) or $<0$ (complex conjugates with negative real part → damped oscillations).

A Hopf bifurcation occurs when a pair of complex conjugate eigenvalues crosses the imaginary axis, i.e. when $T=0$ while $\Delta>0$. Because $T<0$ generically, varying parameters (e.g., lowering $c_R$ or $c_P$, increasing coupling $b_R,b_P$, or changing selection scales $s_{R,P}$) can push $T$ through zero and induce oscillatory instability.
A saddle-node bifurcation (fold) occurs when two equilibria collide and annihilate. In this reduced system, multiplicity of equilibria arises from the nonlinear saturating forms in the fixed-point conditions; saddle-node bifurcations occur at parameter values where the two implicit curves are tangent.
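The trace-determinant criteria above apply to any $2\times2$ Jacobian and can be collected into a small classifier. This is a generic sketch: the label strings are arbitrary, and `det` is assumed to be the full determinant of the Jacobian (diagonal product minus off-diagonal product).

```python
def classify(trace, det):
    """Classify a planar equilibrium from the trace and determinant of its Jacobian."""
    if det < 0.0:
        return "saddle"                      # real eigenvalues of opposite sign
    if det == 0.0:
        return "degenerate"                  # zero eigenvalue (e.g. at a fold)
    disc = trace * trace - 4.0 * det         # sign decides real vs complex eigenvalues
    if trace < 0.0:
        return "stable node" if disc >= 0.0 else "stable focus"
    if trace > 0.0:
        return "unstable node" if disc >= 0.0 else "unstable focus"
    return "center (linear)"                 # trace == 0, det > 0: Hopf candidate

if __name__ == "__main__":
    for T, D in [(-0.085, -0.0598), (-1.0, 0.2), (-1.0, 1.0), (0.0, 1.0)]:
        print(T, D, classify(T, D))
```

The "center (linear)" case only flags a candidate Hopf point; whether a limit cycle actually appears depends on nonlinear terms that the linearization does not see.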

4. Conditions for Hopf and saddle-node: biological interpretation

Hopf (oscillatory Red Queen).
From (9) a necessary (not sufficient) condition for a Hopf bifurcation is that the trace vanish:

$$R^* s_R c_R + P^* s_P c_P = 0,$$

which cannot hold while every factor is strictly positive and therefore requires a parameter change that alters the sign of an effective self-feedback term.

Because $R^*,P^*,s_\cdot,c_\cdot>0$, the trace is normally negative; however, effective $c_R$ or $c_P$ can be reduced (e.g., by environmental context that reduces cost), or selection strengths $s_{R,P}$ can be increased (stronger dependency), making the trace less negative and eventually zero. Practically this means stronger mutual coupling (large $b_R,b_P$, large $s$) and weak self-damping can induce sustained oscillations, the molecular analogue of Red Queen cycles.

Saddle-node (punctuated transitions).
Nonlinear solving of (4) can produce multiple intersections; saddle-node bifurcations occur at parameter values (e.g., mutation rates $\mu_R,\mu_P$, coupling gains $b$, environmental coefficients in $a$) where the number of interior equilibria changes. Biologically, crossing a saddle-node corresponds to a sudden transition: a previously stable coadapted state disappears, forcing the system to jump to another attractor (punctuated change).
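One way to see the change in the number of interior equilibria is to count them as the turnover rate varies. The sketch below makes two assumptions: it uses the self-damped form (8), and it takes RNA and protein parameters to be identical, so that interior equilibria lie on the diagonal $R^*=P^*=x$. The condition $a + b\,x/(1+\kappa x) - c\,x = \mu/s$ then reduces to the quadratic $c\kappa x^2 + (c - b - d\kappa)x - d = 0$ with $d = a - \mu/s$. Default parameter values are illustrative.

```python
import math

def symmetric_interior_equilibria(mu, a=0.01, b=1.0, kappa=1.0, c=0.05, s=1.0):
    """Positive roots x = R* = P* of a + b*x/(1 + kappa*x) - c*x = mu/s."""
    d = a - mu / s
    # After multiplying by (1 + kappa*x): A*x^2 + B*x + C = 0
    A = c * kappa
    B = c - b - d * kappa
    C = -d
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return []                            # no real roots: past the fold
    sq = math.sqrt(disc)
    roots = [(-B - sq) / (2.0 * A), (-B + sq) / (2.0 * A)]
    return sorted(x for x in roots if x > 0.0)

if __name__ == "__main__":
    for mu in (0.005, 0.02, 0.3, 0.7):
        eq = symmetric_interior_equilibria(mu)
        print(f"mu = {mu}: {len(eq)} interior equilibria {eq}")
```

The value of `mu` at which the discriminant changes sign is where two interior equilibria merge and disappear, which is the fold described above.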

5. Worked numeric illustration

Choose illustrative parameter values (dimensionless units) to demonstrate regimes:

$a_R=a_P=0.01,\ b_R=b_P=1.0,\ \kappa=1.0$,
$c_R=c_P=0.05,\ s_R=s_P=1.0$,
$\mu_R=\mu_P=0.02$.

Solving the fixed-point conditions of the self-damped model (8) for these values (the parameters are symmetric, so $R^*=P^*$) gives two interior fixed points, $(R^*,P^*)\approx(0.0106,0.0106)$ and $(R^*,P^*)\approx(18.8,18.8)$. Evaluate Jacobian (9) at the lower one and compute eigenvalues:

Off-diagonals: $J_{12}=J_{21}\approx 0.0106\cdot 1\cdot 1/(1+0.0106)^2\approx 0.0104$.
Diagonals: $J_{11}=J_{22}\approx-0.0106\cdot 1\cdot 0.05\approx-0.00053$.

So $T\approx -0.0011$ and $\Delta\approx (-0.00053)^2-0.0104^2 \approx -1.1\times10^{-4} <0$. Negative determinant indicates a saddle (one positive eigenvalue), i.e. an unstable interior equilibrium. At the upper fixed point the same calculation gives $T\approx-1.88$ and $\Delta\approx0.88>0$ with $T^2-4\Delta>0$, a stable node. By varying $b_R,b_P$ or decreasing costs $c$, $\Delta$ can become positive and $T$ can cross zero, generating a Hopf bifurcation.

This numerical example shows how modest parameter changes (e.g., increasing coupling strength $b$ or lowering cost $c$) can qualitatively change stability: from unstable saddle (no coadapted persistence) → stable node (coadapted attractor) → Hopf oscillation (sustained coevolutionary cycles). This is exactly the sort of bifurcation structure that maps to punctuated vs oscillatory molecular evolutionary dynamics.
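The illustration can be checked numerically with a short script. This is a sketch under stated assumptions: it solves the fixed-point conditions of the self-damped model (8) rather than (4), it exploits the symmetric parameter choice to look for equilibria with $R^*=P^*$, and the dictionary keys and helper names are hypothetical. It then evaluates the entries of Jacobian (9) and reports trace and determinant at each interior fixed point.

```python
import math

PARAMS = {"a": 0.01, "b": 1.0, "kappa": 1.0, "c": 0.05, "s": 1.0, "mu": 0.02}

def symmetric_fixed_points(p):
    """Positive solutions x = R* = P* of a + b*x/(1 + kappa*x) - c*x = mu/s."""
    d = p["a"] - p["mu"] / p["s"]
    A = p["c"] * p["kappa"]
    B = p["c"] - p["b"] - d * p["kappa"]
    C = -d
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return []
    sq = math.sqrt(disc)
    return sorted(x for x in ((-B - sq) / (2.0 * A), (-B + sq) / (2.0 * A)) if x > 0.0)

def trace_det(x, p):
    """Trace and determinant of Jacobian (9) at the symmetric point (x, x)."""
    j_diag = -x * p["s"] * p["c"]                              # J11 = J22
    j_off = x * p["s"] * p["b"] / (1.0 + p["kappa"] * x) ** 2  # J12 = J21
    return 2.0 * j_diag, j_diag * j_diag - j_off * j_off

if __name__ == "__main__":
    for x in symmetric_fixed_points(PARAMS):
        T, det = trace_det(x, PARAMS)
        kind = "saddle" if det < 0.0 else ("stable" if T < 0.0 else "unstable")
        print(f"R* = P* = {x:.4f}  T = {T:.5f}  det = {det:.6f}  ->  {kind}")
```

Changing entries of `PARAMS` (for example `b` or `c`) and rerunning shows how the number and type of interior fixed points respond to the coupling gain and the self-damping cost.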

6. Takeaways

a. The reduced CAS model exhibits multistability, sudden transitions (saddle-node), and oscillatory coevolution (Hopf) depending on biologically interpretable parameters: coupling strengths $b_{R,P}$, selection scales $s_{R,P}$, cost/self-damping $c_{R,P}$, and mutation/turnover $\mu_{R,P}$.

b. Biological interpretation.

Strong coupling + low self-cost → stable coadapted attractor (ribosome-like complex).
Strong coupling + moderate damping → sustained Red Queen oscillations (ongoing molecular chase).
Varying mutation rates or demographic events (bottlenecks) change parameter landscapes and can trigger saddle-node transitions → punctuated emergent innovation.

c. The analytic conditions derived from the Jacobian give clear, testable predictions for simulation (where full genotype spaces are modelled) and for empirical data: e.g., parameter regimes that produce long-lived attractors should show persistent covariation between RNA motifs and protein domains; regimes producing oscillations should show time-series covariance and periodic co-selection signatures.

B. Simulation results: synchronization and Red Queen cycles

To complement the analytical findings, we conducted numerical simulations of the coupled RNA--protein CAS model across a range of coupling strengths. These simulations reveal three qualitatively distinct dynamical regimes, corresponding to collapse, oscillatory Red Queen cycles, and stable coadapted attractors.

1. Weak coupling (sub-threshold).
When the coupling parameters $b_R$ and $b_P$ are small, the RNA and protein populations fail to stabilize each other. Time-series simulations show both $R$ and $P$ declining toward near-extinction levels. This regime corresponds to unsynchronized dynamics, where mutual dependence is insufficient to sustain coadaptation.

2. Intermediate coupling (oscillatory regime).
At moderate values of coupling strength, the system does not converge to a fixed point but instead exhibits sustained oscillations. RNA and protein abundances rise and fall in tandem, locked in a perpetual cycle. Phase-plane trajectories reveal closed orbits characteristic of a limit cycle attractor, consistent with a Hopf bifurcation. Biologically, this corresponds to a molecular Red Queen regime, in which continual adaptation of RNA and proteins is required to maintain functional compatibility.

3. Strong coupling (stable attractor).
At higher coupling strengths, the system converges to a stable interior equilibrium. Both RNA and protein populations reach finite, mutually reinforcing steady states. This coadapted attractor is resilient to perturbations, echoing the stability of ribonucleoprotein complexes such as ribosomes.

The simulations confirm the analytical prediction that coupling strength serves as a critical control parameter governing system dynamics. As coupling increases, the system transitions from collapse to oscillatory coevolution and finally to stable coadaptation. These transitions exemplify the bifurcation structure of molecular coevolution, where feedback and trade-offs generate emergent attractors with distinct evolutionary implications.
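A coupling sweep of this kind can be sketched as follows. The simulation setup is not specified in the text, so everything here is an assumption: the sketch integrates the self-damped reduced model (8) with symmetric parameters taken from the numeric illustration, uses a fixed-step RK4 integrator, and starts from $R=P=1$. It only contrasts the collapse and stable-attractor outcomes and does not attempt to reproduce the oscillatory regime.

```python
def rhs(R, P, b, a=0.01, kappa=1.0, c=0.05, s=1.0, mu=0.02):
    """Model (1) with fitness terms (8); symmetric parameters, coupling gain b."""
    F_R = a + b * P / (1.0 + kappa * P) - c * R
    F_P = a + b * R / (1.0 + kappa * R) - c * P
    return R * (s * F_R - mu), P * (s * F_P - mu)

def simulate(b, R0=1.0, P0=1.0, dt=0.05, t_end=1000.0):
    """Integrate with fixed-step RK4 and return the final state (R, P)."""
    R, P = R0, P0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(R, P, b)
        k2 = rhs(R + 0.5 * dt * k1[0], P + 0.5 * dt * k1[1], b)
        k3 = rhs(R + 0.5 * dt * k2[0], P + 0.5 * dt * k2[1], b)
        k4 = rhs(R + dt * k3[0], P + dt * k3[1], b)
        R += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        P += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return R, P

if __name__ == "__main__":
    for b in (0.01, 0.05, 0.2, 1.0):
        R_end, P_end = simulate(b)
        print(f"b = {b}: final R = {R_end:.4g}, final P = {P_end:.4g}")
```

Recording the full trajectory instead of only the final state would allow the same loop to be used for time-series and phase-plane plots.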

C. Empirical alignment with proteomic and genomic data

The simulation results align with several empirical signatures documented in comparative genomics and proteomics. This correspondence supports the plausibility of the CAS framework as a mechanistic explanation for RNA--protein coevolution.

1. Dipeptide frequency correlations.
The oscillatory regime predicted by the model implies fluctuating covariation between codon assignments and amino acid pairings. Empirical studies of dipeptide frequencies across diverse proteomes (as in the Journal of Molecular Biology dipeptide analysis) reveal persistent, non-random correlations between codon usage and structural motifs. These can be interpreted as molecular "fossils" of Red Queen--like cycles in which codon--protein pairings were periodically reshaped under coevolutionary dynamics.

2. Thermostability patterns.
The strong-coupling attractor regime corresponds to stable RNA--protein coadaptation optimized for robustness. Observed biases in thermophilic organisms, where codon usage is strongly correlated with thermostable dipeptides, match the model's prediction that attractors are shaped by environmental parameters such as temperature.

3. Genomic signatures of feedback.
Comparative genomics reveals conserved RNA motifs that consistently associate with specific protein domains, such as rRNA--protein complexes in ribosomes. These associations correspond directly to the model's bipartite mapping of RNA motifs to protein domains, reinforcing the idea of attractor-based modularity.

4. Evidence of punctuated transitions.
Molecular phylogenies often show abrupt shifts in codon reassignment and amino acid usage. These discontinuities are consistent with saddle-node bifurcations in the CAS model, where small parameter changes (e.g., mutation rate, resource availability) cause the system to jump between distinct coadapted states.

5. Red Queen dynamics at the molecular scale.
The existence of limit cycles in simulations parallels observed coevolutionary patterns, such as continual adjustments in tRNA modification and aminoacyl-tRNA synthetases. These empirical phenomena exemplify a molecular-scale Red Queen dynamic in which RNA and proteins co-adapt without reaching a static endpoint.

Taken together, the CAS model captures not only abstract dynamical possibilities but also empirical regularities in real molecular data. By reproducing observed correlations and discontinuities, the model provides a mechanistic foundation for patterns that descriptive or statistical approaches have thus far only catalogued.

VI. Discussion

A. Evolution as CAS: novelty and explanatory power

The modeling results demonstrate that framing molecular evolution as a Complex Adaptive System (CAS) provides explanatory leverage beyond that offered by traditional linear or reductionist models. Whereas classical evolutionary theory often emphasizes stepwise mutation and selection acting independently on sequences, the CAS perspective captures the nonlinear, interdependent, and emergent properties that are intrinsic to RNA--protein coevolution.

Several dimensions of novelty emerge from this framing:

1. Coevolution as emergent synchronization.
In conventional accounts, the relationship between RNA coding rules and protein structures is treated as sequential: one evolves first, the other follows. By contrast, the CAS model reveals that synchronization can emerge spontaneously from feedback dynamics, producing attractors where RNA motifs and protein domains co-stabilize. This shifts the explanatory narrative from "which came first" to "how mutual dependence produces stable evolutionary outcomes."

2. Red Queen dynamics at the molecular scale.
The model predicts oscillatory regimes in which neither RNA nor protein evolution reaches a static endpoint. This introduces the concept of a molecular Red Queen race, extending ecological Red Queen theory to the biochemical substrate of life. Empirical signatures such as continual tRNA--synthetase adjustments align with this prediction, suggesting that perpetual coevolution is a fundamental property of molecular codes.

3. Bifurcations as evolutionary punctuation.
CAS analysis demonstrates that small parameter shifts can induce bifurcations, leading to sudden transitions in coevolutionary dynamics. This provides a mechanistic basis for punctuated patterns observed in codon reassignment and amino acid usage. Rather than invoking historical contingency alone, the model identifies mathematical conditions under which discontinuities naturally arise.

4. Integration across scales.
By situating molecular evolution within a CAS framework, the model integrates micro-level biochemical interactions with macro-level evolutionary phenomena such as convergence, divergence, and adaptive radiation. This cross-scale explanatory power marks a significant advance over siloed approaches that separately analyze genetic, proteomic, and ecological data.

In summary, treating evolution as a CAS reframes long-standing paradoxes and aligns theoretical predictions with empirical regularities. It introduces novel explanatory categories---such as emergent synchronization, molecular Red Queen cycles, and attractor landscapes---that are not easily captured within the traditional mutation--selection framework.

B. Reconciling RNA-world and protein-first debates

The origin of life debate has long been polarized between two competing hypotheses: the RNA-world model, in which RNA served both as genetic material and catalyst before proteins emerged, and the protein-first model, which emphasizes the primordial role of peptides in catalysis with genetic coding appearing later. Each framework captures part of the evolutionary logic but struggles to account for the coevolutionary interdependence between nucleic acids and proteins observed in extant biology.

The CAS framework provides a resolution by reframing the debate not as a binary "first-mover" problem, but as an issue of emergent synchronization across coupled adaptive systems:

1. Simultaneity over precedence.
Our simulations reveal that RNA and protein populations can co-adapt through feedback loops, producing attractor states where each stabilizes the other. In this perspective, the question "which came first?" becomes less relevant than identifying the parameter regimes that enable stable coevolutionary attractors.

2. Partial functionalities as transitional states.
In the weak- and intermediate-coupling regimes, RNA and proteins exist but do not yet stabilize one another fully. These regimes correspond to transitional forms: RNA with limited catalytic capacity or peptides with rudimentary stability functions. Such partial functionalities provide a mechanistic explanation for how RNA-like and protein-like molecules could persist prior to full synchronization.

3. Emergent attractors as a synthesis.
Strong-coupling attractors correspond to ribonucleoprotein complexes---stable, coadapted structures such as ribosomes. The CAS framework predicts that once a system crosses the coupling threshold, a synchronized RNA--protein complex is not only possible but highly robust, explaining the universality of such complexes in modern life.

4. Reinterpretation of empirical evidence.
Observed signatures in proteomic dipeptide correlations and codon reassignment can be understood as "fossils" of intermediate coupling regimes. Rather than privileging RNA-first or protein-first narratives, these data support a coevolutionary trajectory, where both subsystems shaped one another under mutual selective pressures.

Thus, the CAS model reconciles the RNA-world and protein-first debates by showing that both contain partial truths. Life's origin may not be the story of one molecule dominating before the other, but of two interdependent codes evolving into synchrony through the dynamics of a complex adaptive system.

C. Trade-offs and eco-evolutionary analogues

A central feature of the CAS model is its ability to capture trade-offs that structure coevolutionary dynamics. In molecular evolution, RNA and proteins do not maximize a single function in isolation; rather, they operate under multi-objective constraints that resemble ecological interactions.

1. Stability versus adaptability.
RNA sequences optimized for rapid replication may sacrifice translational fidelity, while protein domains optimized for thermostability may limit catalytic flexibility. The CAS framework formalizes this as opposing terms in the fitness function, producing equilibria that represent compromises rather than absolute optima. This parallels ecological trade-offs between reproduction and survival in higher organisms.

2. Cooperative dependence and competition.
RNA and proteins are mutually reinforcing yet also impose metabolic and structural costs on one another. This duality is captured in the model by interaction coefficients that are positive at one scale (mutual reinforcement) but negative at another (resource limitation). Such dynamics are directly analogous to mutualism--parasitism continua in ecological systems.

3. Red Queen dynamics.
The oscillatory regime, in which RNA and protein abundances cycle perpetually, embodies a molecular-scale Red Queen effect: continual adaptation is required to maintain functional compatibility. This mirrors host--parasite cycles in ecosystems, suggesting that Red Queen dynamics are not limited to macroscopic organisms but are fundamental to adaptive systems across scales.

4. Attractors as eco-evolutionary niches.
Stable coadapted attractors can be understood as molecular niches, in which RNA motifs and protein domains are tuned to one another's presence. The transition between collapse, oscillatory, and stable regimes reflects niche establishment, competitive instability, and eventual ecological succession.

By framing molecular coevolution in terms of trade-offs and eco-evolutionary analogues, the CAS model provides a unified language for understanding adaptation across biological scales. RNA--protein dynamics are no longer an isolated puzzle but part of a continuum of adaptive processes that range from molecules to ecosystems.
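The idea in point 2, that an interaction can be reinforcing in one range and costly in another, can be made concrete with a small numerical sketch. The functional form below is an assumption chosen for illustration, not the interaction term of the CAS model: a saturating benefit `b*p/(k+p)` is set against a resource cost `m*p` that grows linearly with protein abundance `p`. The names `net_effect` and `crossover` and all parameter values are hypothetical.

```python
def net_effect(p, b=2.0, k=0.5, m=0.5):
    """Net per-capita effect of protein abundance p on RNA growth.

    benefit: saturating reinforcement b*p/(k+p)
    cost:    resource drain m*p, linear in abundance
    Positive values mean net reinforcement, negative values net cost.
    """
    return b * p / (k + p) - m * p


def crossover(b=2.0, k=0.5, m=0.5):
    """Abundance at which benefit equals cost (solves b/(k+p) = m for p > 0)."""
    return b / m - k


if __name__ == "__main__":
    for p in (1.0, crossover(), 5.0):
        print(f"p={p:.2f}: net effect={net_effect(p):+.3f}")
```

With these assumed values the net effect is positive at low abundance, zero at the crossover, and negative beyond it, which is the mutualism-to-parasitism shift the text describes.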

D. Implications for broader evolutionary theory, origin of life, and systems biology

The CAS framework for RNA--protein coevolution has implications that extend beyond the molecular level, touching upon fundamental questions in evolutionary theory, the origin of life, and the future of systems biology.

1. Reframing the origin of life.
The RNA-world and protein-first hypotheses have long been viewed as competing explanations. By demonstrating that coevolutionary attractors can emerge spontaneously once coupling passes a threshold, the CAS model reframes life's origin not as a sequential process but as a coevolutionary phase transition. This suggests that the emergence of ribonucleoprotein systems was not an improbable historical accident, but a mathematically natural outcome of complex adaptive dynamics.

2. Extending evolutionary theory.
Classical evolutionary biology has traditionally emphasized linear, incremental change shaped by mutation and selection. The CAS perspective highlights the importance of nonlinear feedback, emergent synchronization, and bifurcation phenomena, offering explanatory tools for discontinuities and convergences observed in evolution. This enriches evolutionary theory with a formalism that accommodates both gradualist and punctuated patterns within the same framework.

3. From molecules to ecosystems.
By revealing parallels between RNA--protein coevolution and ecological interactions, the model underscores the universality of adaptive principles across scales. Trade-offs, Red Queen dynamics, and attractor landscapes recur from molecular interactions to predator--prey systems, suggesting that evolution operates through scale-invariant CAS principles. This supports a unifying view in which the dynamics of molecules, organisms, and ecosystems are not fundamentally distinct, but governed by shared mathematical structures.

4. Implications for systems biology.
Systems biology seeks to integrate genomic, proteomic, and metabolic data into coherent models of living systems. The CAS framework offers a rigorous mathematical foundation for such integration by formalizing interdependence and emergent properties. Beyond explaining origins, it provides predictive power for synthetic biology, where engineered RNA--protein systems could be designed to exploit coevolutionary attractors for stability and adaptability.

In sum, the CAS approach not only resolves specific molecular puzzles but also provides a generalizable paradigm for understanding life as a hierarchy of interdependent adaptive systems. This reconceptualization carries profound consequences for evolutionary theory, the study of life's beginnings, and the design of artificial biological systems.

VII. Conclusion

A. Summary of contributions

This study advances a novel theoretical framework for understanding the coevolution of RNA and proteins by treating evolution as a Complex Adaptive System (CAS). Through analytical modeling and numerical simulations, we demonstrate that the dynamics of RNA--protein interactions exhibit distinct regimes: collapse under weak coupling, oscillatory Red Queen cycles under intermediate coupling, and stable coadapted attractors under strong coupling.

Our main contributions can be summarized as follows:

1. Mathematical formalization of RNA--protein coevolution.
We developed a system of coupled differential equations grounded in replicator--mutator dynamics and extended Lotka--Volterra formulations. This framework captures genotype--phenotype mapping, interdependent fitness trade-offs, and emergent synchronization.

2. Discovery of emergent dynamical regimes.
Analytical and simulation results reveal bifurcations that govern transitions between unsynchronized, oscillatory, and stable coadapted states. The oscillatory regime represents a molecular-scale Red Queen effect, while the stable attractor corresponds to ribonucleoprotein complexes.

3. Integration with empirical evidence.
The model aligns with proteomic and genomic data, including dipeptide frequency correlations, thermostability biases, and signatures of codon reassignment. These empirical patterns can be interpreted as outcomes of CAS dynamics rather than historical contingencies alone.

4. Broader theoretical significance.
By reframing the RNA-world versus protein-first debate in terms of synchronization rather than precedence, the CAS model resolves a long-standing paradox. More broadly, it unifies molecular, organismal, and ecological evolution under shared CAS principles, offering a scalable and predictive evolutionary theory.

Together, these contributions establish CAS modeling as a rigorous and reproducible tool for exploring the emergent complexity of molecular evolution.

B. Path forward: empirical calibration and comparative studies

While the CAS framework provides a mathematically rigorous account of RNA--protein coevolution, its future development depends on systematic empirical calibration and comparative analysis. Several avenues are especially promising:

1. Parameter calibration with molecular datasets.
Codon usage bias, dipeptide frequency distributions, and tRNA synthetase--substrate affinities can provide quantitative estimates of model parameters. High-resolution ribosome profiling and structural proteomics offer empirical bases to refine interaction coefficients and trade-off functions.

2. Testing bifurcation predictions.
The model predicts threshold-like transitions in coding stability and structural robustness. Comparative studies across extremophiles, mesophiles, and synthetic constructs could test for discontinuities in codon reassignment and thermostability patterns that reflect bifurcation behavior.

3. Red Queen dynamics in molecular data.
Longitudinal analyses of rapidly evolving systems, such as RNA viruses and their host proteins, provide natural laboratories for identifying oscillatory coevolutionary cycles. Observed shifts in viral codon usage and host tRNA modifications could be mapped directly onto predicted limit cycle regimes.

4. Cross-taxa comparative studies.
Applying the CAS model to diverse lineages---from prokaryotes to eukaryotic organelles---can reveal whether similar attractor structures govern coadaptation universally. Comparative proteogenomics could test whether distinct evolutionary solutions reflect different regions of the CAS parameter space.

5. Integration into systems and synthetic biology.
Synthetic biology provides a unique platform for experimentally probing coevolutionary attractors. By engineering minimal RNA--protein systems, researchers can validate CAS predictions regarding stability, oscillation, and collapse.

These directions emphasize that the CAS framework is not a purely theoretical exercise but a testable and extensible model. Its value will be measured not only by its explanatory scope but by its ability to integrate with empirical evidence and to guide experimental design.
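As one possible starting point for the longitudinal analyses in point 3, a candidate cycle length can be estimated from a time series by locating the strongest peak in its autocorrelation. The sketch below is an illustrative assumption about how such a screen might look, not a method used in this study. The function names `autocorrelation` and `dominant_period` are hypothetical, and the input is synthetic data, not measured codon usage.

```python
import math


def autocorrelation(series, lag):
    """Biased sample autocorrelation of a mean-centred series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    denom = sum(x * x for x in centred)
    num = sum(centred[t] * centred[t + lag] for t in range(n - lag))
    return num / denom


def dominant_period(series, max_lag):
    """Return the lag with the highest autocorrelation after the first negative value.

    Returns None if the autocorrelation never turns negative within max_lag,
    which is treated here as "no cycle detected".
    """
    acf = [autocorrelation(series, lag) for lag in range(1, max_lag + 1)]
    first_negative = next((i for i, v in enumerate(acf) if v < 0), None)
    if first_negative is None:
        return None
    best = max(range(first_negative, len(acf)), key=lambda i: acf[i])
    return best + 1  # acf[0] corresponds to lag 1


if __name__ == "__main__":
    # Synthetic oscillation with a known period of 50 samples (not real data).
    synthetic = [math.sin(2 * math.pi * t / 50) for t in range(500)]
    print("estimated period:", dominant_period(synthetic, max_lag=100))
```

Real longitudinal data would be noisy and unevenly sampled, so a screen of this kind would at most flag candidate cycles for closer statistical testing.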

C. Evolution reframed as CAS across scales

At its core, the CAS framework redefines evolution not as a linear sequence of mutations shaped by static selection pressures, but as the emergence of order from interacting adaptive subsystems. The RNA--protein case study demonstrates how synchronization, oscillation, and stability can arise from feedback dynamics between two molecular codes. Yet the broader implication is that these principles are not confined to molecular biology.

Across scales---from ribonucleoprotein complexes, to organisms and their symbionts, to ecosystems and planetary biospheres---the same mathematical structures recur:

Trade-offs balancing robustness and adaptability.

Red Queen dynamics driving continual co-adaptation.

Bifurcations punctuating gradual trajectories with sudden shifts.

Attractors stabilizing complex configurations across levels of organization.

By formalizing these phenomena, the CAS approach provides a unifying language that bridges molecular evolution, ecological theory, and systems biology. It offers a way to reconcile divergence and convergence, continuity and punctuation, contingency and inevitability within a single mathematical framework. Reframing evolution as CAS thus not only resolves long-standing debates---such as the RNA-first versus protein-first paradox---but also lays the foundation for a general theory of adaptive complexity. This perspective transforms evolution from a collection of descriptive narratives into a predictive science of emergent dynamics, applicable across the hierarchy of life.

References

Eigen, M., & Schuster, P. (1979). The Hypercycle: A Principle of Natural Self-Organization. Springer.
Kauffman, S. A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.
Levin, S. A. (1998). Ecosystems and the biosphere as complex adaptive systems. Ecosystems, 1(5), 431--436. https://doi.org/10.1007/s100219900037
Maynard Smith, J., & Szathmáry, E. (1995). The Major Transitions in Evolution. W.H. Freeman.
Nowak, M. A., & Sigmund, K. (2004). Evolutionary dynamics of biological games. Science, 303(5659), 793--799. https://doi.org/10.1126/science.1093411
Woese, C. R. (1967). The genetic code: The molecular basis for genetic expression. Harper & Row.
Knight, R. D., Freeland, S. J., & Landweber, L. F. (1999). Selection, history and chemistry: The three faces of the genetic code. Trends in Biochemical Sciences, 24(6), 241--247. https://doi.org/10.1016/S0968-0004(99)01432-0
Kun, Á., & Szathmáry, E. (2015). Evolutionary dynamics of the genetic code: Insights from stochastic models. BioSystems, 140, 19--29. https://doi.org/10.1016/j.biosystems.2015.12.001
Lane, N. (2010). Life Ascending: The Ten Great Inventions of Evolution. W.W. Norton.
Vetsigian, K., Woese, C., & Goldenfeld, N. (2006). Collective evolution and the genetic code. Proceedings of the National Academy of Sciences, 103(28), 10696--10701. https://doi.org/10.1073/pnas.0603780103
Chen, X., & Li, H. (2025). Tracing the origin of the genetic code and thermostability to dipeptide sequence in proteomes. Journal of Molecular Biology, 434(2), 107689. https://doi.org/10.1016/j.jmb.2025.107689
Van Valen, L. (1973). A new evolutionary law. Evolutionary Theory, 1, 1--30.
Phillips, P. C. (2008). Epistasis --- the essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics, 9(11), 855--867. https://doi.org/10.1038/nrg2452
Wagner, A. (2011). The Origins of Evolutionary Innovations: A Theory of Transformative Change in Living Systems. Oxford University Press.
Shapiro, J. A. (2011). Evolution: A View from the 21st Century. FT Press Science.
Noble, D. (2012). A theory of biological relativity: No privileged level of causation. Interface Focus, 2(1), 55--64. https://doi.org/10.1098/rsfs.2011.0067
Farmer, J. D., & Packard, N. H. (1986). Evolution, games, and learning: Models for adaptation in machines and nature. Physica D: Nonlinear Phenomena, 22(1--3), 187--204. https://doi.org/10.1016/0167-2789(86)90239-6
Goldenfeld, N., & Woese, C. (2011). Life is physics: Evolution as a collective phenomenon far from equilibrium. Annual Review of Condensed Matter Physics, 2(1), 375--399. https://doi.org/10.1146/annurev-conmatphys-062910-140509
Szathmáry, E., & May, R. M. (1995). To be or not to be: Evolutionary transitions in individuality. Journal of Theoretical Biology, 173(4), 473--481. https://doi.org/10.1006/jtbi.1995.0066
Levin, S. A., & Lubchenco, J. (2008). Resilience, robustness, and marine ecosystem-based management. BioScience, 58(1), 27--32. https://doi.org/10.1641/B580107
