It is impossible to determine whether a complex metabolism based solely on ribozymes actually existed, because any putative RNA catalysts from this era have long since been supplanted by protein enzymes and, thus, the information necessary for exactly reconstructing a hypothesized "RNA world" has largely been lost. Nevertheless, it has been possible to verify experimentally whether critical features of RNA world models are chemically feasible. In particular, experiments can discern (1) whether the precursors of an RNA world, ribonucleotide monomers and oligomers, can be synthesized under conditions similar to those that existed in the prebiotic environment; (2) whether ribozymes can catalyze template-directed replication; and (3) whether ribozymes can catalyze a range of reactions similar to that of protein enzymes.
Until recently, experiments were largely restricted to comparing schemas for the prebiotic synthesis of oligonucleotides and their precursor monomers (point 1 above). However, advances in nucleic acid amplification technologies have made possible the isolation and engineering of new ribozyme activities (points 2 and 3). The bounds of RNA catalysis have already expanded beyond phosphodiester bond rearrangements with the discovery of ribozyme esterase and peptidyl-transferase activities, and in the next few years seminal enzymes such as kinases, oxidoreductases, and, most importantly, replicases should be added to this list.
Although there are a number of inefficient steps in most proposed prebiotic syntheses of ribotides, the major objection to RNA as the progenitor of life has been the relatively small yield of ribose in the formose reaction, a simple condensation of glycolaldehyde. Muller et al., however, have discovered a variation of the formose reaction that produces a limited mix of pentose diphosphates in which the ribose forms predominate (52:14:23:11, ribose:arabinose:lyxose:xylose). Although many critical chemical roadblocks remain (such as the extremely low yield of pyrimidine nucleosides following the condensation of ribose and free bases), this advance belies the previously held view that products of the formose reaction are necessarily so chemically diverse that they are "the carbohydrate analog of petroleum."
Given a ready supply of ribonucleotides, it is tempting to imagine that the catalysts of an RNA world were selected from pools of random RNA sequences, and that functional molecules were propagated by template-directed polymerization. The first problem is, of course, the generation of a pool of templates. The de novo formation of short oligonucleotides (2 - 6 bases) from ribotide monomers using prebiotic activating agents and mineral catalysts has been reported, but the internucleotide linkages are generally 2' to 5', rather than 3' to 5'. Longer oligomers can be constructed by ligating short pieces together: self-complementary dimers have been shown to form chains as long as 30 bases.
The difficulties encountered in the untemplated synthesis of "natural" nucleic acids are one factor that has prompted many researchers to propose alternatives to RNA as the initial genetic material. For example, a derivative of 3'-aminoguanosine can form untemplated chains of up to 20 bases in length. In addition, both cyclic and acyclic nucleotide bisphosphate derivatives polymerize without the aid of a template to form nucleic acids with pyrophosphate linkages between the monomers, rather than the more "natural" phosphodiester linkages.
Assuming the existence of an RNA template, the polymerization of activated ribonucleotides into polymers containing 3', 5'-phosphodiester linkages proceeds readily. Using RNA homopolymers as templates, products of up to 50 bases in length have been observed, and when RNA heteropolymers are used as templates the reaction is faithful. These reactions can be catalyzed by a variety of metal ions, and seem to occur most easily with ribose-linked nucleotides. Short oligonucleotides can also serve as substrates for template-directed polymerization, and they have the advantage that the ligated product can exceed the length of the templating molecule.
The prebiotic relevance of these reactions is open to question, though, since early template-directed polymerizations would have occurred in a heterologous mix of nucleotide isomers and enantiomers. In these circumstances, it is unlikely that an all RNA genetic system (and, hence, an RNA world) could have arisen by the reactions so far examined. Some self-selection of enantiomerically pure polymers may have occurred, but RNA is also known to template efficiently the polymerization of monomers other than ribotides, such as acyclic nucleotide derivatives. Thus, the chemical composition of nascent templates could not have been efficiently reproduced. In addition, the replication of RNA templates is inhibited by the addition of stereochemically "wrong" isomers. This phenomenon, which has been termed enantiomeric cross-inhibition by Joyce, Orgel, and co-workers, is another motivation for the idea that ribotides were preceded by prochiral, acyclic versions of the nucleotides. Although the problem of prebiotic chain termination may have been obviated by some mechanisms, such as the preferential formation of inactive cyclic phosphates in non-ribose-based nucleotides, nevertheless it is difficult to imagine the template-directed polymerization of monomers into long RNA chains in the early biosphere.
So far, we have constructed an unsatisfying picture of the earliest days of an RNA world: although some prebiotic mechanisms may exist for the untemplated formation of oligonucleotides, these molecules would have been short, would have contained a variety of monomers besides ribotides, and could not have been faithfully copied by the template-directed polymerization of monomers. Given this model, it is difficult to imagine the accumulation of RNA sequences necessary for the Darwinian selection of a multitude of active ribozymes. Nevertheless, these precursors may have been adequate for the first critical step in the formation of life: the formation of an RNA replicase.
Although self-replicating systems need not be based on nucleic acid complementarity, as Rebek and co-workers have artfully shown, very simple self-replicating RNAs can be easily devised. Both von Kiedrowski and Zielinski and Orgel have examined systems in which a palindromic oligonucleotide templates the ligation of two shorter, appropriately activated oligonucleotides. Because the template is palindromic, the products of the desired reaction become templates for further cycles of ligation, and the ligated material accumulates autocatalytically. The selection of individual sequences from a complex mixture may be possible in such systems, since different templates have been found to have very different replication proficiencies. However, sequence competition is likely to be based on very simple considerations, such as how the 3'-hydroxyl (or 3'-amine, in the case of some analogs) is aligned on a template relative to the 5'-leaving group of an adjacent molecule.
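The logic of a palindromic (self-complementary) template can be made concrete with a short sketch. The hexamer and its two trimer halves used below are illustrative choices and are not taken from the experiments cited above; the function names are likewise hypothetical. The sketch only checks base-pairing bookkeeping and says nothing about reaction chemistry or rates.

```python
# Watson-Crick pairing rules for RNA (wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that pairs with seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_self_complementary(seq):
    """True if seq is 'palindromic' in the nucleic acid sense."""
    return seq == reverse_complement(seq)


def ligation_product(template, left, right):
    """Join two substrates if, once joined, they pair fully with the template.

    Returns the ligated strand, or None if the substrates do not align.
    """
    product = left + right
    return product if product == reverse_complement(template) else None


# Illustrative sequences (an assumption, chosen only because they pair correctly).
template = "CCGCGG"
product = ligation_product(template, "CCG", "CGG")
print(is_self_complementary(template), product == template)
```

Because the template equals its own reverse complement, the strand ligated on it is another copy of the template, which is why the ligated material can feed back into further cycles.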
More efficient self-replicating sequences will presumably utilize more mechanistically complex modes of catalysis, for example, stabilizing the 5'-leaving group (e.g., stabilizing the negative charge that develops as a pyrophosphate is displaced from a nucleotide triphosphate), increasing the nucleophilicity of the attacking 3'-hydroxyl, or binding the bipyramidal geometry of the phosphodiester transition state intermediate. These more involved reaction mechanisms will necessarily be found in molecules that can assume conformations more elaborate than that of a simple double helix. Because the relationship between sequence, structure, and function is nonintuitive, though, it is difficult to envision just how structurally complex the early self-replicases were. Still, it may be possible to develop working models by engineering modern ribozymes.
The Cech and Szostak groups have pioneered the creation of an RNA replicase by using the ligation activity of the Tetrahymena group I self-splicing intron as an exemplar of ribozyme-directed RNA polymerization. Cech and co-workers have shown that this ribozyme can perform the template-independent elongation of short oligonucleotides using dinucleotide substrates of the form GpN, in which the 5'-terminal guanosine is the leaving group. Bartel et al. have further shown that Watson-Crick base pairing to a template can direct the sequential addition of nucleotides from GpN substrates. These efforts demonstrate that monomers can be ribozymatically polymerized, but attempts at template-directed polymerization have so far been limited to only short stretches of template (2 - 3 bases). In addition, the fidelity of this reaction appears to be limited: at best the correct nucleotide is picked 25 times more readily than a mismatch, while at worst it is preferred only 2-fold.
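To give a feel for what these preference ratios imply, the sketch below converts a fold-preference for the correct nucleotide into a per-position accuracy and then into the fraction of error-free copies. It rests on assumptions that are not part of the reported experiments: three mismatched competitors present at the same concentration as the correct substrate, each disfavored by the same factor, and independent errors at each position. The function names are hypothetical.

```python
def per_base_accuracy(preference, n_competitors=3):
    """Chance of adding the correct nucleotide at one template position.

    Assumes n_competitors mismatched substrates at equal concentration,
    each incorporated 1/preference as readily as the correct one.
    """
    return preference / (preference + n_competitors)


def error_free_fraction(preference, length, n_competitors=3):
    """Fraction of copies of a given length with no misincorporation,
    assuming errors at different positions are independent."""
    return per_base_accuracy(preference, n_competitors) ** length


for preference in (25, 2):
    accuracy = per_base_accuracy(preference)
    print(f"{preference}-fold preference: per-base accuracy {accuracy:.2f}, "
          f"error-free 10-mers {error_free_fraction(preference, 10):.3g}")
```

Under these assumptions even the best case falls well short of the accuracy needed to copy a long template without error, which is the concern raised in the text.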
More substantial progress toward a self-replicase has been made by focusing on the ribozyme-catalyzed ligation of short oligonucleotides. The first step in the splicing cascade, the cleavage of the 5' intron-exon junction by guanosine, is freely reversible and, hence, can serve as a model for template-directed ligation (Fig. 2a). To allow an engineered replicase to work on templates in trans, a piece of the intron containing the cleavage/ligation junction was separated from the catalytic core (Fig. 2b). Next, the stem-loop containing the cleavage-ligation junction was broken into three pieces: the two oligonucleotides that are joined during the ligation reaction and the template on which they are aligned (Fig. 2c). Finally, it was demonstrated that multiple aligned oligonucleotides could be ligated together into a cRNA product 40 bases in length (Fig. 2d). A potential objection to this scheme as a model for the origin of life might be that the ligation reaction is most efficient with oligonucleotides of 5 to 9 bases in length, whereas the majority of prebiotically synthesized oligonucleotides may have been shorter. Recently, however, the reaction conditions have been optimized to ligate successfully cRNA pieces of only 3 bases in length. The separation of catalyst, template, and cRNAs should, in theory, allow an engineered replicase to move back and forth between the ribozyme phenotype embodied in a "positive" strand and the information carried by cRNA in a "negative" strand.
At this point, it should be noted that there are several possible problems with this system serving as a paradigm for the origin of life. Any replicating system, whether composed of RNA or not, would have faced difficulties if it were free in the primordial soup. The concentration of "food" may have been extremely low; vagaries of the environment, such as metal ions, may have contributed to hydrolysis; and there is no assurance that a new catalyst would function so that its descendants (as opposed to "foreign" molecules) would be preferentially replicated. All these problems can be reduced or eliminated by somehow containing the replicating system, and we must assume that this was an important step in the early history of life. The exact nature of the containment is unknown, but it could have been achieved within vesicles that were conceptually similar to modern cells, as an organic layer on a mineral surface, or even by developing features which favor self-recognition (e.g., Weiner and Maizels have proposed that a stem-loop structure similar to tRNA may have "tagged" early molecules).
Further, the use of oligonucleotides does not eliminate the problems of fidelity that were encountered in the polymerization of monomers. As the length of the substrate oligonucleotide increases, so too does the number of competing oligonucleotides with similar sequences. If faithful copies of the parent ribozyme are to be synthesized, some mechanism for the recognition of correctly base-paired substrate-template complexes must be incorporated into the replicase. This is mechanistically feasible: the Tetrahymena ribozyme recognizes the geometry of wobble base pairs at the ligation junction; Herschlag and Cech have demonstrated that the effect of a mismatch on the recognition of a duplex oligonucleotide substrate exceeds the energetic contribution of the mismatch to the formation of the duplex (i.e., some element of the correctly paired structure is recognized by the ribozyme); and Bartel et al. have demonstrated that, for GpN polymerization, mismatches affect not only binding but also the rate of catalysis.
Some authors have also pointed out that an RNA replicase would face inherent difficulties with processivity (owing to intramolecular structure formation) and with separation of a newly synthesized strand from its template (because of the structural stability of double-stranded RNA). However, contemporary sequences, such as Qβ phage and the "X" RNA replicated by T7 RNA polymerase, have managed to solve these problems by using sequences that can gather into stable intramolecular secondary structures, and thus kinetically compete during synthesis with the formation of intermolecular double-stranded RNA. Alternatively, since life may have originated in an environment comparable to a hydrothermal vent, a sort of primordial polymerase chain reaction (PCR) might have occurred along a spatial (as opposed to temporal) temperature gradient: cRNA would have denatured from RNA duplexes at high temperatures, and new oligonucleotide substrates would have annealed and polymerized at lower temperatures. The intrinsically high temperature optima of most ribozymes (the Tetrahymena ribozyme is active at temperatures in excess of 50°C) would have been well-suited to this environment. Finally, the problem of strand separation has been approached in the engineered group I RNA ligase by dividing up the ribozyme into three smaller subunits. Each subunit has minimal secondary structure by itself, and thus can be copied readily. Following replication and strand separation, though, the subunits can pair with each other to form the complicated tertiary structure of the active ribozyme. It remains to be seen whether the intersubunit interaction energy is large enough to drive the denaturation of cRNA and template.
There are several reasons to believe that an RNA self-replicase, similar to the RNA ligase described here, could have heralded the evolution of an RNA world from a complicated mix of ribose and nonribose nucleotides. First, given the wide range of prebiotic nucleic acids, ribose-based polymers may be the most eminently suited for catalysis. Eschenmoser has pointed out, for example, that nucleic acids constructed from hexose nucleotides form inflexible ribbon structures, poorly suited for convoluting into the complex shapes that are required for catalysis (e.g., the backbone of the projected tertiary structure of the Tetrahymena self-splicing intron folds back on itself a number of times). Conversely, backbones composed of acyclic nucleotides may be too flexible to adopt stable secondary structures (since a great deal of entropy would necessarily be lost on "freezing" into a given conformer). Ribose, on the other hand, has a limited flexibility because of its pseudorotation cycle, and RNA can adopt a variety of helical conformations.
Second, a self-replicase that used oligonucleotides as substrates instead of mononucleotides would partially avoid the problem of enantiomeric cross-inhibition. Nonribose monomers located in the middle of a substrate oligonucleotide would have only an indirect effect on the alignment of the 3'-hydroxyl and 5'-leaving group in a ligation reaction. In fact, some ribose moieties in the "template" and "substrate" strands (see Fig. 2 for description) of the Tetrahymena ribozyme can be substituted with deoxyribose without appreciable loss of catalytic activity. Also, since the ligation reaction is readily reversible, cRNA strands whose growth was terminated by oligonucleotides containing nonribose bases at their 3' ends could be "edited" out and replaced with oligonucleotides that could be productively elongated (of course, if the editing reaction is not specific it could lead to the cleavage of template RNA as well).
Finally, the use of oligonucleotides as substrates may provide a mechanism for preferentially replicating RNA templates. The oligomers in a prebiotic mix that contained a larger proportion of ribose residues should have annealed more readily to an RNA template than those oligonucleotides that contained fewer ribose-linked bases. Duplexes formed solely from RNA are more stable than duplexes in which one of the strands contains even a minor chemical modification, namely, ribose to deoxyribose. In addition, acyclic residues are found to decrease significantly the melting temperature of DNA duplexes in which they are included.
Taken together, the experiments described above suggest a more optimistic model for an early RNA world than that based solely on prebiotic synthesis. Short oligonucleotides would have condensed into longer products via a series of autocatalytic ligation reactions. These self-replicating oligomers can be viewed as extremely primitive living systems that served primarily to amass sequence complexity for further selection. Sequences that could enhance their own syntheses by mechanisms similar to those found in modern ribozymes would have enjoyed the largest selective advantage. The first self-replicating ribozymes would have been mixed polymers but may eventually have evolved to contain only RNA because nucleic acids containing ribose backbones may have catalytic properties superior to nucleic acids containing other enantiomers in their backbones, and because ribose-linked substrates may have annealed most readily to ribose-linked templates. A more advanced replicase would of course be able to exercise stereochemical selectivity for its substrates, just as the Tetrahymena ribozyme can today. The "crystallization" event that drove the formation of an RNA (read: all ribose) world was not some peculiar attribute of prebiotic chemistry; rather, it was the transition from uncatalyzed polymerization to catalyzed self-replication.
Although engineering a group I intron to be an RNA ligase is a practical demonstration that ribozyme-directed RNA polymerization could have existed in the past, it does not prove that the earliest self-replicases were similar to self-splicing introns. Obviously, other phosphodiester transfer reactions could be invoked as potential starting points for RNA self-replicases; for example, the "hammerhead" RNA motif is much smaller than a group I self-splicing intron and would have arisen more readily by chance. In addition, the hammerhead ribozyme utilizes nucleoside 2',3'-cyclic phosphates in ligation reactions, and these can be synthesized by prebiotic routes. Finally, it might be expected that polymerization catalysts that do not have to bind a leaving group would require less chemical sophistication than those that do. On the other hand, two groups have recently demonstrated the existence of homologous group I introns in a phylogenetically diverse set of organisms, from cyanobacteria to higher plants such as tobacco. The most parsimonious explanation for the occurrence of these introns is an insertion event that would date the group I motif to greater than 2.0 billion years ago.
Based on chemical considerations alone, ribozymes should be able to catalyze many different types of reactions. Ribozymes can maintain defined secondary and tertiary structures, just as protein enzymes do. Ribozymes can interact with substrates specifically via hydrogen bond networks, just as protein enzymes do. Finally, ribozymes have available to them a chemistry that, while more limited than that of proteins, is substantial. RNA contains proton donors and acceptors with pKa values that cluster at 4 and 9. The critical lack of a good donor/acceptor with a pKa near 7 can be rectified by any of several simple expedients, such as modification of guanosine to 7-methylguanosine, protonation of triple base-paired cytosine, or inclusion of a proton donor/acceptor in an environment with a different polarity than water (in this respect, it is interesting to note that Dahm and Uhlenbeck have found that the cleavage reaction catalyzed by the hammerhead ribozyme is dependent on some dissociable proton with a pKa of 8.0).
A broad catalytic repertoire for RNA is not only theoretically possible but experimentally demonstrable. Ribozymes that catalyze reactions other than phosphodiester bond rearrangements have recently been engineered or discovered. Piccirilli et al. have changed the substrate specificity of the Tetrahymena self-splicing ribozyme from tetrahedral phosphates to planar esters. Essentially, an ester bond was forced into the active site of a group I intron by tethering an amino acid (methionine) to an oligoribonucleotide sequence that would normally position a phosphodiester bond in the active site for cleavage. The ester substrate was hydrolyzed 5-fold better in the enzyme active site than in solution. Similarly, to answer the question of whether ribosomal RNA actively catalyzes protein synthesis, Noller et al. dissected the peptidyltransferase activity of 50 S (large) ribosomal subunits purified from an obligate thermophile. Deproteination of Thermus aquaticus large subunit ribosomal RNA was vigorous and thorough: isolated ribosomes were treated with proteinase K and sodium dodecyl sulfate (SDS) at 60°C and subsequently extracted with phenol. These techniques were sufficient to remove up to 99% of the protein normally associated with the ribosome, and the remainder appeared to be primarily in the form of small peptide fragments. The deproteinated RNA retained almost full amide bond-forming activity and was found to be inhibited by the same RNA-binding antibiotics that normally inhibit protein synthesis.
The most important criterion for a biological catalyst, however, is whether it can serve as a surface that enhances the reactivity of specific ligands; in other words, can it preferentially bind to a transition state, or provide the correct electrostatic environment for reaction, or juxtapose two substrates in a chemically productive fashion? Amplification techniques such as PCR and transcription-based amplification systems (TAS) have made it possible to select nucleic acid sequences that can present surfaces that are complementary to individual molecular shapes. In these in vitro genetic methods, a random or semirandom pool of nucleic acids is constructed, and those variants that can bind to a given ligand are selectively amplified. For example, RNA and DNA sequences that can bind to nucleic acid-binding proteins have been selected from pools containing from 8 to 26 randomized bases. RNA molecules that can form surfaces complementary to small organic dye molecules have been selected from pools containing 100 degenerate positions. The interactions with the dye molecules are specific, and particular residues appear to be involved in dye binding.
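The way repeated cycles of selection and amplification enrich rare binders can be illustrated with a toy calculation. In the sketch below, every binder is assumed to be retained on the ligand, nonbinders are carried over at a fixed background rate, and amplification is assumed to restore the pool size without changing its composition. The starting fraction and the background rate are hypothetical values chosen for illustration, not measurements, and the function names are likewise invented here.

```python
def enrich(fraction, background):
    """Fraction of binders after one round of selection and amplification.

    fraction:   share of the pool that binds the ligand before the round
    background: share of nonbinders that survive the selection step anyway
    """
    kept_binders = fraction                      # all binders assumed retained
    kept_others = background * (1.0 - fraction)  # nonspecific carry-over
    return kept_binders / (kept_binders + kept_others)


def rounds_until(fraction, background, target=0.5, max_rounds=50):
    """Count rounds needed for binders to reach the target share of the pool."""
    rounds = 0
    while fraction < target and rounds < max_rounds:
        fraction = enrich(fraction, background)
        rounds += 1
    return rounds, fraction


# Hypothetical inputs: one binder per 10^10 molecules, 0.1% background.
rounds, final = rounds_until(1e-10, 1e-3)
print(rounds, round(final, 3))
```

In this idealized model each round multiplies the odds of drawing a binder by the inverse of the background rate, so even a vanishingly rare sequence comes to dominate the pool within a handful of cycles.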
Although no new RNA enzymes have yet been selected, RNA molecules that can bind to specific compounds, so-called aptamers, may be engineered to act as ribozymes. For example, it may be possible to engineer a "ribodiaphorase" from an NAD-binding RNA molecule (Fig. 3). FMN can be positioned near the NAD binding site by inclusion of a site-specific oligonucleotide tail, and electron transfer should be facile owing to proximity effects alone. Such a catalyst would demonstrate that early ribozymes could have relied on base pairing to position substrates in their active sites. Oligonucleotide tails could have acted as convenient "handles," a role which is even now ascribed to the otherwise superfluous nucleotide moieties of coenzymes.
Selecting or engineering new ribozymes will not only demonstrate that an RNA world was possible, but it will also give us an idea of how probable it might have been. For example, the rough probability that a given RNA molecule will bind to an organic dye is 1:10^10. This implies that even 1 µg of material (of ~100 bases in length) would contain 1000 functional sequences. In addition, although the original pool contained 100 degenerate positions, the binding sites in the selected molecules were only approximately 20 to 30 bases in length; this is within the range of lengths that can be produced by even untemplated condensations. Most importantly, in vitro selection experiments have demonstrated that multiple independent "solutions" to the same ligand binding "problem" may exist. For example, when RNA molecules that could bind to T4 DNA polymerase were selected, a sequence that differed significantly from the wild-type was also found. Similarly, when RNA molecules that could bind to organic dyes were cloned, the aptamers revealed little or no sequence similarity. If it is generally true that there are large numbers of independent sequences with similar functions in a random pool, then it would not have been necessary to search sequence space exhaustively for a particular molecule during the course of early evolution. Even relatively small numbers of random oligonucleotides may have contained phenotypes that could be selected.
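The estimate of functional sequences per microgram can be checked with a few lines of arithmetic. The sketch below assumes that the binding probability is about one molecule in 10^10 and that an average ribonucleotide residue has a mass of roughly 330 g/mol; the residue mass is an approximation introduced here, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23    # molecules per mole
AVG_NT_MASS = 330.0    # g/mol per ribonucleotide residue (approximate, assumed)


def molecules_in_sample(mass_g, length_nt):
    """Number of RNA molecules of a given length in a sample of given mass."""
    molar_mass = AVG_NT_MASS * length_nt
    return mass_g / molar_mass * AVOGADRO


def expected_binders(mass_g, length_nt, p_functional):
    """Expected number of functional molecules in a random-sequence sample."""
    return molecules_in_sample(mass_g, length_nt) * p_functional


total = molecules_in_sample(1e-6, 100)          # 1 microgram of 100-mers
binders = expected_binders(1e-6, 100, 1e-10)    # assumed odds of 1 in 10^10
print(f"{total:.1e} molecules, about {binders:.0f} expected binders")
```

With these inputs the sample holds on the order of 10^13 molecules and therefore on the order of 10^3 binders, consistent with the figure of roughly 1000 functional sequences quoted above.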
First, the question of where nucleotide monomers may have come from is critical. Given that the formose reaction is the most likely candidate for the synthesis of prebiotic ribose, but yields very little pure material, the role of stereoselective catalysts (clays, amino acids, or lipid aggregates) in directing the reaction should be fully explored. In this respect, Wachtershauser has advanced a scheme for nucleotide synthesis based on pyrite catalysis that can be readily tested.
Second, the "crystallization" event that led to the selection of RNA as the principal prebiotic biopolymer can be more thoroughly explored. If nucleotide analogs preceded nucleotides as monomeric building blocks, then it should be possible to selectively polymerize activated ribonucleosides on "unnatural" templates. Very little is known about the production of hybrid helices involving both RNA and "other" strands, yet such structures must be postulated as transitory intermediates on the way to contemporary genetic material. In addition, the self-selection of particular RNA oligonucleotides (Fig. 1) can be attempted from random sequence mixtures or from mixtures in which some of the oligonucleotides have been "poisoned" by inclusion of enantiomerically impure monomers (e.g., mixed ribose-arabinose backbones). Ligation may avoid the problems of enantiomeric cross-inhibition described for polymerization.
Third, the search for a self-replicating RNA catalyst can be continued either by deconstructing known catalytic activities or by recapitulating events that may have occurred in the primordial soup. The template-directed polymerization of short oligonucleotides may soon lead to the replication of small ribozyme subsegments and to the creation of replicating RNAs from the bottom up. An alternative top-down approach, though, would be to determine how a whole functional ribozyme could be divided into replicable subsegments. For example, the Tetrahymena self-splicing intron could be split into two (or more) portions. Successful template-directed ligation of these portions would lead to an increase in the amount of catalyst available to perform more ligations. The template could be similarly divided, and mixtures of oligonucleotides could undergo amplification cycles in which first catalyst (ribozyme) and then template molecules were reconstructed. By determining where and how scissions could be introduced into the original ribozyme and template molecules, a self-replicating system based on short oligonucleotides could be derived. Finally, a somewhat impractical experiment that mimics both the origin of life and the computer simulations dubbed cellular automata would be to mix random sequence oligonucleotides with nucleotide monomers and determine what sequences could "take over" the population by replicating themselves. Given the probable complexity (and, hence, relative scarcity in a random sequence population) of a ribozyme polymerase, and the problems of self-recognition and hydrolytic side reactions, this experiment is unlikely to work. However, a more modest task could be set to a random sequence RNA pool, such as catalyzing a single ligation reaction. The sequence complexity of catalysts such as the hairpin ribozyme makes it likely that molecules can be found that will enhance the ligation of oligonucleotides over background rates in solution. Such catalysts could be selected by amplification methods similar to those found in Green et al. or Robertson and Joyce.
Finally, in vitro selection experiments can be used to generate new ribozymes. Catalytic antibodies have been selected by immunizing animals with transition state analogs (TSAs; molecules that mimic high-energy reaction intermediates), and the same approach should work using random sequence pools of nucleic acids. Affinity columns with TSA ligands could be used to purify nucleic acids from random sequence pools. Multiple cycles of selection and amplification should result in the isolation of sequences that can bind the TSA tightly and specifically. These sequences can then be tested for catalytic activity against substrates for the reaction that the TSA mimics. The replicability of nucleic acids in vitro suggests other selection schemes as well: a primer for reverse transcription or PCR amplification could be derivatized with a blocking group, and only those sequences in a pool that could hydrolyze the blocking group could be replicated and amplified as well. Similarly, very short oligonucleotide primers (4-6 bases) could be derivatized with a ligand, and only those sequences in a pool that could direct primer binding via the ligand would be replicated and amplified. Although these approaches probably do not resemble events in primordial molecular evolution, they nevertheless can give us a feel for how difficult it is to find nucleic acid catalysts de novo, and what such nucleic acid catalysts are capable of doing.
I acknowledge Jennifer Doudna, Michael Famulok, and Jack Szostak for helpful discussions during the preparation of the manuscript, and David Bartel for insights into the importance of fidelity in template-directed polymerizations.