1. Introduction
Studies on the origin and evolution of the genetic code, which mediates between the two core members of the genetic system (gene and protein), are essential for understanding the origins and evolution of genes and proteins. Genes use the codons, and proteins use the amino acids, specified by the genetic code. Therefore, the origin of life could be clarified by elucidating the origin of the genetic system composed of gene, genetic code and protein (Figure 1).
Metabolic pathways, in turn, are driven by proteins (enzymes), which are produced by expression of the genetic information written in genes. The enzymes are synthesized from amino acids, which are themselves produced through metabolic pathways. Therefore, the amino acids synthesized through the most primitive metabolic pathways should have been used in the first genetic code. Conversely, the amino acids used in the first genetic code could be determined by identifying the amino acids produced through the first amino acid synthetic pathways (Figure 1).
Thus, elucidating the origin and evolution of the genetic code is an important matter, which may help to clarify not only the origins and evolution of genes and proteins but also the origins and evolution of metabolism and life. Nevertheless, the origin and evolutionary process of the genetic code have not been well clarified, in particular which genetic code served as the intermediate code bridging the first genetic code and the universal genetic code. On the other hand, it is well known that the coevolution theory suggests that the genetic code and the biosynthetic relationships between amino acids evolved in parallel. In this article, therefore, the origin and evolution of the genetic code are discussed from the viewpoint of the coevolution theory [1,2,3,4,5], especially of the evolution of amino acid synthetic pathways.
3. Why do the genetic code and amino acid synthetic pathways coevolve?
The coevolution theory suggests that the genetic code coevolved with the invention of biosynthetic pathways for new amino acids [1,2,3,4,5]. According to the theory, this is why the organization of the genetic code reflects the precursor-product relationships between amino acids [1,2,3,4,5]. An amino acid produced upon formation of a new synthetic pathway should trigger formation of a new genetic code in the course of evolution. The idea should therefore hold in general, because the evolution of the genetic code encoding amino acids is determined by the order in which amino acids were produced through metabolic pathways and accumulated in large amounts in cells.
Note that similar things should be observed upon formation of not only amino acid synthetic pathways but also any metabolite synthetic pathway, because the idea of the coevolution theory can be applied to studies on the evolutionary process of metabolic pathways [6]. In some cases, a new metabolic pathway is formed by using an intermediate of a previously existing metabolic pathway. For example, the Leu synthetic pathway uses 2-oxoisovalerate, an intermediate of Val synthesis, as its starting molecule (Figure 2) [6]. In this case, it is evident that the Leu synthetic pathway was formed after the Val synthetic pathway was completed. The coevolution theory is thus a reasonable idea, because evolution of the genetic code should be triggered by accumulation of an amino acid produced through a newly formed metabolic pathway. The same reasoning applies when a new synthetic pathway is formed by using enzymes that drive part of a previously existing metabolic pathway. For example, the Ile synthetic pathway is formed by connecting a similar but different compound (2-oxo-2-hydroxy butanoate), produced through a new metabolic pathway from pyruvate, with the enzyme system starting from 2-acetolactate (Figure 2) [6]. In this case too, it can be concluded that the Ile synthetic pathway was formed after completion of the Val synthetic pathway (Figure 2).
Thus, the applicable examples above, in addition to the results obtained thus far [1,2,3,4,5], reconfirm that the coevolution theory is a valid idea. In the next section (Section 4), some conditions that must be satisfied when the coevolution theory is applied to the evolutionary process of the genetic code are discussed.
5. The origin of the genetic code deduced from coevolution theory
In this article, the following three matters are mainly discussed based on the coevolution theory. (1) What was the first genetic code? (2) Through what type of intermediate code did the first genetic code evolve into the universal genetic code? (3) Why did the genetic code have to pass through the intermediate code?
The coevolution theory, which proposes that the genetic code has evolved together with the development of amino acid metabolism (Figure 1), is one of the important concepts for explaining the evolutionary process of the genetic code in relation to amino acid synthetic pathways [1,2,3,4,5]. The coevolution theory is also a reasonable idea that can be applied to the origin of the genetic code.
First, the origin of the genetic code is reconfirmed and discussed in relation to amino acid metabolism. To discuss the origin and evolution of metabolism, however, it is necessary to know what the first genetic code was, because several primitive genetic code hypotheses have been proposed so far, for example the GC code hypothesis [7] and the four column theory [8] (Table 1).
However, both ideas would be unreasonable for the first genetic code. The reasons are as follows.
Neither a hydrophobic amino acid nor a β-sheet-forming amino acid is contained in the GC code. Instead, two turn/coil-forming amino acids, Pro and Gly, are among its four amino acids [7]. Therefore, a polypeptide chain synthesized under the GC code could not fold into a water-soluble globular structure, which is a prerequisite for expressing catalytic activity.
As in the GNC primeval genetic code [9,10], the four [GADV]-amino acids are used in the genetic code supposed in the four column theory [8]. However, the four column theory requires decoding with wobble recognition at the first and the third codon positions. Such wobble base pairing would have been impossible in the era of the first genetic code without an elaborate apparatus like the ribosome, because base pairs form by seeking the most stable base position. Otherwise, 64 tRNAs would have to be prepared to use only four amino acids. In addition, assigned codons would have to be changed once into unassigned codons during the evolutionary process of the genetic code. Such a change would be impossible, because it would generate multiple termination codons and cause lethal mutations. Therefore, it must be concluded that the first genetic code considered in the four column theory could never have been realized.
These points mean that it is still not sufficiently understood from what type of genetic code the universal genetic code originated. One reason is probably that the origin of the genetic code has so far been discussed individually rather than comprehensively, even though the genetic code plays an important role in connecting gene with protein (Figure 1).
On the other hand, we proposed the GNC-SNS primitive genetic code hypothesis about 20 years ago [9,10]. According to the hypothesis, the universal genetic code originated from the GNC code, composed of four GNC codons and four [GADV]-amino acids, and evolved via the SNS code, composed of 16 codons and 10 amino acids. The plausibility of the GNC primeval genetic code hypothesis is reconfirmed here on the following grounds.
Proteins containing the [GADV]-amino acids in roughly equal amounts satisfy the four conditions (hydrophobicity/hydrophilicity, α-helix, β-sheet and turn/coil formabilities) necessary for formation of water-soluble globular proteins [9,10]. This means that water-soluble, though unrefined and immature, [GADV]-proteins could be formed even by random joining of [GADV]-amino acids, because an amino acid composition containing roughly equal amounts of the [GADV]-amino acids is one of the protein 0th-order structures, which provide the framework necessary for forming a water-soluble globular structure [10,11]. These characteristics of the [GADV]-amino acids made it possible to select the four amino acids in the messy environments on the primitive Earth [12].
The three anticodon stem-loops (AntiC-SLs) that contain a GNC triplet other than GUC are stable without chemical modification in the loop [10,12]. In addition, the AntiC-SLs containing one of the GNC triplets could form two stable dimers bound vertically through triplet base pairs (5’GGC3’/3’CCG5’ and 5’GUC3’/3’CAG5’) [13]. Thus, the four GNC triplets could be selected as anticodons for establishment of the GNC primeval genetic code.
A reasonable evolutionary process of tRNA could be deduced from analyses of the anticodon stem base sequences of Pseudomonas aeruginosa tRNAs [10,14].
The evolutionary process of the genetic code could be deduced more reliably, on the firm basis of GNC as the first genetic code, if the validity of the GNC primeval genetic code hypothesis could be reconfirmed from the standpoint of the coevolution theory.
A hierarchy can be seen between the amino acids encoded by codons grouped by their first base and the precursor molecules of the respective amino acid syntheses. For example, the four amino acids encoded by G-start codons, Gly, Ala, Asp and Val, are synthesized by direct amination of the corresponding ketoacids, glyoxylate, pyruvate, oxaloacetate and oxoisovalerate (Figure 2 and Figure 3) [6]. Therefore, it can be concluded that Gly, Ala, Asp and Val are the amino acids used in the first genetic code, because they can be synthesized without using any other amino acid as a precursor molecule [6]. Val, which is produced in five reaction steps from pyruvate, has the simplest structure among the natural hydrophobic amino acids (Figure 2 and Table 2). Furthermore, the synthetic pathway of Glu must have been formed after the metabolic pathway for Asp synthesis, because Glu is synthesized in four steps from oxaloacetate, the direct precursor molecule of Asp. Thus, the results obtained by analyses of amino acid metabolic pathways are consistent with the GNC primeval genetic code hypothesis, which holds that the first genetic code was GNC encoding the [GADV]-amino acids [14].
Lei and Burton describe in their paper that the genetic code initially evolved to synthesize polyglycine as a cross-linking agent to stabilize protocells and that, thereafter, the code sectored from a glycine code to a four amino acid code, to an eight amino acid code, to an ~16 amino acid code and to the standard 20 amino acid code with stops [15]. Di Giulio also described, based on a result of the coevolution theory, that the genetic code in its very earliest phases was of the type GNN [4]. Thus, their ideas about the origin are consistent with the GNC-SNS primitive genetic code hypothesis [9].
6. Evolutionary pathway of the genetic code viewed from coevolution theory or amino acid metabolism
Of course, a reliable intermediate genetic code connecting the first genetic code with the modern universal genetic code cannot be obtained if it is discussed on the basis of a wrong idea about the first genetic code (Figure 4A). Fortunately, the analysis from the viewpoint of the coevolution theory has confirmed that the first genetic code was the GNC code (Section 5).
This section discusses what intermediate genetic code connected the GNC primeval genetic code with the modern universal genetic code (Figure 4A). For this purpose, the types of intermediate genetic code proposed so far are described first. The first is the SNS code hypothesis (Figure 4B) [9] and the second is the RNY code hypothesis (Figure 4C) [16,17].
6.1. Analysis of Evolutionary Process of the Genetic Code
As described in Section 5, it has been reconfirmed from the viewpoint of the coevolution theory that the first genetic code was the GNC code. Next, the type of genetic code that connected the GNC code with the universal genetic code is considered from the same viewpoint.
As seen in Table 2, the amino acids encoded by C-start codons and those encoded by A-start codons contrast with each other. Three (Pro, Gln, Arg) of the five amino acids encoded by C-start codons are synthesized from Glu as a starting molecule; Glu is one of the five [GADVE]-amino acids, which are encoded by G-start codons (Figure 5) [6]. This means that the metabolic pathways of the three amino acids were formed after the metabolic pathway for Glu synthesis was established. On the other hand, Leu is synthesized from the last intermediate (2-oxoisovalerate) of the Val synthetic pathway (Figure 2). This indicates that the synthetic pathway of Leu was formed after the metabolic pathway of Val was established. Note that His is synthesized in ten steps, starting with a first reaction that uses PRPP and ATP (Figure 5) [6]. Therefore, based on the coevolution theory, His is considered to have been captured into a genetic code after formation of the ATP synthetic pathway and probably after establishment of the GNS (GNC + GNG) code. Thus, it can be supposed that all five amino acids encoded by C-start codons were incorporated into the GNS code.
Three amino acids, Thr, Asn and Lys, of the five amino acids encoded by A-start codons are synthesized through metabolic pathways using Asp as a precursor molecule (Figure 5) [6]. On the other hand, the metabolic pathway for Ile synthesis is composed of the 2-aceto-2-hydroxybutanoate synthetic pathway from pyruvate and the four-step reaction pathway that shares enzymes with Val synthesis (Figure 2) [6]. The remaining amino acid, Met, is synthesized in five steps from Ser via Cys (Figure 6) [6]. From these considerations, it can be concluded from the viewpoint of the coevolution theory that both C- and A-start codons were incorporated into a genetic code after the GNS code was established, because the two precursor amino acids, Asp and Glu, are encoded by the two G-start codons GAC and GAG, respectively.
In contrast, three aromatic amino acids, Phe, Tyr and Trp, of the five amino acids encoded by U-start codons are produced through a branched pathway starting from phosphoenolpyruvate and erythrose 4-phosphate, which are synthesized through glycolysis and the pentose-phosphate cycle, respectively (Figure 6) [6]. Ser is synthesized from 3-phosphoglycerate, which is produced through the metabolic pathway starting from glyceraldehyde in glycolysis. Cys is produced in two reaction steps from Ser (Figure 6) [6]. Thus, the four amino acids other than Cys are synthesized without using a precursor amino acid; in other words, they are synthesized independently of any other amino acid. Therefore, it is impossible to determine, based on the coevolution theory alone, when the amino acids encoded by U-start codons were introduced and into which previously existing code. Nevertheless, it can be assumed that the amino acids encoded by U-start codons were captured at the last stage, after G-, C- and A-start codons were formed. The reason is that the three aromatic amino acids are synthesized through pathways of more than ten steps, whose formation is expected to have been quite difficult (Figure 6) [6].
6.2. Characteristics and Differences of RNY code and SNS code
In this subsection, the characteristics of the two genetic codes, the SNS code and the RNY code, are summarized, because doing so makes it possible to determine which code is appropriate as the intermediate code connecting the first GNC genetic code with the universal genetic code.
1. Both the SNS code and the RNY code are consistent with the idea that the genetic code originated from the GNC primeval genetic code, because the GNC code is contained in both codes (Figure 4B,C).
2. The SNS code is symmetrical between codons on the sense strand and anticodons on the antisense strand and is composed of ten amino acids and sixteen codons. It contains two acidic amino acids, Asp and Glu. It also contains two basic amino acids, His and Arg, and one highly hydrophobic amino acid, Leu, which are relatively complex amino acids (Figure 4B).
3. The RNY code is also symmetrical between codon and anticodon and is composed of sixteen codons and eight amino acids, which are relatively simple except Ile. It contains no basic amino acid and only one acidic amino acid, Asp (Figure 4C).
4. In the case of the SNS code, the order of codon capture is considered to have been GNC--GNG--CNC--CNG. In contrast, codons are supposed to have been captured in the order GNC--GNU--ANC--ANU to form the RNY code.
This order of codon capture can be assumed because, in the case of the SNS code, CNC codons complementary to GNG codons would have been captured into the GNS (GNC + GNG) code upon formation of entirely new genes from antisense strands [10,18]. In the case of the RNY code, on the other hand, ANC codons complementary to GNU codons would have been introduced into the GNY (GNC + GNU) code (Figure 7). Therefore, both cases are consistent with the idea that entirely new genes were generated from the antisense strand of GC-rich genes.
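The codon and amino acid counts given for the two candidate codes, and the sense/antisense complementarity invoked for the order of codon capture, can be checked against the standard codon table. The following Python sketch is illustrative only: the helper names (`expand`, `amino_acids`, `reverse_complement`) are hypothetical, and the modern standard genetic code is assumed as the reference assignment.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (one-letter symbols, * = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate base symbols used in the text: S = G/C, R = A/G, Y = C/U, N = any.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "U": "U",
              "S": "GC", "R": "AG", "Y": "CU", "N": "UCAG"}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def expand(pattern):
    """List every codon matched by a pattern such as 'GNC' or 'SNS'."""
    return ["".join(c) for c in product(*(DEGENERATE[s] for s in pattern))]


def amino_acids(pattern):
    """Set of amino acids encoded by the codons of a pattern."""
    return {CODON_TABLE[c] for c in expand(pattern)}


def reverse_complement(codon):
    """Codon read 5'->3' on the antisense strand."""
    return "".join(COMPLEMENT[b] for b in reversed(codon))


for name in ("GNC", "SNS", "RNY"):
    print(name, len(expand(name)), "codons,", "".join(sorted(amino_acids(name))))

# Antisense partners of the codon sets captured second in each scenario.
print({reverse_complement(c) for c in expand("GNG")} == set(expand("CNC")))
print({reverse_complement(c) for c in expand("GNU")} == set(expand("ANC")))
```

Under these assumptions the sketch reports four codons and four amino acids for GNC, sixteen codons and ten amino acids for SNS, and sixteen codons and eight amino acids for RNY, and both complementarity checks hold.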
6.3. Evidence showing that SNS code but not RNY code was used as an intermediate code
Here, it is discussed which code, the SNS code or the RNY code, was used as the intermediate code connecting the first GNC code with the universal genetic code. Two lines of evidence showing that the SNS code was used earlier than the RNY code are given.
As can be seen in Figure 8, codon sequences on the sense and antisense strands of modern GC-rich genes are similar to (SNS)n [18]. This is one piece of evidence that (SNS)n was used as a genetic sequence in ancient times, because remnants of SNS codon usage are considered to remain in modern GC-rich genes and in their antisense codon sequences even now. In contrast, the RNY pattern cannot be found in any region of modern microbial genes (Figure 8).
Furthermore, remnants of tRNA evolution were found by analyses of the anticodon-stem sequences of Pseudomonas aeruginosa tRNAs, which actually realize the correspondence between a codon and an amino acid [14].
The evidence above indicates that the genetic code, which originated from the GNC code, evolved into the universal genetic code via the SNS code.
In addition, an evolutionary process passing through the RNY code has another defect: the GC content of genes would change abruptly from the average 83% of the GNC code to the average 50% of the RNY code. In contrast, along the SNS route the GC content changes slowly over a long time, from the GNC code (average 83% GC content) through the SNS code (average 83% GC content) to the universal genetic code (average 50% GC content). In addition, the field for entirely new gene formation can be retained in the antisense codon sequences of GC-rich genes [18], whereas this field would be lost upon a change from the GNC code to the RNY code.
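The average GC contents quoted for each code follow directly from the codon patterns if every codon of a code is assumed to be used equally often. The short Python sketch below makes that arithmetic explicit; the function name `gc_content` and the equal-usage assumption are illustrative and are not taken from the cited analyses.

```python
from itertools import product

# Degenerate base symbols: S = G/C, R = A/G, Y = C/U, N = any base.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "U": "U",
              "S": "GC", "R": "AG", "Y": "CU", "N": "UCAG"}


def gc_content(pattern):
    """Mean G+C fraction over all codons of a pattern, each counted once
    (assumption: equal codon usage)."""
    codons = ["".join(c) for c in product(*(DEGENERATE[s] for s in pattern))]
    gc = sum(base in "GC" for codon in codons for base in codon)
    return gc / (3 * len(codons))


# NNN stands for the 64 codons of the universal genetic code.
for name in ("GNC", "SNS", "RNY", "NNN"):
    print(f"{name}: {100 * gc_content(name):.0f}% GC")
```

With equal codon usage, GNC and SNS both give 5/6 (about 83%) and RNY and NNN both give 1/2 (50%), which matches the values used in the argument above.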
Thus, from the evidence for an evolutionary route from the GNC code to the SNS code, it can be concluded that the SNS code, not the RNY code, was used as the intermediate code.
6.4. Decoding SNS Code and RNY Code by tRNA
All sixteen codons of the SNS code can be decoded by tRNAs without wobble recognition. In contrast, decoding the RNY code, with its eight pairs of RNU and RNC codons, would require wobble recognition at the third codon position. However, a complex mechanism such as the ribosome would be required to enable wobble recognition of the two bases (U and C) at the third codon position. It would therefore have been quite difficult to decode the RNY code in ancient times without an elaborate mechanism, because base pairs generally form by seeking the most stable position (Watson-Crick base pair formation). Hence, it is considered that the RNY code could not actually have been used as an intermediate code. In addition, it should be noted that the number of amino acids that can be used in proteins does not increase even with wobble recognition.
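The last point, that wobble recognition at the third position adds no amino acids to the RNY code, can be illustrated with the standard codon table. The Python sketch below is a minimal check under the assumption that the modern standard assignments apply; `rny_wobble_pairs` is a hypothetical helper name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (one-letter symbols, * = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rny_wobble_pairs():
    """Map each RN prefix (R = A or G) to the amino acids encoded by
    its RNU and RNC codons."""
    return {r + n: (CODON_TABLE[r + n + "U"], CODON_TABLE[r + n + "C"])
            for r in "AG" for n in BASES}


pairs = rny_wobble_pairs()
print("RNU/RNC pairs:", len(pairs))
print("each pair encodes a single amino acid:",
      all(u == c for u, c in pairs.values()))
print("amino acids in RNY code:", len({u for u, _ in pairs.values()}))
```

Each of the eight RNU/RNC pairs is synonymous in the standard table, so the sixteen RNY codons specify only eight amino acids whether or not the two members of a pair are read by one tRNA.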
8. Discussion
First, the significance of the coevolution theory was described, and the theory was confirmed to be valid using the formation process of the GNC primeval genetic code as an example. After the conditions that must be satisfied when the origin and evolutionary process of the genetic code are analyzed with the coevolution theory were confirmed, the type of intermediate code through which the universal genetic code, which originated from the GNC code, evolved was analyzed from the viewpoint of the coevolution theory and/or amino acid synthetic pathways. These analyses alone could not determine which code, the SNS code or the RNY code, was used as the intermediate code leading to the universal genetic code. Therefore, it was determined, based on the characteristics of the SNS and RNY codes and the differences between them, that the universal genetic code originated from the GNC primeval genetic code and evolved through the SNS code. Thereafter, the reasons why the SNS code was formed earlier than the RNY code were discussed, and it was concluded that there was no period in which the RNY code itself was used. Furthermore, the reason why the analyses could not determine which code, the SNS code or the RNY code, was used as the intermediate code was also discussed.
In the analyses, it was found that many amino acids are synthesized using an acidic amino acid, Glu or Asp, as a precursor molecule. The reason why acidic amino acids can be used for synthesis of other amino acids is explained as follows: the structures of the acidic amino acids, Asp and Glu, are comparatively simple, and it was convenient to use amino acids that had accumulated in large amounts in cells at an early stage of the emergence of life.
Further, the reason why the structures of acidic amino acids are simpler than those of basic amino acids was discussed. The reason would be that it was easy to synthesize simply structured acidic amino acids with a carboxyl group, and also that the positive charges, which acidic amino acids lack, could be supplied by divalent metal ions such as Mg2+, Mn2+, Ca2+ and Cu2+. In contrast, the basic amino acids used in the genetic code have relatively complex structures, probably because they were synthesized much later than the acidic amino acids and had to avoid competing with acidic amino acids, which have a simple structure, apart from the charge of the side chain.
Lastly, the time when Met was synthesized and incorporated into a previously existing genetic code is discussed. The reason is that Met, encoded by an A-start codon (AUG), is synthesized using Cys, encoded by U-start codons (UGY), as a precursor amino acid (Figure 6). According to the coevolution theory, this means that the codon for Met should have been incorporated into a genetic code after the Cys synthetic pathway was formed. On the other hand, according to the conclusion of this article that A-start codons were captured earlier than U-start codons, Met should have been incorporated into a genetic code before Cys (Figure 9). This means that there is another contradiction, between the times when the two amino acids, Met and Cys, were incorporated into a genetic code. It can be explained as follows. Met was synthesized at the last stage of the evolutionary process of the genetic code, that is, after the synthetic pathway of Cys, encoded by U-start codons, was completed. Further, it can reasonably be supposed that, at the time when the Met synthetic pathway was formed, the AUG codon then assigned to Ile was used only in small amounts, so that the codon assignment could be changed from Ile to Met without any large obstacle. This explanation is supported by the fact that usage of the AUG codon for Met is quite low even today.
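The Met/Cys contradiction can be stated as a simple consistency check between precursor-product relationships and the proposed order of codon capture. The Python sketch below is only an illustration: the numeric stage values (G before C before A before U), the selection of precursor-product pairs (a subset of those mentioned in this article) and the names `STAGE`, `FIRST_BASE`, `PRECURSOR_PRODUCT` and `stage_conflicts` are assumptions introduced here.

```python
# Assumed capture order of codon groups by first base: G, then C, then A, then U.
STAGE = {"G": 0, "C": 1, "A": 2, "U": 3}

# First codon base under which each amino acid is discussed in the text.
FIRST_BASE = {
    "Asp": "G", "Glu": "G",
    "Pro": "C", "Gln": "C", "Arg": "C",
    "Thr": "A", "Lys": "A", "Met": "A",
    "Ser": "U", "Cys": "U",
}

# (precursor amino acid, product amino acid) pairs taken from the text.
PRECURSOR_PRODUCT = [
    ("Glu", "Pro"), ("Glu", "Gln"), ("Glu", "Arg"),
    ("Asp", "Thr"), ("Asp", "Lys"),
    ("Ser", "Cys"), ("Cys", "Met"),
]


def stage_conflicts(pairs=PRECURSOR_PRODUCT):
    """Return pairs whose precursor belongs to a later codon group than
    its product, which the coevolution theory would not expect."""
    return [(pre, prod) for pre, prod in pairs
            if STAGE[FIRST_BASE[pre]] > STAGE[FIRST_BASE[prod]]]


print(stage_conflicts())
```

With these inputs the only pair flagged is Cys to Met, the case that the reassignment of AUG from Ile to Met is proposed to resolve.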
Needless to say, the origin and evolution of the genetic code relate not only to the metabolic pathways for amino acid synthesis but also to tRNA and aminoacyl-tRNA synthetase [15,19,20,21,22,23,24]. Therefore, studies on the origin and evolution of the genetic code are expected to make great progress through comprehensive studies of the members related to the genetic code.
Figure 1.
Metabolic pathways are driven by proteins or enzymes (white bold arrow). Proteins are produced through the genetic system mainly composed of three members, gene, genetic code and protein (thin black arrows). The genetic code has evolved by incorporation of new amino acids (upward white bold arrow), which were synthesized through metabolic pathways (downward white bold arrow and curved bold arrow).
Figure 2.
Synthetic pathways of five amino acids (Gly, Ala, Asp, Val and Glu) encoded by G-start codons and of two hydrophobic amino acids, Ile and Leu. The accumulation order of the seven amino acids was deduced from the number of reaction steps from glyoxylate or pyruvate (indicated by underlined red letters) to the respective amino acids, as (Gly, Ala), Asp, (Val, Glu) and (Leu, Ile). Ketoacids used for direct amination are written in red letters. One reaction step, from 2-isopropyl-3-oxosuccinate to 4-methyl-2-oxopentanoate, is omitted from the figure because the step proceeds spontaneously. Blue boxes indicate the enzymes shared by Val and Ile syntheses.
Figure 3.
Metabolic pathways starting from four ketoacids (written in red letters). Asp is used as a precursor molecule for purine and pyrimidine nucleotides. Single-step reactions and multi-step reactions are indicated by thin arrows and broken arrows, respectively. Bold black arrows indicate that amino acids are used for protein synthesis.
Figure 4.
(A) Two evolutionary pathways of the genetic code, which passed through either (B) the SNS code or (C) the RNY code as an intermediate genetic code. It is considered that the SNS code and the RNY code were formed by piling up the CNS code on the GNS code and by piling up the ANY code on the GNY code, respectively. Amino acids written in bold letters indicate the four amino acids encoded by the GNC primeval genetic code.
Figure 5.
(A) Synthetic pathways of three amino acids (Pro, Gln, Arg) encoded by C-start codons. The three amino acids are synthesized from a precursor amino acid, Glu, encoded by GAR codons. The His synthetic pathway is shown in an inset. (B) Synthetic pathways of three amino acids (Thr, Asn and Lys) encoded by A-start codons. The three amino acids are synthesized from a precursor amino acid, Asp, encoded by GAY codons. Note that the Arg and Lys synthetic pathways are shown in two lines.
Figure 6.
Synthetic pathways for three aromatic amino acids, Trp, Phe and Tyr. The amino acids are synthesized by complex metabolic reactions composed of more than 10 steps. It can be understood that the Trp synthetic pathway was formed by branching out from chorismate and, therefore, that it was formed after completion of the Phe/Tyr synthetic pathway. Boxed metabolic pathways show the Met synthetic pathway from Ser via Cys. It is supposed that the Ser synthetic pathway from Gly was added at a later time to replenish the shortage of Ser.
Figure 7.
The order in which new codons came into use. (A) In the case of the SNS code, it is supposed that formation of the SNS code started from the GNC code and proceeded by creation and addition of new codons to the GNC code, in the order of GNG codon formation on the sense strand and CUC codon formation on the antisense strand. (B) In the case of the RNY code, it is assumed that formation of the RNY code started from the GNC code and proceeded by creation and addition of new codons to the GNC code, in the order of GNU codon formation on the sense strand and AUC codon formation on the antisense strand.
Figure 8.
Evidence that remnants of the SNS code remain in GC-rich regions of modern microbial genomes. Note that remnants of the GNC code also remain in the GC-rich regions. In contrast, no such remnant of the RNY code can be found in any region of the microbial genomes.
Figure 9.
Evolutionary process deduced from the coevolution theory and the GNC-SNS primitive genetic code hypothesis. See Subsections 6.2 and 6.3 for the reasons why the evolutionary pathway passes through the SNS code. Capital letters in square brackets denote amino acids in one-letter symbols. Letters written in blue indicate amino acids that were newly captured into the previously existing code.
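The codon patterns named in these captions (GNC, SNS, RNY) can be enumerated explicitly. The sketch below assumes that codons in the primitive codes carried the same amino acid assignments as in the universal genetic code, and it uses the standard IUPAC meanings S = G or C, R = A or G, Y = C or U, N = any base. The function names are hypothetical and chosen for illustration only.

```python
from itertools import product

BASES = "UCAG"
# Universal genetic code in UCAG order (third base varies fastest); '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC symbols used in the code names.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "N": "UCAG", "S": "CG", "R": "AG", "Y": "CU"}


def codons_matching(pattern):
    """List all codons matching a three-letter IUPAC pattern such as 'SNS'."""
    return ["".join(c) for c in product(*(IUPAC[symbol] for symbol in pattern))]


def encoded_amino_acids(pattern):
    """Amino acids (one-letter symbols) encoded by the matching codons."""
    return {STANDARD_CODE[codon] for codon in codons_matching(pattern)}


for name in ("GNC", "SNS", "RNY"):
    codons = codons_matching(name)
    print(name, len(codons), "codons:", "".join(sorted(encoded_amino_acids(name))))
```

Under the stated assumption, GNC yields the four [GADV]-amino acids, and both SNS and RNY contain GNC as a subset, so either could in principle extend the GNC code.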
Table 1.
Two genetic code tables that have been proposed as the first genetic code.
(A) GC code hypothesis [
7] and
(B) Four column theory [
8]. The GC code is highly GC-rich, and the four column theory is composed of four [GADV]-amino acids. Both are partly similar to the GNC primeval genetic code, which we have proposed [
9,
10].
Table 2.
Relationship between (product) amino acid and precursor amino acid.
(A) The four amino acids written in blue letters are the [GADV]-amino acids encoded by G-start codons. Non-amino acid precursor molecules are written in parentheses. The number in parentheses shows the number of reaction steps from the respective precursor molecule. PRPP and Pen-p cycle mean phosphoribosyl pyrophosphate and pentose-phosphate cycle, respectively. The number (4) in double parentheses indicates reaction steps that use the same enzymes as the Val synthetic pathway (
Figure 2).
Note: Some characteristics can be seen in the above table. (1) The metabolic pathways of all five amino acids (Gly, Ala, Asp, Val and Glu) that use a G-start codon do not use any precursor amino acid. (2) More than 8 reaction steps from pyruvate are used in the synthetic pathways of the two hydrophobic amino acids with a long side chain (Leu and Ile). (3) Many reaction steps (more than 10), counted from the initial precursor molecules, are also used in the synthetic pathways of the three aromatic amino acids (Phe, Tyr and Trp). (4) An intermediate molecule for Val synthesis (2-oxoisovalerate) is used in the Leu synthetic pathway, and a series of enzymes on the Val synthetic pathway is used for Ile synthesis. (5) PRPP is used at the initial step of the His synthetic pathway. (6) Glu and Asp are used as precursor amino acids in the synthetic pathways for three amino acids encoded by C-start codons and A-start codons, respectively. Pen-p cycle is an abbreviation of pentose-phosphate cycle.
(B) The universal genetic code table summarized for each base at the first codon position. (Upper left) G-start codon table. (Upper right) C-start codon table. (Lower left) A-start codon table. (Lower right) U-start codon table. "Term" in the U-start codon table indicates a termination codon.
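The grouping in panel (B) can be reproduced from the universal genetic code with a few lines of code. This is a minimal sketch, not the layout used in the table itself: the function name `amino_acids_by_first_base` is hypothetical, and termination codons are simply skipped rather than labeled "Term".

```python
from itertools import product

BASES = "UCAG"
# Universal genetic code in UCAG order (third base varies fastest); '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acids_by_first_base():
    """Group the encoded amino acids by the base at the first codon position."""
    groups = {base: set() for base in BASES}
    for codon, amino_acid in STANDARD_CODE.items():
        if amino_acid != "*":  # skip termination codons
            groups[codon[0]].add(amino_acid)
    return groups


groups = amino_acids_by_first_base()
for base in "GCAU":
    print(f"{base}-start codons:", "".join(sorted(groups[base])))
```

The G-start group contains exactly Gly, Ala, Asp, Val and Glu (G, A, D, V, E), which are the five amino acids referred to in point (1) of the note.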