The Origin(s) of Cell: Some Radical New Hypotheses

The central mechanism of biological evolution, variation-selection-inheritance (VSI), is one of the most universal mechanisms known. Much of our understanding of VSI, however, has been dominated by the Neo-Darwinian Modern Synthesis, with a rather narrow understanding of what constitutes variation, selection, and inheritance. This unduly narrow understanding of VSI might have been a key cause behind our failure to adequately explain some critical puzzles in biological evolution, from the origin of the first cell to the origin of the eukaryotes, the puzzling biology of metabolism, apoptosis, aging, and cancer in metazoans. I broaden our understanding of VSI, in a spirit that is somewhat similar to several recent contributions, and then extend this broadened view of VSI to its natural starting point: the origin of the First Universal Cell Ancestor (FUCA). I advance three principal arguments. First, survival comes before replication. Before the coming of reproducer and replicator, there must be survivors, to paraphrase Szathmary and Maynard Smith (1997). Second, natural selection, especially the non-Darwinian kind, can operate without replication or even metabolism, as long as different molecules, complexes, and vesicles have differential survival rates within a system. Third, merger and acquisition, via breaking-and-re-encapsulation, endocytosis, endosymbiosis, and processes similar to them, had been a far more powerful force of variation and selection in the pre-Darwinian period of evolution that led to LUCA and long before eukaryogenesis. Endosymbiosis therefore had been a far more foundational force than even Lynn Margulis and many of her supporters have appreciated. Our thesis thus goes beyond Woese’s emphasis on horizontal gene transfer (HGT) and actually subsumes HGT with Margulis’ emphasis on endosymbiosis.
Combining these three new perspectives with other perspectives and evidence sheds important new light upon the origin of FUCA, the singular watershed moment in the evolution of life.


Introduction
The central mechanism of biological evolution, variation-selection-inheritance (VSI), is one of the most universal mechanisms known. Much of our understanding of VSI, however, has been dominated by the Neo-Darwinian Modern Synthesis, with a rather narrow understanding of what constitutes variation, selection, and inheritance. This unduly narrow understanding of VSI might have been a key cause behind our failure to adequately explain some critical puzzles in biological evolution, from the origin of the first cell to the origin of the eukaryotes, the puzzling biology of metabolism, apoptosis, aging, and cancer in metazoans.
This article broadens our understanding of VSI, in a spirit that is somewhat similar to several recent contributions (e.g., Bouchard 2014; Doolittle 2014; Doolittle and Inkpen 2018; Garson 2017; O'Malley 2014; Toman and Flegr 2017). 1 I then extend this broadened view of VSI to its "natural" starting point: the origin of the First Universal Cell Ancestor (FUCA). 2 Indeed, without broadening and then extending VSI, it may be difficult to arrive at a reliable understanding of the origin of FUCA and many key puzzles thereafter (cf. de Duve 2005a; 2005b; Fry 2011). The key reason is that VSI operates quite differently in the pre-biotic and pre-cellular world (Koonin 2014a; Egel 2017), if the coming of the Last Universal Cell Ancestor (LUCA) marks the crossing of the "Darwinian Threshold" (Woese 2002). 3 This broadened perspective of VSI therefore is not a rejection of Darwinian evolution. Rather, it is a necessary extension of it to the phase of pre-cellular and pre-Darwinian evolution.
1 The extension advanced here differs critically from other extensions on two fronts. First, they are more theoretical and philosophical than empirical. Second, with the exception of O'Malley (2014), they seek to account for evolution at the community or even ecosystem level. In contrast, the extension advanced here is squarely about the evolution of individual entities, from molecules to complexes and protocells.
2 This notation is similar to the First Eukaryotic Common Ancestor (FECA) before the Last Eukaryotic Common Ancestor (LECA). Just as both FECA and LECA were most likely a pool of organisms, FUCA and LUCA were also most likely a pool of organisms. Notably, "universal ancestors" in Woese (1998) were what Woese and Fox (1977) called "progenotes" earlier or FUCA here. For similar interpretations, see Koonin 2014a; Gogarten and Deamer 2016; Egel 2017.
3 In fact, despite the legendary differences between Carl Woese and Lynn Margulis, they agreed that "speciation itself is a product of evolution" (Margulis and Sagan 2002, 143), or that only after the crossing of the "Darwinian Threshold" did speciation become possible (Woese 2002, 8744).
I advance three principal arguments.
First, survival comes before replication. Before the coming of reproducer and replicator, there must be survivors, to paraphrase Szathmary and Maynard Smith (1997).
Second, natural selection can operate without replication or even metabolism, as long as different molecules, complexes, and vesicles have differential survival rates within a system. Therefore, natural selection, admittedly of a non-Darwinian kind, had operated during the prebiotic phase and long before the crossing of the "Darwinian Threshold". In fact, Darwinian selection had been a product of the non-Darwinian phase (Woese 2002; Damer and Deamer 2015; cf. Tessera 2018).
Third, merger and acquisition, via breaking-and-re-encapsulation, endocytosis, endosymbiosis, and processes similar to them, had been a far more powerful force of variation and selection in the pre-Darwinian period of evolution that led to LUCA and long before eukaryogenesis (cf. de Duve 2005b). Endosymbiosis therefore had been a far more foundational force in evolution than even Lynn Margulis and many of her supporters had appreciated (e.g., Sagan 1967; Margulis 1981; 1991; Margulis and Sagan 2002; Norris and Raine 1998; O'Malley 2014). Our thesis thus goes beyond Woese's (1998; 2002) emphasis on horizontal gene transfer (HGT) and actually subsumes HGT with endosymbiosis (see also Woese and Fox 1977; Vetsigian et al 2006). 4 As I shall attempt to show below, combining these three new perspectives with other perspectives and evidence sheds important new light upon the origin of FUCA, the singular watershed moment in the evolution of life.
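The second argument can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions, not a model of prebiotic chemistry: the function name `simulate_persistence`, the species labels, and the decay and influx values are all hypothetical. Every species is supplied at the same rate by abiotic synthesis and nothing is ever copied; the only difference between species is the per-step probability of being lost.

```python
import random


def simulate_persistence(decay_rates, influx, steps, seed=0):
    """Track counts of non-replicating molecular species over time.

    At each step, every existing molecule is lost with its species-specific
    decay probability, and then `influx` new molecules of each species arrive
    from abiotic synthesis. No molecule is ever replicated.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in decay_rates}
    for step in range(steps):
        for name, decay in decay_rates.items():
            # Each molecule independently persists with probability (1 - decay).
            survivors = sum(1 for _ in range(counts[name]) if rng.random() >= decay)
            counts[name] = survivors + influx
    return counts


# Hypothetical example: identical supply, different persistence.
example = simulate_persistence({"stable": 0.01, "labile": 0.5}, influx=10, steps=500)
print(example)
```

With these illustrative numbers the more persistent species ends up far more abundant than the labile one even though neither replicates, which is the sense in which differential survival alone can act as selection.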
Of course, the origin(s) of FUCA may well be one of those puzzles that we simply cannot stop thinking about but for which we can never reach a definitive answer (Luisi 2016; Sutherland 2017). This article does not pretend to provide a definitive answer either. Rather, the purpose is to show that taking a different starting point will allow us to arrive at a more valid interpretation of existing data and evidence. More importantly, this new perspective may allow us to resolve some apparent controversies and point to more fruitful inquiries.
Three criteria should be applied to any new perspective regarding the origin of the cell. First, it should provide a more coherent and consistent organization of existing data with fewer ad hoc hypotheses than existing ones, because it integrates new or previously underappreciated data (Lanier and Williams 2017). Second, it should resolve more controversies with fewer ad hoc hypotheses than existing ones (Pereto 2005; Fry 2011). As becomes clear below, the new perspective resolves almost all major contradictions and inconsistencies within Carl Woese's theory regarding the same puzzle, as advanced in his millennium series (esp. Woese 1998; 2002). Finally and specifically for the origin(s) of FUCA, because we cannot establish the exact timing of the major events leading to FUCA, we should focus more on the sequence of events. Establishing a more valid sequence of events within this extraordinary process while maintaining consistency with thermochemical, geological, and biological principles provides us with a crucial baseline for evaluating competing theories and designing future research.
4 Interestingly, starting with a quite different perspective, Woese and Fox (1977, 5) actually argued that "endosymbiosis should probably be considered an aboriginal (…) [and not-so-rare] trait" rather than "an acquired trait" or "a relatively recent and rare occurrence".

Seven Premises
I first outline seven simple premises. All but the last two contain key innovative statements. As becomes clear in the next section, these premises lead us to some new hypotheses for understanding the origin of the first cell.
1. Survival comes before replication, and certainly before division or reproduction (with or without genetic replication): before the coming of reproducer and replicator, there must be survivors. 5 This law that survival or persistence comes before replication and reproduction holds most forcefully during the phase of prebiotic evolution. In short, an entity, be it a compound, a complex, or a vesicle, has to exist and then stay (or "survive") within the (pre-)biotic system before it can become part of life, even if it cannot metabolize or replicate (Pascal and Pross 2016; Toman and Flegr 2017). In other words, survival, metabolism, and replication were not coupled for much of the early pre-biotic evolution, and their coupling came as a product of prebiotic evolution. Indeed, even after the coming of the first protocell, an organism is a survival machine first and foremost and a replication and reproduction machine second. 6
5 Clearly, a cell can survive for a while without replication.
6 As I shall show elsewhere, this principle holds critical implications for understanding other key puzzles, from an organism's lifespan, to the lifespan of cell lineages, cancer, apoptosis, and reproduction.
2. Before the first cell (and perhaps well after), variation had been the primary means toward survival (and replication). In other words, from the very beginning of bioorganic evolution that eventually led to FUCA and LUCA, the evolutionary process was mostly about gaining more molecular and hence functional diversity so that FUCA or LUCA could survive in more diverse environments with a more potent arsenal.
Moreover, variation back then was not generated by genetic mutation (which did not exist for a long time), but by two rather different processes: (1) prebiotic chemical synthesis, polymerization, and stereochemical mutualism; (2) absorption, via breaking-and-re-encapsulation, primitive endocytosis (e.g., simple engulfing), primitive endosymbiosis, and processes similar to them. Notably, endosymbiosis (as fusion or merger) and endocytosis (as acquisition) entail more extensive "horizontal biomolecule transfer" (HBMT) than HGT alone does: HBMT subsumes HGT in the strict sense. In fact, Woese's (1998; 2002) emphasis on HGT in the origin of FUCAs is valid only if he meant HBMT with HGT. HBMT was therefore a more pivotal and pervasive process than HGT.
3. A period of non-Darwinian but natural selection operated before the coming of Darwinian selection: the former produced the latter (cf. Tessera 2018). There were four major non-Darwinian selection mechanisms, which most likely had appeared in chronological order.
a) The first non-Darwinian selection mechanism, which is purely chemical, operates upon biomolecules themselves and their capacities for forming polymers and complexes. Here, the yardsticks of "fitness" include steady supply from abiotic synthesis (i.e., availability), kinetic and thermochemical stability or persistence (Pascal and Pross 2016; Toman and Flegr 2017), (water) solubility, polymerization, and stereochemical "mutualism" for forming complexes (Lanier, Petrov, and Williams 2017; Vitas and Dobovišek 2018).
b) The second non-Darwinian selection mechanism, which is both chemical and physical, selects biomolecules that can not only stabilize vesicles but also make vesicles somewhat more permeable (e.g., Wei and Pohorille 2011; Black et al 2013).
c) The third non-Darwinian selection mechanism operates upon the different capacities of different bioorganic complexes to interact with each other and the different capacities of different vesicles to merge and fuse with or engulf other complexes and vesicles (Damer and Deamer 2015).
d) The fourth non-Darwinian selection mechanism operates upon vesicles that now approach protocells. Already stabilized (or intact) vesicles that can absorb useful ingredients or components from the environment via endosymbiosis or endocytosis and then invent primitive metabolism and replication will hold a critical selective advantage. Here, the key yardsticks of "fitness" were survival, absorption, growth, and division, first without and then with primitive metabolism and genetic replication (Norris and Raine 1998; Szostak et al 2001; Zhu et al 2012).
4. FUCAs came to exist via the breaking-up and re-encapsulation of vesicles and via endosymbiosis and endocytosis by vesicles, drawing useful ingredients or components from "global invention" (Woese 2002). Put differently, FUCAs did not come to exist via de novo evolution within individual protocells: this would essentially mean that every FUCA must have evolved almost entirely independently, and such a possibility would have been a miracle. Rather, FUCAs came to exist via drawing and fusing innovations from many precursors. It is through HBMT underpinned by merger and fusion, rather than HGT alone, that FUCAs came to possess a proto-machinery of survival and a proto-machinery of replication within the same protocell.
5. Once FUCAs came to possess both a proto-machinery of survival and a proto-machinery of replication (both machineries of course require some kind of metabolism), survival and replication began to co-evolve with each other. Along the way, FUCAs continued to absorb useful ingredients and synthesize them into more complex, versatile, and effective macromolecules, including more complex proteins and RNAs. For this phase, however, a tight coupling of survival and replication might not have held any selective advantage. Indeed, the opposite might have been true: being more promiscuous and hence more flexible provided a protocell with a significant advantage for survival. It is mostly because of this key dynamic that FUCAs did not have a genealogical history, but only a physical-chemical one (Woese 1998; 2002).
6. FUCAs competed against each other. After a period of time during which survival and replication co-evolved with each other, some of the FUCAs eventually became protocells in which survival and replication were more tightly coupled and smoothly regulated.
Proto-cellular division (of vesicle) with tightly coupled genetic replication therefore came rather late in the evolutionary process that led to the first cells (Rasmussen et al 2009, xv-xvi). Once evolved, however, these protocells with division and genetic replication already being tightly coupled would come to enjoy an enormous advantage, and it was these protocells that eventually became the LUCA that came to move into diverse environments and eventually colonize the whole biosphere.
7. The evolution of the first proto-cell may well have required a sustained increase in its capacity for harnessing and generating energy (Lane et al 2015).
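Premise 4 rests on a simple probabilistic intuition, which the following sketch tries to make explicit. It is a back-of-the-envelope illustration, not a result from the literature: the function names and the values of `p`, `k`, and `n` are hypothetical, and components are assumed to arise independently of one another. The first function gives the chance that one isolated vesicle comes to hold all `k` required components by itself; the second gives the chance that every component exists somewhere in a pool of `n` vesicles and is therefore available, in principle, for HBMT to bring together.

```python
def p_isolated(p, k):
    """Chance that a single isolated vesicle holds all k components.

    p is the (assumed) probability that a given vesicle holds any one
    component; components are treated as independent.
    """
    return p ** k


def p_available_in_pool(p, k, n):
    """Chance that each of the k components is present in at least one
    of n vesicles, under the same independence assumption."""
    present_somewhere = 1.0 - (1.0 - p) ** n
    return present_somewhere ** k


# Hypothetical values chosen only to show the contrast in scale.
print(p_isolated(0.01, 10))
print(p_available_in_pool(0.01, 10, 1000))
```

The second quantity says nothing about how often merger actually assembles the components within one protocell; it only shows why drawing on "global invention" is far less demanding than de novo evolution within each protocell.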

Core Hypotheses
I now derive the key hypotheses from the seven premises. I divide the hypotheses into two parts: before and after FUCAs. Obviously, hypotheses in the second part draw more from existing discussions, but even here, our new starting premises offer new perspectives. Figure 1 schematically summarizes the whole process from the origin(s) of FUCA to LUCA.
Evolution before FUCA
1. Abiotic synthesis of bioorganic molecules was the first step in the origin of life. Once bioorganic molecules came to exist, first as monomers (e.g., amino acids, nucleotides, fatty acids, and later on, phospholipids) and then as polymers (e.g., short peptides, small RNAs), they came under the force of selection, even though replication did not operate back then. During this stage, there were two key selection yardsticks. The first is thermochemical stability or survivability within the system. The second is a minimum level of availability that allows a minimum level of concentration for monomers to be assembled into polymers and more complex hetero-biomolecules. Both stability and availability partly depend on the relative ease of synthesizing the molecules from simple precursors and on protection from UV light.
2. In order for simple bioorganic molecules to be assembled into more complex hetero-biomolecules, they also have to be stereochemically compatible with each other: in other words, there must be "molecular mutualism" (Lanier et al 2017; Vitas and Dobovišek 2018). Key examples of such "molecular mutualism" include that only some amino acids or peptides can bind with simple RNAs or that only some peptides can form α-helices and then insert themselves into lipid membranes to make lipid membranes more permeable. Moreover, once some mutualism or stereochemical compatibility is fixed, it becomes extremely difficult to change or unravel, because such a change may be lethal. A key implication of stereochemical compatibility or molecular mutualism is that simpler is not always better. Because only certain configurations are compatible with certain assembling strategies for bringing different molecules together, those molecules that can interact, bind, or fit with each other properly, rather than those that are merely simpler, have been selected.
3. Self-organizing amphiphiles tend to form vesicles (Pohorille and Deamer 2009). Here, as long as a vesicle can retain its basic structure, float within a solution, absorb ingredients from its environment (e.g., via proto-endocytosis), and merge with other vesicles, it succeeds in surviving in the system. Most likely, such vesicles also had the capacity of "dividing" without either reproduction or genetic replication. Rather, they divided via pinching or budding due to enlargement of size by 1) absorbing more lipids, peptides, and other bioorganic molecules, 2) merging with and engulfing other vesicles, and 3)
and genetic code originated from this co-evolution of amino acid/peptide with RNA (Wong 1975; Wolf and Koonin 2007; Francis 2011; 2013; 2015; Sengupta and Higgs 2015; Koonin and Novozhilov 2017). During the co-evolution of amino acid/peptide with RNA, precision in RNA replication is not necessarily an advantage. Rather, during this stage of evolution, the key is to make more RNAs without precision because this increases the structural diversity and hence the functional diversity of RNAs (Horning and Joyce 2017). With more diverse structures and functions, RNAs will be able to support the production of more diverse peptides with different properties. This mutually reinforcing increase in structure and function leads to peptides and proteins with more diverse structures and functions and paves the way for the more complex ribonucleoprotein (RNP) world, the standard genetic code (SGC), and eventually a more versatile metabolism system.
5. For a period of time in the prebiotic evolution, the evolution of peptide-lipid bi-layered membrane and the evolution of RNA-peptide (as the proto-translation machinery) might have proceeded independently from each other. Indeed, the two processes might have operated in different locations (e.g., different terrestrial hydrothermal ponds/fields) or even one came earlier than the other (Mulkidjanian et al 2012; Damer and Deamer 2015; Koonin 2014b, 35-36). Eventually, however, these two processes had to come together to make the first proto-cell, and the moment in which these two processes merged with each other was the watershed event: it laid the foundation of the first proto-cell or FUCA
7. Within a stable microenvironment provided by FUCA's protocell membrane, other more fragile and elaborate proteins began to exist and operate, perhaps with the help of proto-chaperones (either RNA or peptide/protein). Within FUCAs, therefore, metabolism came to support the synthesis of more diverse and elaborate proteins and RNAs that could reinforce metabolism within a better regulated membrane system. Together, they came to form more sophisticated machineries of replication and survival and allow FUCAs to evolve toward LUCAs. Replication, or more precisely, fairly accurate replication of genetic materials (either RNA or DNA) supported by a replication apparatus could have only evolved within a protocell such as FUCA. Most likely, FUCA initially contained only short RNAs rather than a single long RNA molecule.

From FUCA to LUCA
Once the survival machinery (i.e., metabolism supported by proteins within a membrane) and the replication machinery (now supported by both proteins and RNAs) were coupled with each other within a protocell, they began to co-evolve with each other. This co-evolutionary process within FUCA laid the foundation for all subsequent evolutionary processes.
1. FUCAs were a pool of "progenotes", and they struggled for existence. Eventually, one or a few of the FUCAs came to possess the right and tight coupling of metabolism, translation machinery, division with genetic replication, and energy efficiency. These lucky few became the LUCAs: their progenies came to dominate the whole system. Because LUCAs possessed a relatively tight coupling of cell division with genetic replication, they had crossed the Darwinian Threshold (Woese 1998; 2002). By LUCAs, a translation system with the full SGC had also "crystallized" (Woese 1998; 2002; Koonin 2014a). More likely than not, this whole process became possible only within protocells with a regulated membrane rather than directly from the "naked" RNA world (
3. During the FUCA to LUCA period, a transition from RNA to DNA via RNA-DNA hybrid had begun. LUCA, however, began to diffuse into two different environments before the transition from an RNA-DNA hybrid system to a full DNA genetic system was complete. As a result, DNA replication evolved twice independently in the two primary domains after their divergence (Leipe et al 1999). The whole transition from RNA to DNA as genetic material therefore had two phases: a phase of RNA-DNA hybrid and a phase of full transition to a DNA-only system. The former lasted from FUCA to LUCA and beyond, even if some components of a primitive DNA-replication machinery were already in place within FUCA (i.e., before LUCA). The latter started after the divergence of LUCA into bacteria and Archaea and continued well after.
4. Only in LUCA did the narrower HGT replace the broader HBMT as the more critical force in driving evolution, although HBMT continued to operate, most dramatically in eukaryogenesis (Margulis 1981). Moreover, only in LUCA did HGT gradually become more harmful (Jain et al 1999), and a defense system against HGT came to exist (Koonin 2017; see also Koonin and Dajila 2013). In other words, the arms race between HGT and

Evidence: Chemical and Biological-Functional
Because FUCAs were a community of proto-cells and the long evolutionary process after FUCAs with rampant HGT had erased or at least obscured most of the early genetic footprints (Fournier et al 2015), empirical evidence for any theory regarding the origin of the cell must be mostly chemical and biological-functional rather than genetic. 8 This section presents the chemical and biological-functional evidence that supports our core hypotheses, in addition to some genetic evidence. The next section goes on to show that our new theory resolves some key controversies regarding the origin of life, thus providing another source of support.
1. The universality of the standard genetic code (SGC) is indisputable. As part of the core translation apparatus, SGC's indisputable universality in only one out of the three parts in the information-processing system of modern organisms (i.e., replication of DNA, transcription from DNA to RNA, and translation from mRNA to protein) can only be explained by the coming of amino acid/peptide-RNA interaction very early on, long before the coming of DNA replication and DNA to RNA transcription. The possibility that the proto-translation machinery came to operate via initial chemical mutualism between amino acid/peptide with RNA and then peptide/protein and RNA co-evolved thereafter is now almost universally accepted (e.g., Wong 1975; Petrov et
a) The fact that many ancient proteins are connected to nucleotides and RNAs also suggests an early rather than a late RNA-peptide world (Ma et al 2008).
b) This possibility is further strengthened by the fact that small and simple peptides derived from the core part of ribosome proteins can enhance RNA polymerase ribozyme (RPR) catalytic activities, with the smallest size being merely ten amino acids (Tagami et al 2017). As such, a primitive ribosome-like apparatus with both RNAs and peptides must have come to exist quite early on (Wolf and Koonin 2007).
c) The fact that replicons with protein capsids (i.e., viruses) are more widespread than replicons without protein capsids (i.e., viroids) also suggests that the former are more advantageous than the latter. Indeed, viroids so far have only been found in higher plants whereas viruses have been found in all three domains. 9 This fact is also more consistent with the notion that nucleic acids (most likely RNA) came to interact with peptides or proteins quite early on.
2. The fact that a primitive translation system and eventually the whole SGC had evolved quite early on strongly suggests that vesicles with permeable membrane came to exist quite early on (Pohorille & Deamer 2009;Mulkidjanian et al 2009;Lombard et al 2012; Koonin 2014a). It also strongly suggests the possibility that these vesicles were able to merge with each other via proto-endosymbiosis. Simply put, it would have been extremely difficult for a single proto-cell or vesicle to evolve the whole SGC: the evolution of SGC most likely requires input from "global invention" (Woese 1998;2002). In contrast, with HBMT, which is much more than acquiring a genome, vesicles within a pool of vesicles or protocells could have easily drawn from and brought together "global inventions", not only in genetic materials but also in metabolism and other ingredients, within the community of progenotes. Obviously, HBMT assumes vesicles with proto-membrane that can retain molecules within whereas Woese's HGT does not.
3. The possibility that HBMT via proto-endosymbiosis and proto-endocytosis has been a key mechanism in the evolution of FUCA and LUCA, in addition to (and well before) eukaryogenesis, is strongly supported by several sets of evidence.
b) The notion that mitochondria, chloroplasts, and other plastid-like structures (e.g., liposomes, hydrogenosomes, and mitosomes) had evolved as relics of endosymbiosis (i.e., its substrates and leftovers) has now been firmly established (Sagan 1967
4. In the evolutionary process leading to LUCA and beyond, HBMT thus has two distinct phases. The first phase, before the divergence of bacteria and Archaea from LUCA, was mostly, if not exclusively, via fusion and absorption based on endosymbiosis and endocytosis. The second phase, mostly in the form of replicons jumping from one organism to another, came after the "crystallization" of the two principal domains. HGT only dominated this second phase. Even in the second phase, however, endosymbiosis played a key role, especially in eukaryogenesis (Sagan 1967; Margulis and Sagan 2002
Woese's (1998; 2002) treatises on the origin of the cell (for an insightful review, see Koonin 2014a). Indeed, Woese (2002, 8744) came close to admitting this possibility: "Were that organization (of a progenote as a proto-cell) simple and modular enough, all of the componentry of a cell could potentially be horizontally displaced over all time." Quite evidently, such a possibility is not so compatible with HGT but entirely compatible with HBMT as the central mechanism leading to the origin of the cell.
b) The massive amount of shared genetic material between Archaea and bacteria is more consistent with the possibility that in early evolution, HBMT via fusion and absorption had been a more powerful force than HGT via replicons jumping from one cell or organism to another.
c) HBMT subsumes Lynn Margulis and Dorion Sagan's (2002) notion of "acquiring a genome": HBMT is much more than "acquiring a genome". 11 There was a "genetic takeover" in the origin of the cell (Cairns-Smith 1987), but it came via HBMT based on endosymbiosis and endocytosis rather than genetic invasion or HGT.
6. If survival comes before replication, then molecules that can support the survival of an organism are equally critical, if not more so, compared with components of the replication (and transcription) apparatuses. Besides the bilayered lipid membrane itself as a protective apparatus, some other protective mechanisms should have been selected for very early on. Put differently, for any organism or proto-cell, some kind of stress-response machinery is necessary. This is most likely in the form of a stress response mechanism and apparatus, or "heat shock response (HSR)" as known today. Very likely, HSR was an early invention as a requisite for life on the edge, for the sake of survival.
a) Major components within HSR, such as Hsp100, Hsp90, Hsp70, Hsp60, and other small heat-shock proteins (HSPs), are highly conserved across the three domains (Richter et al 2010). For instance, DnaK, the prokaryotic version of Hsp70, shares about 60% sequence identity with its eukaryotic counterpart. Likewise, Hsp60 is present in all three domains. Most likely, HSPs were some of the earliest proteins to be firmly integrated into FUCAs. Only this logic can explain why so many HSPs are highly conserved across all three domains with such high fidelity. The fact that mitochondria possess HSPs also suggests that stress response was critical for protecting this vital organelle. In short, survival demanded that FUCAs possess a stress response system quite early on.
b) Membranes formed by amphiphiles alone are often too impermeable to ions and other bioorganic molecules. One potential solution for this challenge is to insert peptides that can form α-helices into the membranes. Indeed, lipid membranes are stabilized and regulated by peptides, and this mutualism between membrane lipids and membrane proteins is a key part of the foundation of survival. Several observations suggest that Hsp12 may be one of the molecular fossils from this period. In its soluble form, yeast Hsp12 is unfolded or unstructured (Richter et al 2010, 254). Yet, when it interacts with lipids, it becomes α-helical. More critically, Hsp12, when binding membranes, stabilizes membranes by decreasing membrane fluidity. In terms of sequence, 50% of Hsp12's sequence is made of five amino acids: Ala, Asp, Glu, Gly, and Lys. Remarkably, four of the five (i.e., Ala, Asp, Glu, and Gly) belong to the "first amino acids". More importantly, both Ala and Gly tend to form transmembrane (TM) α-helices via Ala/Gly-X-X-X-Ala/Gly, whereas Asp can stabilize α-helices by capping the α-helix TM domain (Francis 2013). Ala/Gly-X-X-X-Ala/Gly sequences also facilitate dimerization. Together, these facts suggest that machineries that regulate the cell's interaction with and response toward the environment came very early on, at least at FUCA. After all, a key aspect of the co-evolution of membrane and membrane proteins was about protecting the membrane (Mulkidjanian et al 2009). So far, no Hsp12 homolog has been discovered beyond yeast and fungi, but this may be because membrane proteins are less conserved than water-soluble proteins (Sojo et al 2016).
c) The fact that molecules that ensure survival came rather early is also consistent with the possibility that life most likely began in a stressful environment ( b) UB and UBL (in the Urm1-Uba4 system, which is much closer to THiS and THiF) might have been a 'molecular fossil' from the more ancient sulfur-transfer pathway (Hochstrasser 2009, 425;van der Veen and Ploegh 2012, 342-3). Indeed, metal and sulfur-proteins might have been some of the first sets of proteins that were recruited or assembled into the first protocell. This fact is consistent with the hypothesis that life mostly originated from a hydrothermal environment that is rich in metal and sulfur.
The fact that UB and UBLs are small proteins also suggests an ancient origin.
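The sequence features discussed above for Hsp12 (the Ala/Gly-X-X-X-Ala/Gly motif and the share of a sequence made up of Ala, Asp, Glu, Gly, and Lys) are easy to screen for computationally. The following is a minimal sketch, not an analysis performed in this article; the function names and the toy peptide are hypothetical, and the toy peptide is not the real Hsp12 sequence.

```python
import re

# Ala/Gly-X-X-X-Ala/Gly in one-letter code. The lookahead makes the search
# report overlapping hits as well (e.g., G-X-X-X-G-X-X-X-A yields two hits).
SMALL_XXX_SMALL = re.compile(r"(?=([AG][A-Z]{3}[AG]))")


def find_small_xxx_small(sequence):
    """Return (0-based start, matched 5-mer) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in SMALL_XXX_SMALL.finditer(seq)]


def residue_fraction(sequence, residues="ADEGK"):
    """Fraction of the sequence made up of the given one-letter residues."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(seq.count(r) for r in residues) / len(seq)


# Hypothetical toy peptide for illustration only (NOT the Hsp12 sequence).
toy_peptide = "MKAILLGPPGAVVVA"
motif_hits = find_small_xxx_small(toy_peptide)
adegk_share = residue_fraction(toy_peptide)
```

Applied to real sequences, such a screen only flags candidates: the presence of the motif does not by itself show that a peptide forms a TM helix or dimerizes.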

Controversies Resolved, Partly and Possibly
It is highly unlikely that we can establish the exact timing of each major evolutionary event leading toward FUCAs. But we may be able to establish the order of some of these events. Once that order is established, we can resolve some key controversies and identify more fruitful directions for future research (for earlier reviews, see Pereto 2005; Glansdorff et al 2008; Fry 2011; Koonin 2014a; 2014b; Spitzer 2018; Cantine and Fournier 2018). This section details the key controversies that our theory can help resolve. In contrast, some debates may never be confidently resolved and are hence less useful to pursue.
1. The question of which came first, the RNA world (and then an RNA-peptide world) or the peptide-lipid membrane, is one of those unfruitful controversies because it will never be firmly resolved. Our thesis suggests that vesicles came to exist very early on and that vesicles or protocells made subsequent evolutionary processes possible (Lombard et al 2012; Koonin 2014a; Deamer 2016; Cantine and Fournier 2018). Most critically, our thesis implies that the most decisive step was the merging of these two pathways, which paved the way for the eventual coupling of survival, metabolism, and replication.
2. The process leading to FUCAs and LUCAs was mostly about gains in structure and function, so that FUCAs and LUCAs could survive in more diverse environments with a more potent weaponry system. Our thesis also suggests that before FUCAs/LUCA, metabolic innovation was as critical as, if not more critical than, genetic innovation (e.g., Margulis 1981; O'Malley 2014). More likely than not, streamlining or simplification via loss of functions and genes came after the divergence of bacteria and Archaea from LUCA, when the "evolutionary temperature" had cooled down considerably (Woese 1998). As de Duve (2005b, 163, fn. 3) put it pithily, "There can be no reduction without prior 'complexification.'" Two additional facts support this ordering: 1) the SGC evolved in two phases, an early phase with early amino acids and a later phase with late amino acids (Koonin 2010; Francis 2011; 2013; 2015), and 2) many proteins in LUCA performed multiple functions (Ranea et al 2006).
3. The possibility that streamlining or simplification via loss of functions and genes came after the divergence of bacteria and Archaea from LUCA goes against the thesis that FUCA or LUCA was a proto-eukaryotic cell and that bacteria, Archaea, and Eukaryotes came to exist via genetic reduction or loss of function in proteins and genes (e.g., Glansdorff et al 2008; see also Forterre and Philippe 1999; Philippe and Forterre 1999). Recent advances also strongly suggest a Tree of Life (ToL) with bacteria and Archaea as the two primary domains (Williams et al 2003; López-García & Moreira 2015; Dacks et al 2016; Eme et al 2017). In all likelihood, the hypothesis that FUCA or LUCA was a proto-eukaryotic cell can now be confidently ruled out.
4. Intuitively, when survival comes first and merger and acquisition are pervasive, vesicles as protocells could not afford to be too picky. Indeed, some degree of heterogeneity was perhaps crucial for the evolutionary process (Szostak 2011). If so, FUCAs and LUCAs most likely had peptide-lipid proto-membranes that were heterochiral, with both isoprenoid-based and fatty-acid-based phospholipids (Wächtershäuser 2003
8. Our interpretation questions the possibility that DNA replication came from DNA viruses rather than from a FUCA/LUCA with some kind of rudimentary DNA replication machinery (Forterre 2006). More likely, the transition from RNA to DNA was accomplished by reverse transcription, and this transition was not completed before LUCA diverged into bacteria and Archaea. Hence, DNA replication evolved twice, once in bacteria and once in Archaea (Leipe et al 1999).
9. Without a (primitive) defense system (e.g., the CRISPR defense system in bacteria), primitive cells were extremely vulnerable to genetic invasion. In contrast, acquiring even a primitive defense system against genetic invasion inevitably reduces the rate of HGT. The acquisition of a defense system against genetic invasion, most likely after FUCAs, therefore marks the coming of HGT as a reduced form of HBMT.
10. Because cell division via pinching or budding could not have been very precise in the beginning, asymmetrical cell division (ACD) was perhaps an inevitable outcome. Moreover, ACD might have been selected very early on because it provides one of the daughter cells with a higher probability of survival. Such an arrangement holds an important advantage over the scenario in which the two daughter cells have the same probability of survival.
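One way to make the advantage of ACD in point 10 concrete is a simple probability comparison. The sketch below is an illustration under two assumptions that are not stated in the text: the two daughters survive independently, and asymmetry shifts survival probability from one daughter to the other without changing the mean. Under these assumptions, the chance that at least one daughter survives is 1 - (1 - p)^2 for symmetric division and 1 - (1 - p - d)(1 - p + d) = 1 - (1 - p)^2 + d^2 for asymmetric division, so the lineage gains d^2. The function names and the numbers are hypothetical.

```python
def p_at_least_one(p_a, p_b):
    """Probability that at least one of two daughters survives.

    Assumes (for illustration) that the two daughters survive independently.
    """
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("survival probabilities must lie in [0, 1]")
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


def compare_division(mean_p, delta):
    """Lineage persistence for symmetric vs asymmetric division.

    mean_p: mean survival probability of a daughter (hypothetical parameter).
    delta:  survival probability shifted from one daughter to the other.
    Returns (symmetric, asymmetric) probabilities that the lineage persists.
    """
    symmetric = p_at_least_one(mean_p, mean_p)
    asymmetric = p_at_least_one(mean_p + delta, mean_p - delta)
    return symmetric, asymmetric


# Hypothetical numbers: a harsh environment with 30% mean daughter survival.
sym, asym = compare_division(mean_p=0.3, delta=0.2)
gain = asym - sym  # equals delta ** 2 under the stated assumptions
```

The expected number of surviving daughters is the same in both cases; what ACD improves in this toy model is the probability that the lineage is not lost entirely, which matters most when survival is precarious.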

Concluding Remarks: Unsolved Mysteries and Ways Forward
This article advances some radical new hypotheses regarding the origin of the first cells that seem to be consistent with much of the evidence and help resolve some key controversies. My emphasis on merger and acquisition via endosymbiosis and endocytosis does not challenge the central mechanism of VSI but rather constitutes an extension of it. I argue that endosymbiosis and endocytosis represent a powerful force for generating variations that were not only genetic but also metabolic and perhaps structural. Moreover, before LUCAs, metabolic and structural innovations were more critical than genetic ones.
Our thesis thus calls for a more foundational extension of the Neo-Darwinian Modern Synthesis. For the Modern Synthesis, variation can only be genetic and mostly within cells.
Yet, before LUCAs, variation could be both non-genetic (e.g., metabolic and structural) and imported. Darwinian evolution was itself a product of pre-Darwinian/cellular evolution.
Moreover, endosymbiosis and endocytosis can be understood as a force of selection. The profound implications of endosymbiosis and endocytosis for evolution must therefore be more fully appreciated (O'Malley 2014; cf. Margulis 1981; 1991).
Of course, we still leave many questions unaddressed, from the divergence of the two primary domains to the origin of the eukaryotes and the evolution of individual phyla.
Nonetheless, we can now discern several exciting directions forward.
First, some interesting, if not entirely decisive, experiments can be conceived. Among others, three will be of particular interest, testing whether: 1) artificial vesicles can perform endosymbiosis; 2) artificial vesicles can indeed perform endocytosis, not only absorbing small molecules (e.g., amino acids) but also engulfing bioorganic complexes (e.g., an RNA-peptide complex); and 3) artificial vesicles can not only absorb but also exclude RNAs.
Second, one can test whether a rudimentary genetic code can indeed evolve within a population of artificial vesicles that is given a steady supply of amino acids and nucleotides and is subjected to environmental changes triggered by a wet-and-dry cycle, a freeze-and-thaw cycle, or some other physical and chemical changes. Such a rudimentary genetic code may be some probabilistic association between some amino acids and some short (oligo-)nucleotides.
Experiments along this direction may constitute powerful evidence that the model outlined here (Fig. 1) might have indeed operated in the origin(s) of LUCA with the standard genetic code.
Third, structural phylogenomic analysis can be pursued, with HSPs that are responsible for protecting the membrane (e.g., Hsp12) being of special interest. Other useful molecular markers may include UBLs and UBMLs. The possibility that the beta-grasp fold was an RNA-binding domain with connections to RNA metabolism also suggests that its origin is ancient. Of course, molecular phylogenetic analysis based on sequence will be of increasingly limited value as we reach further and further back toward the origin of life, even if we get the techniques of such analyses right (Shen et al 2017). A better way forward will be to search for structural homologies. But even this technique has its limits.

Figure 1. From FUCA to LUCA
Numbers in subscript denote different amino acids and nucleic acids. The exact matching between amino acid and nucleic acid within LUCA implies, in a metaphorical sense, that the standard genetic code (SGC) had evolved most completely by the time of LUCA. The less-than-exact matching between amino acid and nucleic acid within FUCA and the vesicles before FUCA denotes the evolutionary path of the SGC from a rudimentary form to a mature form in LUCA.
Protocells or vesicles are shown as closed circles, whereas broken vesicles are shown as broken circles. Viruses are shown in elongated or other non-circular shapes. The three-to-one ratio of viruses to cells at the stage of LUCA is meant to reflect the possibility that viruses may be the most abundant biological entities in the biosphere. The wet-and-dry cycle (on the left of the diagram) might have played a key role in driving the breaking-and-re-encapsulation cycle that facilitates merger and acquisition by vesicles. The wet-and-dry cycle part of the figure is adapted from Bruce Damer and David Deamer (2015) with permission.