Is it possible that cells have had more than one origin?

Cells occupy a prominent place in the history of life on planet Earth. The central role of cellular organization is evident from the fact that “cellular life” is often used as a synonym for life itself. Thus, most characteristics used to define cells overlap with the ones used to define life. Nevertheless, new scenarios about the origin of life are bringing alternative views to describe how cells may have evolved from the open biological systems named progenotes. Here, using a logical and conceptual analysis, we re-evaluate the characteristics used to infer a single origin for cells. We argue that some of the evidence used to support cell monophyly, such as the presence of elements of the translation mechanism and the universality of the genetic code, actually indicates a unique origin for all “biological systems”, a term used to define not only cells but also viruses and progenotes. In addition, we present evidence that at least two biochemical pathways as important as (i) DNA replication and (ii) lipid biosynthesis may not be homologous between Bacteria and Archaea. The identities observed between the proteins involved in those pathways among representatives of these two ancestral Domains are too low to indicate common genic ancestry. Altogether, these facts can be seen as an indication that cellular organization has possibly evolved two or more times and that LUCA (the Last Universal Common Ancestor) might not have existed as a cellular entity. Thus, we aim to consider the possibility that the different strategies acquired by biological systems to exist, such as the viral, bacterial and archaeal ones, originated independently from the evolution of different progenote populations.


Introduction
In the first half of the 17th century, an innovative instrument opened new perspectives into the microbiological world. Although lenses are known to have been used since ancient Greece and magnifying glasses were already in use in the 13th century, the first microscopes began to be designed in Europe around 1620. Those instruments enhanced our power to observe small-scale phenomena and biological structures that were previously impossible to observe in detail. In 1665, Robert Hooke, the prestigious English scientist sometimes referred to as "England's Leonardo" (Chapman 1996), published his book Micrographia. In that book, Hooke made the first descriptions of self-contained microscopic structures that resembled "small rooms", and he therefore named those structures cells, in reference to the Latin word cella. At that time, though, it was not clear whether the observed structures were living entities or not. An important step towards solving this issue was taken in 1676 by the Dutch scientist Antonie van Leeuwenhoek (1632-1723). He observed that the structures described as cells could move, an important feature to differentiate living and non-living entities at that time (Dunn and Jones 2004). However, it would take more than a century for scientists to acknowledge the actual relevance of cells. Finally (Baker 1955). In the following year (1839), Schwann published an even more consistent work in the form of a book. It contained the first mention of Cell Theory, through the use of the German word "Zellentheorie", and also the description that the nucleus was present in both plant and animal cells. In addition, he recognized and named the nucleolus in both cell types.
Since then, cell theory has advanced significantly and is now one of the most fundamental principles of biology. It currently comprises a broad and well-founded set of principles, such as: (i) all living beings are composed of cells; (ii) all cells arise from pre-existing cells; (iii) the cell is the fundamental unit of structure and function in living beings; (iv) the activities of organisms depend on the overall activities of their cells; (v) the flow of energy occurs within cells; (vi) cells have DNA as hereditary material; (vii) all cells have basically the same constitution. Among the points presented above, the claims that all cells (ii) arise from pre-existing ones, (vi) contain DNA and (vii) have basically the same constitution suggest that cells have had a unique evolutionary origin and, therefore, should be taken as monophyletic entities.
In 1977, the American microbiologists Carl Woese and George Fox were astonished to discover an important division in the molecular diversity of prokaryotes when analyzing RNA sequences from the small ribosomal subunit. In that study, they proposed to split the prokaryotic group (known as the Monera kingdom) into two main groups named (i) eubacteria and (ii) archaebacteria. In the following decade, with the advancement of both molecular biology techniques and cell biology studies, Woese et al. (1990) proposed a new classification for life. They created the taxonomic category of Domain, which would be superior to the kingdom classically proposed by the Swedish father of taxonomy Carolus Linnaeus in his book Systema Naturae (1735). This highest-level category would represent the first major division between basic cellular groups and, in this way, life would be composed of three major domains, namely: Bacteria, Archaea and Eukarya. This classification reinforced the idea that cells have had a single and common origin in the deep past. They further suggested that it would be possible to predict the existence of an ancestor for these clades, therefore defining the existence of a last common ancestor named LUCA, the Last Universal Common Ancestor. Since then, the search for characteristics of LUCA and explanations about how it originated have been guiding important works in the origin of life research program (Forterre et al. 2005, Mat et al. 2008, Gogarten and Deamer 2016, Martin et al. 2016). Today, however, we know that the domain Eukarya evolved from a specific group of archaea from the phylum Lokiarchaeota (superphylum Asgard) that was involved in symbiotic relationships with bacterial clades (Spang et al. 2015, Spang et al. 2019). Eukaryotes are therefore a derived clade and were not involved in the constitution of LUCA.
In any case, the existence of a conserved genetic code shared amongst all cellular entities has been identified as strong evidence for the monophyly of cells and it is used to corroborate the idea that cells appeared only once (Weiss et al. 2018 (Barbieri 2016 inspired the current discussion about whether cells actually originated a single time, and about what evidence is used to support such an inference. Although we do not aim to present any definitive view, we would like to raise arguments that may lead to a reassessment of this issue and of its implications for our understanding of the early evolution of life. Along that line of reasoning, we will review historic and contemporary evidence to account for the evolutionary history of life, in an attempt to contribute a new outlook on how we should understand the importance of cells in the extraordinary journey of the phenomenon of life on Earth over the last ~4 billion years.

Are cells monophyletic? What evidence supports this view?
The claim that cells are the basic units of life is circular with respect to the very definition of life, since life is usually assumed to be cellular life. In that sense, many characters used to support the unique origin of life on Earth are also used as evidence of a unique origin for cells. Therefore, the definition of life as cellular is a tautology. The main characteristics emphasized to propose a single origin for cells are: (a) all cells have DNA as their informational molecule, (b) all cells possess a nearly identical genetic code together with a ribosome-centered translation machinery, and (c) all cells present a semipermeable membrane that delimits their internal space and organizes the communication with the external environment. Although DNA is used by viruses, both the mechanism for translating biological information and the semipermeable membrane are exclusive features of cellular entities (Weiss et al. 2018).
In order to consider the characters that actually confirm the monophyletic status of cells, it is necessary to discern (i) which of them are exclusive elements of cells and therefore should be understood as actual evidence of their unique origin and, on the other hand, (ii) which of them belong to cells only because they are shared with other biological systems and therefore constitute evidence for the monophyly of life, being older than cells. In the latter case, those characters were probably acquired before the appearance of cells and were inherited by them. We define a biological system as any entity that presents an organic code and that is capable of producing proteins based on nucleic acid information. All biological systems are a product . The most accepted model that describes a "pre-cellular era" postulates that life was initially organized in an RNA world (Gilbert 1986). At that time, primitive RNA molecules performed both the function of information storage and the establishment of a proto-metabolism (Poole et al. 1998, Dworkin et al. 2003, Robertson and Joyce 2012, Vazquez-Salazar and Lazcano 2018). After an initial period during which all catalytic activities were performed by ribozymes, this major biological function was gradually taken over by peptidic enzymes made of chemically diverse amino acids (Cech 2009). It is well known that the RNA world theory presents theoretical difficulties, mainly related to its disregard of the importance and abundance of amino acids in early prebiotic contexts, but also due to difficulties in providing convincing scenarios for the abiotic production of nucleotides.
Therefore, alternative scenarios started to be discussed in the scientific community. The consensus reached by the scientific community that the first informational molecule was RNA-based leads to the necessary conclusion that DNA was later selected as the main nucleic acid for cells, possibly due to its higher stability and to a double helix structure that makes replication easy and provides a backup copy of the information. But then, how do these facts relate to the monophyly of cells? Since all cells today use DNA for information storage, the simplest conclusion is probably that cells share a common evolutionary history with the first forms that co-opted DNA as their main informational molecule. On that matter, it is important to notice that the general mechanism of replication is similar among cellular organisms, being based on: (i) semiconservative replication, (ii) bidirectional replication, (iii) the need for an RNA primer to start the process, and (iv) the presence of nucleases, polymerases and ligases that participate in the process of removing RNA primers. On the other hand, it should be noted that the proteins involved in those processes have a low level of identity and similarity (Leipe et al. 1999) between the basal cell groups, i.e., Bacteria and Archaea. This observation led to the alternative proposal that the last universal common ancestor (LUCA) might have had an RNA-based genome. In that case, it has been proposed that the basal domains might have independently acquired the capacity to store biological information in the form of DNA (Leipe et al. 1999, Forterre 2005, Forterre 2006).
Although the topic is still a subject of debate in the scientific community, the available evidence suggests that the storage of information in DNA molecules cannot be used as a reliable indicator for the monophyly of all cells. The main arguments for an independent origin are (i) the low similarity of the proteins from the replication machinery and (ii) the fact that DNA is found as an informational molecule in several clades of viruses. Therefore, we cannot refute the idea that Archaea and Bacteria could have acquired their replication systems from different progenote populations. Those populations could even share a common ancestor, which would explain the overall similarity in the mechanism, but not a very recent one, given the dissimilarity observed in their sequences.
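The notion of "low identity" between replication proteins can be made concrete with a simple calculation. The following Python sketch is only an illustration and is not the method used in the studies cited above: it computes the percent identity over the gap-free columns of an already aligned pair of sequences. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from any real protein.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs are assumed to be rows of the same pairwise alignment,
    with '-' marking gaps (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns without gaps in either sequence
    matches = 0   # columns where both residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    # Return 0.0 when there is no gap-free column to compare.
    return 100.0 * matches / compared if compared else 0.0


# Toy, made-up aligned fragments (hypothetical, for illustration only).
toy_a = "MKV-LA"
toy_b = "MRVGLA"
print(percent_identity(toy_a, toy_b))
```

A value computed this way depends on the alignment itself and on how gaps are treated, so it should be read as a rough descriptive measure rather than as a test of homology.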

B) Ribosomes, genetic code and the translation machinery
The (i) presence of ribosomes, the (ii) existence of a conserved genetic code and the (iii) sharing of the translation machinery are characters used as strong indicators that cells have had a single and common origin. These features are found in all known organisms. The ribosome and the genetic code are fundamental elements of the translation system and they are intimately related to cell autonomy, another character of extreme relevance in any attempt to characterize cells (Varela et al. 1974). In recent years, several studies have tried to understand the origin and evolution of the ribosome (Petrov et al. 2015, Bowman et al. 2020). In all of them, there is a consensus about the fundamental relevance of this complex in the initial organization of biological systems (Yonath 2009, Tamura 2011, Petrov et al. 2014, Farias et al. 2014a, Farias et al. 2014b, Farias et al. 2017, Root-Bernstein and Root-Bernstein 2015, Prosdocimi et al. 2020b). Even if many of these works still dispute alternative models and scenarios for ribosomal evolution, it is currently accepted that ribosomal RNA appeared in the very primordial stages of the origin of life. That is to say that rRNA emerged when biological When we consider that ribosomes were among the first molecular structures to self-organize, it becomes clear that the emergence of the genetic code must have occurred immediately after or concomitantly with ribosomal maturation. Thus, the genetic code emerged when biological systems still existed as purely molecular systems; that is, before the appearance of cells (Hartman and Smith 2019). In addition to the evidence for the antiquity of ribosomes (Yonath 2009), some studies indicate that tRNAs, the molecules responsible for the connection between the information contained in nucleic acids and the amino acids, were among the oldest biological molecules we know (Caetano-Anolles and Caetano-Anolles 2016, Farias et al. 2014a, Farias et al. 2016). The antiquity of tRNA molecules is gaining prominence in the quest for the initial organization of biological systems, since they are the simplest nucleic acid structures used by such systems. Some models suggest that both the large subunit (Prosdocimi et al. 2020b) and the small ribosomal subunit, as well as the first mRNAs, were most likely formed by the junction of proto-tRNAs (Farias and Jose 2020). All of these ideas reinforce the view that the translation system was one of the first systems to self-organize, much earlier than the time at which cells appeared, still in the progenote era. This organization of the protein synthesis apparatus took place long before the emergence of cells and therefore cannot be used as evidence for the monophyly of cells.

The origin of viruses and its relationship with the translation system
When considering the possibility that the mechanism of protein synthesis cannot be used to define cell monophyly, we must also consider contemporary models for the origin and evolution of viruses. Viruses are very particular biological systems because, although they do not have ribosomes, they carry biological information encoded in either RNA or DNA sequences, single or double stranded, and they use the same genetic code present in cells. For many years, the similarity between the cellular and viral genetic codes has been explained by the escape model, in which viruses would have been molecular particles that escaped from the cellular context and acquired their own "independence", even though they remained dependent on cells for reproduction (Hendrix et al. 2000). However, with the increase in the availability of genomic data, new scenarios have been proposed (Nasir and Caetano-Anollés 2015, Koonin et al. 2015).
Among the scenarios that have been gaining popularity, one considers the origin of some viral groups before the emergence of cells (Nasir and Caetano-Anollés 2015, Farias et al. 2019). It is worth noting that no current proposal for the origin of viruses suggests a single origin for the whole group. On that matter, some of them suggest that the viral form should be seen as a life-sustaining strategy that may have appeared several times throughout evolutionary history. However, when one considers that some viral groups must have been contemporary with the progenotes, the fact that these biological entities have the same genetic code used by cells must be reinterpreted. In that context, the presence of a nearly universal genetic code must be seen as evidence for the monophyly of all biological systems.
It therefore becomes necessary to understand clearly that the origin of biological systems happened long before the origin of cells. Thus, the genetic code present in both viruses and cells cannot be used as evidence for the unique origin of the latter. Once scientific evidence confirms that at least one group of viruses appeared before cells, the presence of a genetic code will definitely be seen as evidence for the monophyly of all biological systems. Thus, many groups of viruses and cells probably descended from populations of progenotes living as open systems that shared the translation machinery to express themselves and reproduce.
However, even though the existence of a universal genetic code cannot be used as evidence for cell monophyly, we need to consider that ribosomes are nowadays found only within cells. In this way, the encapsulation of a translation system can indeed be considered an indication of a single origin. However, when we consider a scenario prior to the emergence of cells, it becomes even more difficult to imagine how a translation system could have appeared within a lipid envelope. It has been proposed that the evolution of open systems of quasi-species led to the compartmentalization of molecular subsystems (Woese 1998). In this sense, the systems that maintained and incorporated the translation system would have gained autonomy and been capable of giving rise to what we know as cells (Koonin et al. 2006). It is possible that some systems were compartmentalized in simpler wraps without the translation system, not developing autonomy and remaining dependent on the ribosomes present only in cellular systems: such systems would have formed the first viral groups.

C) Could membranes represent definitive evidence for the unique origin of cells?
If the scenario mentioned above is considered possible, a further question is: was the translation system incorporated by a single membrane system or by multiple ones? If we could show that the translation machinery was incorporated only once, this would be unequivocal evidence for a single origin of cells. On the other hand, we may wonder whether there is a possibility that the translation system was co-opted by more than one population of those pre-cellular progenote subsystems. In the latter case, we would conclude that the entities known as cells may have had at least two origins and, thus, would not form a monophyletic group.
In order to search for answers to this question, we need to analyze the existing compartmentalization systems in detail, that is, the membrane systems that led to the individualization process in biological systems. Such systems would have operated in the transition between the world of progenotes and the cellular world. In other words, they represent the transition between (i) a molecular world of open biological systems and (ii) a cellular world of closed biological systems. This is equivalent to a transition from the era of progenotes and FUCA to the era of cells and LUCA. At this point, we present the still unsolved dispute about whether phospholipids were used for encapsulation once or twice during the initial emergence of cellular systems: this dispute is referred to in the scientific literature as the "lipid divide" issue (Villanueva et al. 2017, Jordan et al. 2019, Mencía 2020). When one studies the constitution of the plasma membrane in the two basal groups of cellular organisms (Bacteria and Archaea), it is observed that, even though both present lipoprotein membranes, the very constitution of the membranes is radically different between the two groups.
To start with, (i) the bacterial membrane is composed of a lipid bilayer containing phospholipids in which the alcohol and the fatty acid are linked through ester-type chemical bonds. In archaeal membranes, however, the chemical bond that binds the alcohol to the fatty acid is of the ether type (Lombard et al. 2012). In addition to this difference in bond type, (ii) the alcohol used in the structure of the phospholipids is different, as Bacteria use glycerol-3-phosphate and Archaea use glycerol-1-phosphate. Besides, (iii) the incorporated fatty acids are quite distinct, since the fatty acids in Archaea usually present a side chain (Sojo 2019). Finally, (iv) the proteins used in each of the basal groups to metabolize lipids have low similarity and cannot be considered homologous (Koga et al. 1998, Sojo 2015).

Did a cellular common ancestor ever exist?
Therefore, we believe we have considered a couple of aspects that call into question whether a cellular common ancestor such as LUCA ever existed as originally proposed. Alternatively, could we consider that two ancestors, one for Bacteria and one for Archaea, evolved independently? If we consider this idea possible, it is mandatory that the translation system was inherited by both ancestors, as well as other progenote systems that would form the pathways that operated the basal metabolism of cells. was clearly a first ancestor for entities sharing a chemical and informational code, i.e., those in which the translation system emerged and matured through the self-organization of an organic code.
Then, the Woesian LUCA, seen as a cellular ancestor and sometimes better referred to as LUCellA (Last Universal Cellular Ancestor; Nasir, Kim and Caetano-Anollés 2012), possibly never existed. In that case, it does not seem possible to define any common ancestor between Bacteria and Archaea that would have existed before cellularization, indicating that compartmentalization was achieved through parallel evolutionary routes. Nevertheless, we must account for the fact that the metabolism of carbohydrates, amino acids and other intermediate metabolites was most likely inherited together with the translation machinery from an ancestral clade of progenotes. Thus, it seems more appropriate to abandon the idea of a cellular LUCA or, alternatively, to consider the existence of at least two different cellular ancestors, or two LUCAs: one that would have given rise to Bacteria, and another that would have given rise to Archaea. Furthermore, we consider the possibility that other progenote populations originated one or more viral clades.

Final considerations
Since the consolidation of Cell Theory as one of the strongest theories in biology, the assumption that cells originate solely from preexisting cells has been established as one of the most fundamental bases for biological knowledge, a true and unshakable fact.
However, when we aim to understand the origin of living organisms on our planet and the diversity observed today, several questions about the origin of cellular systems remain open and lack a definitive explanation. Some particular scenarios for the origin of both biological systems and cells lead us to suggest that cellular organization may have had more than one origin at the beginning of life.
Hence, when one takes seriously the possibility that cells could be polyphyletic, it becomes necessary to understand that the natural history of cells is different from the natural history of biological systems. In that sense, we use the term biological systems to refer to any system made of chemical molecules with the capacity to form organic codes, such as the genetic code, in which amino acids are encoded by codons in a nucleic acid. Those biological systems therefore include not only cells and viruses, seen as different contemporary strategies for maintaining organicity, but also the pre-biotic open systems known as progenotes. And even though these life-enduring strategies have most likely emerged more than once, they have in common the fact that they are based on the same molecular language, since there is strong evidence that the organic phenomenon appeared only once on our planet. Thus, we presented earlier the proposal that biological systems emerged from a chemical symbiosis that started with an interaction between nucleic acids and amino acids. This interaction stabilized these heteromolecular systems and increased the chance that they could be maintained and persist. From this starting point, the system probably evolved until the maturation of a ribonucleoprotein machinery that gave rise to the ribosome, generating the first universal common ancestor ( proteins may have produced more sophisticated forms for these wraps, evolving to be able to control lipids by biochemical pathways and allowing the emergence of a cytoplasm. The fact that we observe two chemically different pathways to control the link between lipids and the formation of membranes in the basal prokaryotic groups suggests that cells may have appeared more than once.
Besides, although today we know only two different pathways capable of producing cell membranes (suggesting that this mechanism may have appeared twice), there is the possibility that other, less efficient pathways to produce proteolipidic envelopes originated and later became extinct. It is also possible to assume that different viral clades appeared at that time through the formation of different types of protein capsids. Because capsids are simpler than membranes, it is possible to suppose that such biological systems were even more common in the early days of life on Earth. This parallels the extreme abundance of viruses seen today. Thus, it is possible to suppose that different strains of organisms with a viral-type strategy co-existed before the emergence of cells.
Thus, when we consider that life can be defined through a process carried out by overlapping codes in multiple layers (Prosdocimi 2019, Farias et al. 2020), we realize that some arguments often used to define cells, such as the presence of elements of the translation system and the genetic code, should more correctly be used to define the monophyly of biological systems. Therefore, both cells and viruses possibly originated in the early days of life on our planet from the evolution of quasi-species entities known as progenotes (Figure 1).

Figure 1:
From pre-biotic to organismal age. The building blocks for life were formed in a pre-biotic age. After a chemical symbiosis, nucleic acids and amino acids started to interact for the benefit of both, creating stable structures that would evolve into the ribosome. The first universal common ancestor (FUCA) is born from this chemical symbiosis and matures when the genetic code is initially formed. Open biological systems evolve from FUCA, creating progenote populations. Each population matures different biochemical pathways independently. Some populations develop the capability to produce peptidic wraps capable of maintaining, protecting and storing genetic material. Progenotes capable of binding and processing carbohydrates, amino acids and co-factors evolve. Nucleic acid binding proteins organize the formation of two different types of replication machinery. Lipid binding proteins organize the formation of two different types of membrane and self-organize to form the different basal groups of prokaryotes. Viruses descend from progenotes and are also produced by modules escaping from bacterial and/or archaeal groups. Eukaryotes evolve from the symbiosis between archaeal (Lokiarchaeota) and bacterial groups. The viral strategy is also produced by modules escaping from eukaryotic genomes.
Finally, due to (i) the differences observed in the basic constitution of bacterial and archaeal membranes, to (ii) the differences observed in the constitution of viral capsids and considering (iii) the differences observed in the replication of nucleic acids by bacteria, archaea, and viruses, we suggest that these biological systems architectures