Regime Shift of Genome Size Crossing the Chasm of Eukaryogenesis

The origin of the nucleus remains a great mystery in life science, although nearly two centuries have passed since the discovery of nuclei. To date, studies of eukaryogenesis have focused largely on micro-evolutionary explanations. Here, we examined macro-patterns of C-values (the total amount of DNA within the haploid chromosome set of an organism) for over 110,000 species and of chromosome numbers for over 11,000 species, together with their potential links with the state of atmospheric oxidation over geological time. Eukaryogenesis coincided with an increase in genome size of more than 2.5 orders of magnitude from prokaryotes to eukaryotes, and with a rapid rise in atmospheric oxidation. This suggests that eukaryogenesis resulted from a regime shift of genomes driven by oxidation-induced complexification and structuralization (e.g., chromatin packing).


Introduction
Eukaryogenesis, the entire process by which the defining traits of eukaryotic cells arose in the lineage that eventually gave rise to all present-day eukaryotes, has long been of great interest to evolutionary biologists 1,2, with multiple hypotheses proposed to date (e.g., various symbiotic models and the autogenous and exomembrane hypotheses) (table S1) 3,4. Undoubtedly, eukaryogenesis was a product of both macro- and micro-evolution in adaptation to changing environments. Although the evolutionary complexification (increase in size) of genomes is almost certainly related to the origin of the nucleus in some way, it is not clear how to evaluate its relation to eukaryogenesis. The acquisition of mitochondria capable of producing adenosine triphosphate (ATP) through oxidative phosphorylation has been proposed as a prerequisite to eukaryogenesis [5][6][7]. However, little is known about the possible link between genome complexification and atmospheric oxidation (with its high energy efficiency), and its potential role in eukaryogenesis. Because this study focuses mainly on genomic evolution from prokaryotes to eukaryotes, a relatively simple typology of five biological kingdoms was applied, although more sophisticated typologies exist.
In the present study, we collected C-values for 110,431 species (covering bacteria, archaea, fungi, plants, and animals) (text S1) as well as chromosome number data for 11,792 species, and explored their potential evolutionary relationships with the state of atmospheric oxidation over geological time. The term "C-value" refers to the amount (in picograms) of DNA contained within a haploid nucleus of an organism. It is widely used to reflect the expected genetic complexity of a species, and in practice it is often used interchangeably with genome size 8. A chromosome is a complex linear organization of a DNA molecule aided by various packaging proteins (histones) and chaperone proteins. Each chromosome contains part or all of the genetic material of an organism. Within a tiny cell, genomic complexification and structuralization were likely inseparable processes, which perhaps led to the emergence of eukaryotes upon reaching a threshold (Fig. 1).
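Because C-values are reported in picograms whereas sequenced genomes are usually reported in base pairs, the two scales often have to be reconciled before data from different sources can be compared. The following is a minimal sketch of that conversion. It assumes the commonly used approximation that 1 pg of double-stranded DNA corresponds to about 0.978 x 10^9 bp; this factor and the function names are illustrative assumptions and are not taken from this study.

```python
import math

# Assumed conversion factor (not from this study): 1 pg of double-stranded
# DNA is commonly approximated as 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def pg_to_bp(c_value_pg: float) -> float:
    """Convert a C-value in picograms to an approximate number of base pairs."""
    return c_value_pg * BP_PER_PG


def bp_to_pg(genome_bp: float) -> float:
    """Convert a genome size in base pairs to an approximate C-value in picograms."""
    return genome_bp / BP_PER_PG


def log10_c_value(genome_bp: float) -> float:
    """Return log10 of the C-value (pg) for a genome size given in base pairs."""
    return math.log10(bp_to_pg(genome_bp))


# Hypothetical example: a 5 Mbp prokaryotic genome expressed on the pg scale.
example_pg = bp_to_pg(5_000_000)
print(f"{example_pg:.5f} pg, log10 = {log10_c_value(5_000_000):.2f}")
```

Working on a log10 scale keeps prokaryotic and eukaryotic genome sizes, which differ by orders of magnitude, on one comparable axis.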

Fig. 1. Macro patterns of genome complexity (C-values) for the evolutionary lineages of various biological kingdoms. (A) C-value distributions of the kingdoms and percentage atmospheric oxygen concentration (pO2) along geological time. The first appearance time was estimated at ca. 3.77 Bya for archaea 16 19 , and 0.7 Bya for animals 20. (B) Relationship of C-value with chromosome number, with fitting lines based on a potential landscape model. The values after each kingdom name give the corresponding median (minimum, maximum) of the chromosome numbers. It should be noted that the chromosomes of cyanobacteria are an unstable circular form intermediate between chromatin and true chromosomes 21. The evolution of the structure, quantity, and packing-associated proteins of chromosomes is also shown along a gradient of organismal complexity and oxidation extent.

Materials and Methods
The Potential Landscape Model approach (also termed the stability landscape) 9 has been used to infer the existence of stable states across a given state space from noisy data 10,11. This approach assumes an underlying stochastic system with a potential function:

$$dz = -U'(z)\,dt + \sigma\,dW$$

where z represents the focal state variable (here chromosome size), U(z) represents the potential function, σ represents the noise level, and dW is a noise term. The equilibrium values were calculated by determining the local minima and maxima of the resulting potential landscape. In https://cvalues.science.kew.org/ ; Animals 15 : http://www.genomesize.com.
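For a one-dimensional stochastic system with potential U(z) and noise level σ, the stationary probability density satisfies p(z) ∝ exp(-2U(z)/σ²), so the potential can be recovered from data, up to an additive constant, as U(z)/σ² = -½ ln p(z). The sketch below illustrates this idea with a Gaussian kernel density estimate. It is not the authors' implementation: the synthetic bimodal data, the bandwidth, the grid, and all function names are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist


def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def scaled_potential(density, floor=1e-300):
    """Return U(z) / sigma^2 = -0.5 * ln p(z) for each density value."""
    return [-0.5 * math.log(max(p, floor)) for p in density]


def local_extrema(values):
    """Indices of strict interior local minima and maxima of a sequence."""
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima.append(i)
        elif values[i] > values[i - 1] and values[i] > values[i + 1]:
            maxima.append(i)
    return minima, maxima


def synthetic_bimodal_sample(n_per_mode=200):
    """Deterministic illustrative data: evenly spaced quantiles of two normals.

    The two modes are hypothetical and do not reproduce the paper's data.
    """
    low = NormalDist(mu=-2.5, sigma=0.3)
    high = NormalDist(mu=0.5, sigma=0.4)
    probs = [(k + 0.5) / n_per_mode for k in range(n_per_mode)]
    return [low.inv_cdf(p) for p in probs] + [high.inv_cdf(p) for p in probs]


def stable_states(samples, grid, bandwidth):
    """Return (potential minima, potential maxima) as positions on the grid."""
    potential = scaled_potential(gaussian_kde(samples, grid, bandwidth))
    minima, maxima = local_extrema(potential)
    return [grid[i] for i in minima], [grid[i] for i in maxima]


grid = [-4.0 + 0.05 * i for i in range(121)]  # covers -4.0 to 2.0
wells, barriers = stable_states(synthetic_bimodal_sample(), grid, bandwidth=0.3)
print("stable states near:", wells)
print("unstable threshold near:", barriers)
```

Minima of the estimated potential correspond to stable states and the maximum between them to an unstable threshold. The number of detected states depends on the bandwidth, so in practice it should be checked across a range of smoothing levels.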

Results and discussion
There is a consensus that atmospheric oxidation had profound impacts on the evolution of life on Earth. Changes in the atmospheric oxygen level (pO2) and the C-value distributions of the five kingdoms over geological time are presented in Fig. 1A [22][23][24][25]. The median C-value shows large differences between prokaryotes and eukaryotes, with the lowest being for According to the estimated first appearance time of each kingdom 16-18, 20, 26, 27, the LC group corresponds to the low pO2 period, the MC group to the moderate pO2 period, long after the First Great Oxidation Event (2.4 billion years ago, Bya), and the HC group to a period slightly before or during the Second Great Oxidation Event (0.8-0.5 Bya). The origin of life on Earth (3.85 Bya) and the appearance of the first eukaryote (2.1-1.6 Bya) are separated by a lag of at least 1.75 billion years 28.
However, the lag is only 0.3 billion years from fungi to plants 18,19 and 0.5 billion years from plants to animals 19,20, suggesting a potential role of atmospheric oxidation in promoting diversification and genome complexity, and thus speciation. Our view is supported by the idea that energy production through the acquisition of mitochondria could have enabled eukaryogenesis 5,6,29.
Interestingly, the C-value distribution of prokaryotes (including bacteria and archaea) is heavily crowded at the high-value end but scattered at the low-value end, whereas the distributions of the three eukaryotic kingdoms are scattered at both ends (Fig. 1A). This suggests an apparent upper barrier to C-values in prokaryotes, i.e., a shift from prokaryote to eukaryote could take place only if a C-value threshold was passed. Taking the highest 5% value at this high-value end of prokaryotes, the C-value threshold for the prokaryote-to-eukaryote shift can be defined as -2.15
The relationship between chromosome number and C-value showed a sigmoid shape (Fig. 1B). A test with a potential landscape model following Livina et al. (2010) 9 suggested two contrasting states, with fungi and bacteria (represented by cyanobacteria, the taxon with chromosome numbers reported) in the low-value group and plants and animals in the high-value group. This suggests that the transition from the lower to the higher group takes the form of a regime shift. Additionally, the fact that eukaryotes have much larger genome sizes than prokaryotes suggests that accidental symbioses could not make a nucleus. In other words, the genomic complexity of eukaryotes could be greatly increased only if the evolution of the nucleus had gone through a long sequential process.
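One way to read the threshold defined from the highest 5% of prokaryotic C-values is as the 95th percentile of the prokaryotic distribution on a log10 scale. The sketch below shows that calculation under this reading. The log10 scale, the linear-interpolation percentile, and the function names are assumptions for illustration; the text does not specify the exact estimator used.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() requires at least one value")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must lie between 0 and 100")
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def upper_threshold_log10(c_values_pg, q=95.0):
    """q-th percentile of log10-transformed C-values (pg).

    Interpreting "the highest 5%" as the 95th percentile is an assumption.
    """
    return percentile([math.log10(c) for c in c_values_pg], q)
```

Calling `upper_threshold_log10` on the prokaryotic C-values alone would give a candidate upper bound for the prokaryotic regime, which can then be compared with the lower tail of the eukaryotic distributions.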
The related evolution of the structure (i.e., circular vs. linear), quantity (number), and packing-associated proteins of chromosomes along gradients of organismal complexity and oxidation extent is further summarized in Fig. 1B. Atmospheric oxidation likely promoted the complexification and structuralization of genomes and hence triggered the origin of the nucleus after a sharp increase in genome size upon crossing a threshold, as supported by the following facts or inferences. First, atmospheric oxidation provided the energetic basis for genomic complexification 30,31. Second, there is an upper limit to genome size in prokaryotes. Third, the formation of nuclear membranes and the increase in chromosome number enabled orderly management and control of a biochemical system of increasing genomic complexity. Fourth, some proteins were neofunctionalized to enable highly efficient chromatin packing 32,33. The associated packing proteins are mainly simple nucleoid-associated proteins in prokaryotes, whereas various complex ones (e.g., histones and cohesins) are present in eukaryotes 33,34. Eukaryotic chromosomes are therefore highly condensed and linked in an orderly manner to ensure that the genome fits inside the nucleus. For example, the chromosome packing ratio, i.e., the length of uncoiled DNA relative to that of a metaphase chromosome, is as high as 10,000:1 in humans (Annunziato, 2008); this ratio can be as high as 60,400 for some plant species 35 and over 65,600 for fungi 36, based on chromosome length and genome content 35.
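The packing ratio can be sketched as a short calculation: the contour length of the uncoiled DNA divided by the length of the condensed chromosome. The example below assumes the standard textbook rise of about 0.34 nm per base pair for B-form DNA. The chromosome used in the example (150 Mbp, 5 µm at metaphase) is hypothetical and is not a measurement from this study.

```python
# Assumed axial rise per base pair of B-form DNA (textbook value, in nm).
NM_PER_BP = 0.34


def dna_contour_length_um(n_bp: float) -> float:
    """Length of fully uncoiled DNA in micrometres for a given number of base pairs."""
    return n_bp * NM_PER_BP / 1000.0


def packing_ratio(n_bp: float, chromosome_length_um: float) -> float:
    """Ratio of uncoiled DNA length to condensed chromosome length."""
    if chromosome_length_um <= 0:
        raise ValueError("chromosome length must be positive")
    return dna_contour_length_um(n_bp) / chromosome_length_um


# Hypothetical chromosome: 150 Mbp of DNA, 5 micrometres long at metaphase.
ratio = packing_ratio(150_000_000, 5.0)
print(f"packing ratio of about {ratio:,.0f}:1")
```

For these hypothetical inputs the uncoiled DNA is about 51,000 µm long, giving a ratio of roughly 10,200:1, the same order of magnitude as the value quoted for humans.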

Conclusions
In conclusion, a regime shift of genome size occurred across the chasm of eukaryogenesis. Genomic complexification and structuralization (especially chromatin packing) driven by elevated atmospheric oxidation were likely a key macro-evolutionary force in the protracted process of eukaryogenesis, which was completed in concert with micro-evolution (e.g., symbiosis and random mutation) under natural selection. Eukaryogenesis paved the way for rapid speciation, leading to the adaptive radiation of various life forms on Earth. Further efforts should be made to integrate multidisciplinary evidence to reveal in more detail how processes at different evolutionary levels may have interacted synergistically to complete eukaryogenesis in sequential, stepwise hierarchies.

Competing interests
The authors declare no competing interests.

Table S1. A list of the prevalent hypotheses on the origin of the nucleus
Name    Connotation    Reference

Viral Eukaryogenesis Membrane Fusion Hypothesis
The eukaryotic nucleus evolved from a complex DNA virus. The viral ancestor of the nucleus is proposed to have been a complex enveloped DNA virus similar to the present-day poxviridae/ASF viruses, except that its host was an archaeon. It possessed a large linear chromosome and could enter the archaeal host cell by membrane fusion.

37
The Viral Eukaryogenesis (VE) Hypothesis
The first eukaryotic cell was a multimember consortium consisting of a viral ancestor of the nucleus, an archaeal ancestor of the eukaryotic cytoplasm, and a bacterial ancestor of the mitochondria. A cell wall-less archaeon and an α-proteobacterium established a syntrophic relationship, and then a complex DNA virus permanently lysogenized the archaeal syntroph to produce a consortium of three organisms that evolved into the eukaryotic cell.

The nucleus derives from the incorporation of an archaeon within a bacterium. A eubacterial host engulfed an archaebacterial endosymbiont, which underwent a transformation into the nucleus.

(a) The key event in the origin of the eukaryotic cell is postulated to be the engulfment of an 'eocyte' archaebacterium by a Gram-negative eubacterium that presumably lacked a cell wall. (b) As the membrane of the host surrounded the guest species, its own membrane (containing ether-linked lipids) became redundant and was lost. (c) Eventual separation of the internalized membrane from the plasma membrane led to the formation of the nuclear envelope and the endoplasmic reticulum. The formation of these new compartments was preceded or accompanied by duplication of the genes for the chaperone proteins (e.g., hsp70, hsp90, etc.). The transfer of the host genome to the newly formed nucleus, and an assortment of the genes from the two parents, led to the formation of the ancestral eukaryotic cell.

Chimeric Symbiotic Models
Archaea and bacteria would be the only two surviving lines descending from the last common ancestor, and eukaryotes would derive from the merging of the two. The incorporation of the future mitochondrion into an archaeal host triggered eukaryogenesis.

Hydrogen Hypothesis
A hydrogen-mediated symbiosis between a hydrogenoclastic methanogenic archaeon and a hydrogen-producing alphaproteobacterium, with hydrogen being used to reduce the CO2 also released by the bacterium for methanogenesis.

43
Entangle-Engulf-Endogenize (E3) Model
An anaerobic archaeal host lives in symbiosis with a sulfate-reducing bacterium, which takes up the fatty acids and hydrogen produced by the host's amino acid metabolism. Subsequently, the host enters a symbiosis with another, aerobic bacterium that gives rise to the mitochondrion, so that three prokaryotic species are involved. A dual symbiosis, in microoxic environments, of an Asgard archaeon that degraded amino acids to short-chain fatty acids and hydrogen with a sulfate-reducing bacterium (SRB) and an aerobic organotrophic alphaproteobacterium that scavenged toxic O2. As the consortium progresses towards increasingly oxic zones, the interaction with the alphaproteobacterium becomes stronger until it is engulfed. The SRB symbiosis is transient and eventually lost.

Reverse Flow Model: An Updated Symbiogenetic Model
A metabolic syntrophy between anaerobic ancestral Asgard archaea and facultative anaerobic alphaproteobacteria has provided the selective force for the establishment of a stable symbiotic interaction that has subsequently led to the origin of the eukaryotic cell.

Updated hydrogen hypothesis
Hydrogen could transfer from the bacterial symbiont, a δ-proteobacterium that is a fermentative organoheterotroph, to a hydrogen-dependent autotrophic Asgard archaeon (Wukongarchaeota) host.

Syntrophic Chimeric Fusion Hypothesis
A chimeric cell evolved via symbiogenesis by syntrophic merger between an archaebacterium and a eubacterium. The archaebacterium, a thermoacidophil, generated hydrogen sulfide to protect the eubacterium. By eubacterial-archaebacterial genetic integration, the chimera, an amitochondriate heterotroph, evolved. This "earliest branching protist" that formed by permanent DNA recombination generated the nucleus as a component of the karyomastigont.

46
The Symbiosis of Pyrococcus into γ-Proteobacteria
The nucleus emerged from an archaeal endosymbiont (Pyrococcus-like), which was engulfed by a γ-proteobacterium.
47

Symbiogenic Theory
A symbiosis between a spirochaete and an archaebacterium without a cell wall (most likely Thermoplasma-like in the author's view), leading to both the eukaryotic flagellum and the nucleus.
48

Koonin and Yutin's Hypothesis
The archaeal ancestor of eukaryotes was a complex form, rooted deeply within the TACK (Thaumarchaeota, Aigarchaeota, Crenarchaeota, Korarchaeota) superphylum, that already possessed some quintessential eukaryotic features, in particular a cytoskeleton, and perhaps was capable of a primitive form of phagocytosis that would facilitate the engulfment of potential symbionts.
49

Coevolutionary Theory
The origin of the nucleus depended on the prior evolution of a primitive endomembrane system and a primitive mitosis, both brought about by and associated with the origin of phagocytosis. The concerted origins of the endomembrane system and cytoskeleton, subsequently recruited to form the cell nucleus and coevolving mitotic apparatus.

50
Exocytosis Hypothesis
The first endomembranes to evolve during eukaryote evolution had a secretory, and not a phagocytic, function. Eukaryogenesis was initiated not by phagocytosis, but by the evolution of secretion by exocytosis, with phagocytosis arising later. The establishment of a secretory endomembrane system happened before or in parallel with the evolution of a nucleus.

Cavalier-Smith's Hypothesis
Both the nucleus and the endoplasmic reticulum come from invagination of the plasma membrane. Eukaryotic cells evolve directly from bacteria and do not need endosymbiosis. An autogenous (non-symbiotic) origin of a phagocytosing amitochondriate eukaryote (an archezoon) via point mutational changes leading to a host that does not need a mitochondrion at all to enjoy its phagocytotic lifestyle, but acquires one nonetheless.

52
Autogenous Models
Nucleus and cytoplasm evolved from a single prokaryotic lineage.
4

Gould and Dring's Hypothesis
Endospore formation of Gram-positive bacteria resulted in the origin of the nucleus. The protoplast of a single cell divides during endospore formation in such a manner that the cell engulfs a portion of its own cytoplasm, which then becomes surrounded by a double membrane, resulting in the cell's nucleus.

53
Exomembrane Hypothesis
The nucleus originated from a single early cell, which then evolved a second, outer cell membrane.
3