The eukaryotic last common ancestor was bifunctional for hopanoid and sterol production

Steroid and hopanoid biomarkers can be found in ancient rocks and, in principle, can give a glimpse of what life was present at that time. Sterols and hopanoids are produced by two related enzymes, though the evolutionary history of this protein family is complicated by losses and horizontal gene transfers (HGT), and appears to be widely misinterpreted. Here, I have added sequences from additional key species, and re-analysis of the phylogeny of squalene-hopene cyclase (SHC) and oxidosqualene cyclase (OSC) indicates a single origin of both enzymes among eukaryotes. This pattern is best explained by vertical inheritance of both enzymes from a bacterial ancestor, followed by widespread loss of SHC, and two subsequent HGT events to ferns and ascomycetes. Thus, the last common ancestor (LCA) of eukaryotes would have been bifunctional for sterol and hopanoid production. Later enzymatic innovations allowed diversification of sterols in eukaryotes. Contrary to previous interpretations, the LCA of eukaryotes potentially would have been able to produce hopanoids as a substitute for sterols in anaerobic conditions. Without invoking any other metabolic demand, the LCA of eukaryotes could have been a facultative aerobe, living in unstable conditions with respect to oxygen level.


Introduction
Cells maintain fluidity of membranes through hopanoids and sterols [Rohmer et al., 1979], including molecules like cholesterol. Compared to many other biological molecules, the structure of hopenes and sterols makes them extremely stable over long time scales [Ourisson and Albrecht, 1992]. Through geological processes, these two types of compounds become chemically modified, being converted into reduced forms called hopanes and steranes, respectively [Brocks and Pearson, 2005]. These molecules (often referred to as biomarkers) may persist for millions of years, and are detectable in some ancient rocks, giving a window into what organisms were present at the time the rock was formed.
Hopanoids and sterols are produced by a series of shared enzymatic steps, but diverging at one critical step where oxygen is required to produce sterols but not hopenes [Fischer and Pearson, 2007, Nes, 2011]. The enzyme squalene monooxygenase (SQMO, also called SQE for squalene epoxidase, or ERG1 from the yeast homolog involved in the ergosterol biosynthetic pathway) requires oxygen to produce an epoxide on squalene, changing the downstream specificity and products [Nes, 2011]. Following the oxygen-dependent step, one of two enzymes then forms the multi-ring structure, either squalene-hopene cyclase (SHC) for hopanoids or oxidosqualene cyclase (OSC) for sterols. Despite the different activity, the two enzymes are nonetheless similar at the sequence level. For instance, the bacterium Methylococcus capsulatus is one of the few known organisms that possesses both enzymes, which show 28% identity to each other. This is comparable to the 27% identity measured between human OSC and SHC from Alicyclobacillus acidocaldarius, the two members of this protein family with crystal structures [Wendt et al., 1999, Thoma et al., 2004].
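As a rough illustration of what a figure such as 28% identity means, the following minimal sketch computes percent identity from two already-aligned sequences. The function name, the choice to ignore gapped columns, and the toy fragments are assumptions made here for illustration; they are not the sequences or the exact identity definition used in this study, and other definitions use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real SHC or OSC sequences).
toy_a = "MKT-LLAGWD"
toy_b = "MRTALL-GFD"
print(percent_identity(toy_a, toy_b))  # 6 matches over 8 ungapped columns -> 75.0
```

In practice the two full-length proteins would first be aligned with a dedicated alignment program, and the result depends on that alignment.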
A simplified view postulates that hopanoids are produced by bacteria, while sterols are produced by eukaryotes. However, some bacteria actually produce sterols [Jahnke et al., 1992, Pearson et al., 2003, Lamb et al., 2007, Wei et al., 2016] and some eukaryotes produce hopanoids [Mallory et al., 1963, Kemp et al., 1984], suggesting that the traditional interpretation is an oversimplification (see Figure 1A). The phylogenetic distribution is complicated, involving many gains, losses, and horizontal gene transfers (HGT). Hypotheses have been proposed to explain several aspects of sterol biosynthesis, albeit referring to different parts of the phylogenetic history, while other parts remain uncertain:
1. HGT of sterol biosynthesis (of OSC and SQMO) from stem-eukaryotes to some bacteria (referred to as Bacterial Group 1 by [Gold et al., 2017])
2. HGT of sterol biosynthesis (of OSC and SQMO) from stem-SAR group or stem-Archaeplastida to other bacteria (some gammaproteobacteria, referred to as Bacterial Group 2 by [Wei et al., 2016, Gold et al., 2017])
3. HGT of SHC from cyanobacteria to ferns [Takishita et al., 2012]. This reconciled observations of hopanoids in ferns [Pandey and Mitra, 1969, Zander et al., 1969], as well as the presence of both enzymes [Shinozaki et al., 2008].
4. HGT of SHC from proteobacteria (represented only by Anaeromyxobacter) to ascomycete fungi [Takishita et al., 2012].
The latter three of these make sense. They show a restricted distribution (e.g. only in ferns, not in all plants or all Archaeplastida), potentially showing different origins (cyanobacteria vs. proteobacteria), and cannot be adequately explained by vertical inheritance from a putative bacterial ancestor at the origin of the mitochondrion. Even for the transfer from SAR eukaryotes to bacteria, the position in the phylogenetic tree from [Gold et al., 2017] is unambiguous, that is, the tree cannot be rotated in a way to place the bacterial group outside of eukaryotes.
The first hypothesis, however, is weak because it fails to explain the origin of OSC and has an equally parsimonious alternative. Here, HGT is proposed from a stem eukaryote to some bacteria, though as the origin is unknown, the direction of exchange could go either way and would be equally parsimonious, from bacteria to eukaryote, or vice versa. In this case, the split between eukaryotic and bacterial OSCs is one node from the root of the tree, so it could be explained in either direction. Given that current endosymbiotic theory proposes a major transfer of genes from a bacterium to an "early" eukaryote, that is, the origin of eukaryotes via endosymbiosis of the mitochondrion, it would make the most sense for OSC to be one of the thousands of genes acquired this way from a bacterium. All else being equal, the presence of OSC in some bacteria and stem eukaryotes is nonetheless best explained by primary inheritance of OSC by a pre-eukaryotic host from a bacterial endosymbiont.
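The statement that either direction of transfer is equally parsimonious can be illustrated with a minimal sketch of the bottom-up pass of Fitch parsimony. The tree shape, tip names, and the two-state character (whether OSC is carried by a bacterium or a eukaryote) are hypothetical simplifications chosen here, not the tree inferred in this study.

```python
def fitch(tree, tip_state):
    """Bottom-up Fitch pass on a binary tree written as nested 2-tuples.

    Returns (set of possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):  # a tip: its state is observed
        return {tip_state[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, tip_state)
    right_set, right_changes = fitch(right, tip_state)
    shared = left_set & right_set
    if shared:  # children agree: no extra change needed
        return shared, left_changes + right_changes
    # children disagree: one change, and the ancestral state is ambiguous
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical tips; "B" = bacterial carrier, "E" = eukaryotic carrier.
tip_state = {"bact_1": "B", "bact_2": "B", "euk_1": "E", "euk_2": "E", "bact_out": "B"}

two_clades = (("bact_1", "bact_2"), ("euk_1", "euk_2"))
root_states, changes = fitch(two_clades, tip_state)
print(sorted(root_states), changes)  # ['B', 'E'] 1

with_outgroup = ("bact_out", two_clades)
root_states_out, changes_out = fitch(with_outgroup, tip_state)
print(sorted(root_states_out), changes_out)  # ['B'] 1
```

With only two sister clades next to the root, both ancestral states cost a single change, so the tree alone cannot give the direction. A known outgroup would polarize the change, and that is exactly the information that is missing for OSC.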
Results and discussion
Additional species still recover monophyletic groups for both eukaryotic STC and OSC
Beginning from the dataset used by [Takishita et al., 2012], 405 additional sequences were added. Phylogenetic analysis (see methods) broadly recovers the same topology (Figure 1), though the question of where to place the root became increasingly important for subsequent interpretations. The backbone of the tree is poorly supported at most nodes (as it was for [Takishita et al., 2012]), making it difficult to decide on a root or confidently infer anything from the arrangement of groups.
If the two enzymes are homologs, that is, they share a common ancestor and so belong in the same tree at all, then where should the root of the tree be placed? For SHC and OSC, the root is commonly placed between the two enzymatic groups based on activity [Desmond and Gribaldo, 2009, Frickey and Kannenberg, 2009, Takishita et al., 2012, Gold et al., 2017]. Even if unintended by the authors, this implies a parallel origin of the two enzymes relative to the unknown outgroup. It is not clear what that outgroup would be. Clearly, the tree should not be rooted at this position, or at least should not be drawn this way. The programs that generate phylogenetic trees typically use time-reversible substitution models and so produce an unrooted tree, where the branching and length (broadly speaking) display the weighted number of substitutions to convert one sequence into another. Walking from one end of the tree to the other is implicitly a walk back in time to the root, then forward again to another leaf. The user of the program then decides after the fact where to place the root, based on known information about the species or genes in question. Given this arbitrary decision, one is often left with the impression that the true divide in this enzyme family is between SHC and OSC, though this reflects functional differences rather than evolutionary history.
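The effect of this after-the-fact rooting decision can be shown with a small sketch: the same unrooted tree is drawn as two different rooted topologies, depending only on which branch receives the root. The node names and the five-branch toy tree are hypothetical and stand in for the real SHC and OSC groups; the helper functions are written here for illustration and are not from any phylogenetics package.

```python
def build_adjacency(edges):
    """Store an unrooted tree as an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def subtree(adj, node, parent):
    """Newick-style topology of the part of the tree seen from `parent` through `node`."""
    children = sorted(n for n in adj[node] if n != parent)
    if not children:
        return node
    return "(" + ",".join(subtree(adj, child, node) for child in children) + ")"


def root_on_edge(adj, u, v):
    """Place the root on the branch u-v and return the rooted topology."""
    return "(" + subtree(adj, u, v) + "," + subtree(adj, v, u) + ");"


# Hypothetical unrooted tree: X and Y are internal nodes, the rest are tips.
edges = [
    ("X", "SHC_bact1"),
    ("X", "SHC_bact2"),
    ("X", "Y"),
    ("Y", "OSC_bact"),
    ("Y", "OSC_euk"),
]
adj = build_adjacency(edges)

# Root between the two enzyme groups (the commonly drawn position).
print(root_on_edge(adj, "X", "Y"))
# -> ((SHC_bact1,SHC_bact2),(OSC_bact,OSC_euk));

# Root on a branch inside the bacterial SHC sequences instead.
print(root_on_edge(adj, "SHC_bact1", "X"))
# -> (SHC_bact1,(SHC_bact2,(OSC_bact,OSC_euk)));
```

The underlying branches are identical in both outputs. Only the second drawing shows OSC nested within SHC diversity, which is the reading argued for in this study.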

How did sterol biosynthesis evolve?
If the root is not between SHC and OSC, then where should it go? If bacteria and hopanoids appeared first on this planet, then it is very likely that the root should be set somewhere within bacteria. Rooting is done after the fact and requires knowing the origin of the group, and the exact position within bacteria is currently unknown. This means that the root of the tree is not the split of SHC and OSC, but somewhere among the bacterial lineages.
Starting from this root, the evolutionary history may be more adequately described as a duplication followed by many losses:
1. SHC was the original enzyme, distributed across many bacterial lineages
2. OSC arose from a gene duplication in some proteobacterial ancestor
3. Eukaryotes inherited both SHC and OSC from a bacterial endosymbiont, and would have been capable of producing both hopanoids and sterols
4. Either SHC or OSC was subsequently lost in all eukaryotes, leaving some anaerobic eukaryotes with only SHC and most eukaryotes with only OSC
5. HGT events occurred in both directions, resulting in a gain of SHC for ferns and ascomycete fungi from two different bacterial groups, and a gain of OSC for some gammaproteobacteria from a stem-SAR eukaryote
6. Secondary losses of OSC occur widely, as some organisms scavenge or acquire the molecules from their diets
These steps are detailed below.

SHC was the original enzyme
Assuming that bacteria existed before the origin of eukaryotes, it therefore makes the most sense to assume that SHC existed first.
... possessed this enzyme due to horizontal gene transfer between prokaryotic lineages. Nonetheless, clearly some ancient bacteria could produce hopanoids from squalene with SHC.
Figure 1: (A) ... [Takishita et al., 2012], showing the commonly-drawn root position. This portrays the two proteins as having parallel origins, from an unknown source, hence fails to explain the origins of either SHC or OSC. (B) If bacteria are the true root (currently unknown where within bacteria), then a parsimonious history emerges. The tree from this study shows that SHC was the original enzyme and (1) a duplication within an ancient bacterial lineage produced OSC from SHC. This explains how some, but not all, bacterial lineages may have OSC. (2) Multiple lineages of anaerobic eukaryotes still retain SHC/STC showing a single origin, which is sister group to bacteria. This is explained by vertical inheritance from bacteria at the origin of eukaryotes (probably endosymbiosis). (3) The original eukaryote inherited OSC from the endosymbiont, and the bulk of eukaryotes retain this enzyme. (4) Two additional horizontal gene transfer events gave SHC back to two separate eukaryotic lineages from two different bacterial groups.

A gene duplication produced OSC from SHC
Next, a duplication of the SHC gene produced two copies in some ancient bacterial lineage (it is also currently unclear which lineage this might be). Initially, these two copies would produce the same products (diploptene) from the same substrate (squalene). As these two copies diverged in sequence, the preference for substrate changed and so did the reaction mechanism and products, thus one of the two became OSC and allowed for production of sterols. It is very likely that this was coincident with the origin of SQMO. Some other enzymes may have originated later, like the C14-demethylase (called ERG11 in yeast, CYP51A1 in mammals), as these enzymes are absent in some bacteria that produce sterols [Pearson et al., 2003].
Who were these bacteria? This is still a mystery, though some things can be learned from the presence of both genes in some species, and from the tree itself. Several bacteria have been shown to possess both genes in their genomes, including the planctomycete Gemmata obscuriglobus, the gamma-proteobacterium Methylococcus capsulatus, and the delta-proteobacteria Plesiocystis pacifica and Enhygromyxa salina. The results of the phylogenetic inference place the SHC from Gemmata obscuriglobus firmly within other planctomycetes, but not so for the OSC, which is sandwiched between actinobacteria and proteobacteria. Thus, the presence of OSC in Gemmata obscuriglobus may represent a more recent horizontal transfer, rather than inheritance from the original duplication. The duplication itself may therefore be better considered to have originated somewhere in the proteobacteria.

Eukaryotes inherited both enzymes, probably at the same time
The core eukaryotic group of SHC/STC (those not from HGT) includes representatives from only three eukaryotic lineages yet still shows a single origin; the same goes for eukaryotic OSC. That is, despite the relative paucity of SHC in eukaryotes, the groups that still contain the enzyme include opisthokonts, ciliates, and some excavata. If these three lineages are mapped onto current models of the species tree of eukaryotes (see [Derelle et al., 2015, Brown et al., 2018]), vertical inheritance of SHC in those three lineages would parsimoniously require that the last eukaryotic common ancestor had SHC (Figure 2). Even some alternate arrangements [Derelle et al., 2015, Ren et al., 2016] would require that effectively all eukaryotic groups originally had the enzyme.
What is the alternative? It was proposed that multiple HGT events to eukaryotes allowed survival in oxygen-depleted environments [Takishita et al., 2012]. HGT is seen in other parts of this tree, as discussed above and below. The pattern seen for eukaryotic STC would therefore require three HGT events from the same bacterial group, close enough in time that the inferred phylogenetic tree would give them a single, strongly-supported origin. That is, three HGT events would have had to occur and give the illusion of a single origin by our current phylogenetic models. This is extremely unlikely. Firstly, HGT should only be invoked when it is clear that there is no possibility of primary inheritance in the LECA from a bacterial endosymbiont at the origin of mitochondria. The broad distribution of eukaryotic groups with a strongly-supported single origin suggests the presence of SHC/STC in the LECA, thus vertical inheritance cannot be ruled out.
Secondly, prokaryote-eukaryote HGT is not common enough for three parallel HGTs to make sense, nor is there a satisfying explanation as to why these lineages would all take a nearly identical enzyme from the same bacterial source, especially as that clearly was not the case for ferns or fungi. Even if HGT were still invoked only once to a stem eukaryote, this would parsimoniously be equal to primary inheritance in the LECA from a bacterial endosymbiont.
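The two competing scenarios can be compared by counting events on a species tree. The sketch below is a simplified illustration under stated assumptions: the nested-tuple tree, the lineage names, and the presence set are all hypothetical and do not reproduce the topology of [Brown et al., 2018] or the counts discussed in this paper. Under "present in the ancestor, losses only", each maximal clade lacking the gene counts as one loss; under "absent in the ancestor, gains only", each maximal clade with the gene counts as one HGT.

```python
def tips(tree):
    """List the tip names of a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [tip for child in tree for tip in tips(child)]


def count_maximal_clades(tree, has_gene, want_present):
    """Count maximal clades in which every tip has the requested state."""
    if all((tip in has_gene) == want_present for tip in tips(tree)):
        return 1  # the whole clade is uniform: a single event explains it
    if isinstance(tree, str):
        return 0  # a single tip in the other state needs no event
    return sum(count_maximal_clades(child, has_gene, want_present) for child in tree)


# Hypothetical species tree and gene distribution (not real data).
toy_tree = (("lin_A", "lin_B"), (("lin_C", "lin_D"), ("lin_E", ("lin_F", "lin_G"))))
has_gene = {"lin_A", "lin_D", "lin_F"}

losses_if_ancestral = count_maximal_clades(toy_tree, has_gene, want_present=False)
gains_if_transferred = count_maximal_clades(toy_tree, has_gene, want_present=True)
print(losses_if_ancestral, gains_if_transferred)  # 4 3
```

Raw event counts of this kind do not settle the question by themselves, since a loss and a prokaryote-to-eukaryote transfer are not equally likely events. The argument in the text rests additionally on the single, strongly-supported origin of the eukaryotic sequences in the gene tree.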

Eukaryotes lost either SHC or OSC
According to the most parsimonious view of the tree (Figure 1B), the ancestor of all eukaryotes then must have had both SHC and OSC. Yet, in a bizarre turn of events, either one or the other gene was later lost in all lineages, with two subsequent exceptions (discussed below). In other words, ignoring the HGT events, no known eukaryote today possesses both enzymes. The lineages with SHC are precisely those that are lacking OSC, and vice versa. This suggests that a "choice" was made in each lineage to keep either SHC/STC or OSC, but never both.
Figure 2: Recreated tree from [Brown et al., 2018], showing major clades of eukaryotes. The observed pattern of presence and absence is nonetheless best explained by a single origin of SHC/STC, retention by 3 lineages (4 groups), and 5 or more losses (indicated by gray 'X's). Even alternate topologies [Derelle et al., 2015, Ren et al., 2016] would still require that the LCA of most or all eukaryotic groups originally had SHC/STC, mostly due to the presence of STC in ciliates and Neocallimastigomycota fungi. Otherwise, three HGT events would need to occur at the red triangles. SAR group refers to the clade of Stramenopiles, Alveolates, and Rhizaria. CRuMs group refers to Collodictyonids, Rigifilids, and Mantamonas.
Two questions may be considered: why was one of the two enzymes lost in all eukaryotic lineages, and why did most lineages retain OSC?
1. Given the high similarity of the compounds produced, plausibly some enzyme products could interfere with the production of others. This precise effect was observed in the ciliate Tetrahymena pyriformis, where cholesterol was shown to inhibit production of tetrahymanol from labeled squalene [Conner et al., 1968], possibly by affecting gene expression, while other steroid compounds were shown to inhibit the cyclization in vitro [Sipe and Holmlund, 1972], likely by directly inhibiting the enzyme. Thus, competitive inhibition of one enzyme by products of the other may have substantially reduced the utility of having both enzymes.
Lanosterol is heavily processed to produce a large variety of different sterols across many different groups [Nes, 2011]. The planctomycete Gemmata obscuriglobus was shown to produce only lanosterol [Pearson et al., 2003], as it lacks enzymes found in many eukaryotes for demethylation and other functions. Thus, it may be that the disadvantage of having both enzymes is due to subsequent processing of one or both, which may make products that interfere with the functioning of the other.
2. If SHC and OSC were truly equivalent, and the losses were random, then one would expect to find more eukaryotes with STC. For instance, the ciliate Tetrahymena normally produces the hopanoid tetrahymanol, but can substitute cholesterol if available [Conner et al., 1968]. Of the eukaryotic groups that kept SHC, many are obligate anaerobes, such as the fungus Piromyces. However, the vast majority of known eukaryotes have a gene for OSC and not SHC. Is there some advantage of OSC, or of sterols, over SHC and hopanoids? One advantage could be intrinsic properties of the molecules themselves. It was shown that membranes with cholesterol had substantially improved viscosity compared to lanosterol, or no sterol at all (reviewed by [Bloch, 1983]). These studies addressed sterols, but did not consider hopanoids. However, [Saenz et al., 2012] reported that the hopanoid diplopterol is functionally similar to cholesterol in terms of membrane-ordering effects, based on multiple experiments.
The second advantage may relate to the abundance of oxygen. If environmental oxygen were low, or inconsistently available, then having an oxygen-independent substitute may prove to be useful. However, if environmental oxygen levels were consistently high, such a substitute may never be needed, thus favoring the loss of SHC over OSC.

Some lineages gained OSC or SHC by horizontal gene transfer
The only modern eukaryotes with both enzymes (some ascomycete fungi [Hayashi et al., 1996, Isaka et al., 2011] and ferns [Pandey and Mitra, 1969, Zander et al., 1969]) have lost the original STC-type and then re-acquired SHC by horizontal gene transfer from two different bacterial groups (label (4) in Figure 1B). Fungi have acquired SHC from an ancestor of Anaeromyxobacter, a delta-proteobacterium. On the other hand, ferns have acquired SHC from a cyanobacterial ancestor. Considering this, it should be noted that no modern eukaryote can serve as a "living fossil" proxy for early eukaryotes.
Nonetheless, for the fungi, this may confer some advantage in anaerobic conditions. Yeast, for instance, can grow under anaerobic conditions if ergosterol is supplied to the medium [Andreasen and Stier, 1953], suggesting that the only metabolic limitation for yeast under anoxia is the production of sterols. For ferns, however, it seems more likely that SHC would have an ecological role, rather than physiological. Many flowering plants have multiple paralogs of OSC, named by their different activities: cycloartenol synthase (CAS) and beta-amyrin synthase (BAS). The HGT of SHC to ferns may simply be a way of adding to the diversity of compounds produced, or the products could serve as toxins against microbial pathogens or herbivorous insects.
Some bacteria appear to have genes for both SHC and OSC in their genomes, such as the planctomycete Gemmata obscuriglobus, or the gammaproteobacterium Methylococcus capsulatus. Two recent studies [Gudde et al., 2019, Rivas-Marin et al., 2019] have demonstrated that knocking out either OSC or SQMO in G. obscuriglobus results in misshapen membrane structures and an overall loss of viability, which could be rescued by addition of lanosterol to the medium. At first glance, it would appear that this would counter the theory that SHC could serve as a replacement for OSC. However, the G. obscuriglobus SHC is around 100 amino acids shorter than the nearest planctomycete neighbors, so it is not clear that this enzyme is functional at all. This may therefore explain the observed requirement for sterols [Gudde et al., 2019, Rivas-Marin et al., 2019], as there is no chance of a substitute.
The gammaproteobacterium Methylococcus capsulatus, on the other hand, appears to be making both hopanoids and sterols under normal laboratory conditions [Jahnke et al., 1992]. These sterols, however, are still methylated at the C4 [Jahnke et al., 1992], showing that this bacterium possesses enzymes to demethylate at the C14 position, but ineffectively at the C4 position. When compared to yeast, the total pathway would therefore be considered incomplete.

Many other lineages secondarily lost OSC
Considering the animal kingdom, OSC is clearly found in the genomes of species of porifera, placozoans, echinoderms, molluscs, chordates, hemichordate worms, nemertean worms, annelid worms, brachiopods, phoronids, and tardigrades. However, sterol biosynthesis is reported to be absent in arthropods [Zandee, 1964], nematodes [Chitwood, 1999], and platyhelminths [Meyer et al., 1970]. Even within molluscs, sterol biosynthesis appears to be absent in some bivalves [Walton and Pennock, 1972] and squid. Neither SHC nor OSC is found in the genomes of cnidarians (corals and jellyfish) or ctenophores (comb jellies), though sterols were present in lipids of several cnidarians and ctenophores [Nelson et al., 2000], suggesting that dietary acquisition is sufficient.
A number of microbial eukaryotes appear to lack both OSC and STC [Takishita et al., 2017]. Much like the above schemes, "dietary" acquisition may fulfill this need, or the function may be dispensable entirely. This could also be a reflection of the overall cost of the pathway. As many animal groups have lost OSC, it is conceivable that many single-celled eukaryotes could have lost OSC as well, and also STC. This further argues for the repeated losses seen in Figure 2.

Where was all the oxygen?
The enzyme OSC is widely assumed to be a feature of the LECA [Desmond and Gribaldo, 2009, Gold et al., 2017], which is the clear and obvious interpretation of the tree. Adding SHC/STC as a feature of the LECA indicates that both enzymes were ancestral in early eukaryotes. This further suggests that early eukaryotes could have made both hopanoids and sterols.
Contrary to previous hypotheses, this additionally raises the intriguing possibility that early eukaryotes were facultative aerobes capable of living in low-oxygen environments, rather than obligate aerobes living in fully-oxygenated environments [Desmond and Gribaldo, 2009]. Much like the requirement of sterols for anaerobic growth in yeast [Andreasen and Stier, 1953], it is possible that early eukaryotes may have survived in low-oxygen conditions, and only produced sterols when oxygen was available. These conditions are uncertain. Experiments in yeast suggest that the oxidation of squalene could occur at oxygen levels as low as 7 nM [Waldbauer et al., 2011], so very low atmospheric levels of oxygen may have been sufficient.
This means that many rocks that contain only hopanes could still derive from ecosystems with abundant eukaryotes, and that those eukaryotes could have made hopanoids in parallel to the bacteria. The clearest case is that of ciliates [Venkatesan, 1989, Harvey and Mcmanus, 1991], where tetrahymanol is found in modern sediments, and its product, gammacerane, is found in many ancient sediments or rocks. This would also reconcile other evidence from fossils about the timing of the origin of eukaryotes. Many fossils have been found from over 1 billion years ago that appear to be eukaryotes [Butterfield, 2000, Cohen and Macdonald, 2015, Bengtson et al., 2017], yet evidence from the biomarkers would suggest that the environment was dominated by bacteria [Brocks et al., 2017], and eukaryotes were rare or absent. This would therefore validate the interpretation of these fossils, arguing that they are bona fide eukaryotes, but were living in low-oxygen environments where only hopanoid synthesis was possible.
A stark contrast is then found in rocks from 600-700 Ma, where steranes are found at appreciable quantities, suggesting that eukaryotic organisms then became globally abundant until modern day [Brocks et al., 2017]. Rather than an ecological shift, this potentially could also be indicative of a global change in atmospheric oxygen.

Conclusions
The origin of STC in eukaryotes has not received as much attention as other parts of the history of this enzyme family. The presence of STC is best explained by primary inheritance in the LECA, and repeated losses. From this, the LECA could have been able to produce both sterols and hopanoids, potentially making it viable in both high and low-oxygen environments. Much of this is still uncertain, and several things may help resolve the biology and the implications for the geology. STC enzymes do not appear to be well characterized outside of ciliates. Potential targets include Sawyeria marylandensis, Stygiella incarcerata, or Paratrimastix pyriformis. These species may be most informative as to the chemistry of the putative original SHC/STC enzyme in eukaryotes, and when and how STC activity evolved in this branch. Tetrahymanol is predicted to transform into gammacerane by diagenesis. A thorough review of the record of this molecule in rocks would need to be conducted. Clarifying these things may improve our understanding of the capabilities and environment of the earliest eukaryotic organisms on our planet.