ARTICLE | doi:10.20944/preprints202003.0097.v1
Subject: Life Sciences, Microbiology Keywords: base modification; methyltransferases; flavoenzymes; tRNA; rRNA; mycoplasmas; spiroplasmas; acholeplasmas; evolution; minimal cell; moonlighting function
Online: 6 March 2020 (02:28:12 CET)
The C5-methylation of uracil to form 5-methyluracil (m5U) is a ubiquitous base modification of nucleic acids. Four enzyme families have converged to catalyze this methylation using different chemical solutions. Here, we investigate the evolution of 5-methyluracil synthase families in Mollicutes, a class of bacteria that has undergone extensive genome erosion. Many mollicutes have lost some of the m5U methyltransferases present in their common ancestor. Cases of duplication and subsequent shift of function are also described. For example, most members of the Spiroplasma subgroup use the ancestral tetrahydrofolate-dependent TrmFO enzyme to catalyze the formation of m5U54 in tRNA, while a TrmFO paralog (termed RlmFO) is responsible for m5U1939 formation in 23S rRNA. RlmFO has replaced the S-adenosyl-L-methionine (SAM)-dependent enzyme RlmD, which added the same modification in the ancestor and is still present in mollicutes from the Hominis subgroup. Another paralog of this family, the TrmFO-like protein, has an as yet unidentified function that differs from those of the TrmFO and RlmFO homologs. Despite having evolved towards minimal genomes, the mollicutes possess a repertoire of m5U-modifying enzymes that is highly dynamic and has undergone horizontal transfer. This emphasizes the need to combine bioinformatics predictions with empirical testing and structural information to obtain a reliable functional annotation of these enzymes.
ARTICLE | doi:10.20944/preprints201607.0096.v1
Subject: Life Sciences, Other Keywords: number of paralogs; comparative genomics; combinatorial optimization; Mycoplasmas; Halophiles; Orientia; Mycobacterium leprae; genome size
Online: 29 July 2016 (16:24:29 CEST)
The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes formed by gene duplication. Earlier work on a limited number of completely sequenced and annotated genomes found that the size of a gene family and the size of the genome are positively correlated. It also found that several atypical microbes deviated from this general trend. In this study, we reexamined these associations on a larger dataset of 1484 prokaryotic genomes using several ranking approaches. We applied the ranking methods so that genomes with a lower number of paralogs receive a lower rank. Until now, only simple ranking methods had been used; we applied the Kemeny optimal aggregation approach as well. Regression and correlation analyses were used to quantify and characterize the relationships between the paralog indices and genome size. In addition, boxplot analysis was employed for outlier detection. We found that, in general, all paralog indices correlate positively with genome size. As expected, different groups of atypical prokaryotic genomes were found for different types of paralog quantities.
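To illustrate what Kemeny optimal aggregation computes, here is a minimal sketch in Python. It assumes each paralog index yields one ranking of the same genomes, ordered from fewest to most paralogs, and it finds the consensus ranking that minimizes the total Kendall tau distance to all input rankings. The genome labels `g1` to `g4`, the function names, and the brute-force search are illustrative assumptions, not the authors' implementation; exhaustive search is only feasible for a handful of items, so a dataset of 1484 genomes would need a combinatorial optimization solver or a heuristic instead.

```python
from itertools import combinations, permutations


def kendall_tau_distance(rank_a, rank_b):
    """Count the item pairs that the two rankings place in opposite order."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_aggregate(rankings):
    """Return a ranking with minimal total Kendall tau distance to the inputs.

    Brute force over all permutations: O(n!) candidates, toy sizes only.
    """
    items = sorted(rankings[0])
    return min(
        (list(candidate) for candidate in permutations(items)),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


# Hypothetical example: three paralog indices each rank four genomes,
# from the genome with the fewest paralogs to the one with the most.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g3", "g4"],
]
consensus = kemeny_aggregate(rankings)
print(consensus)
```

In this toy input the consensus follows the pairwise majority preference for every pair of genomes. When several rankings tie for the minimum, this sketch returns whichever one comes first in permutation order.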