Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Defining the Rhizobium leguminosarum Species Complex

Version 1 : Received: 11 December 2020 / Approved: 12 December 2020 / Online: 12 December 2020 (11:51:22 CET)

How to cite: Young, J.P.W.; Moeskjær, S.; Afonin, A.; Rahi, P.; Maluk, M.; James, E.K.; Cavassim, M.I.A.; Rashid, M.H.; Aserse, A.A.; Perry, B.J.; Wang, E.T.; Velázquez, E.; Andronov, E.E.; Tampakaki, A.; Flores Félix, J.D.; Rivas González, R.; Youseif, S.H.; Lepetit, M.; Boivin, S.; Jorrin, B.; Kenicer, G.J.; Peix, Á.; Hynes, M.F.; Ramírez-Bahena, M.H.; Gulati, A.; Tian, C. Defining the Rhizobium leguminosarum Species Complex. Preprints 2020, 2020120297 (doi: 10.20944/preprints202012.0297.v1).

Abstract

Bacteria currently included in Rhizobium leguminosarum are too diverse to be considered a single species, so we can refer to this as a species complex (the Rlc). We have found 429 publicly available genome sequences that fall within the Rlc and these show that the Rlc is a distinct entity, well separated from other species in the genus. Its sister taxon is R. anhuiense. We constructed a phylogeny based on concatenated sequences of 120 universal (core) genes, and calculated pairwise average nucleotide identity (ANI) between all genomes. From these analyses, we concluded that the Rlc includes 18 distinct genospecies, plus 7 unique strains that are not placed in these genospecies. Each genospecies is separated by a distinct gap in ANI values, usually at around 96% ANI, implying that it is a 'natural' unit. Five of the genospecies include the type strains of named species: R. laguerreae, R. sophorae, R. ruizarguesonis, "R. indicum" and R. leguminosarum itself. The 16S ribosomal RNA sequence is remarkably diverse within the Rlc, but does not distinguish the genospecies. Partial sequences of housekeeping genes, which have frequently been used to characterise isolate collections, can mostly be assigned unambiguously to a genospecies, but alleles within a genospecies do not always form a clade, so single genes are not a reliable guide to the true phylogeny of the strains. We conclude that access to a large number of genome sequences is a powerful tool for characterising the diversity of bacteria, and that taxonomic conclusions should be based on all available genome sequences, not just those of type strains.
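
As a rough illustration of how a gap in pairwise ANI values can delimit groups, the sketch below clusters genomes by single linkage at a 96% ANI threshold. It is only a minimal example under stated assumptions, not the procedure used in the study, which combined a core-gene phylogeny with ANI. The function name `cluster_by_ani`, the genome labels and the ANI values are all hypothetical.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=96.0):
    """Group genomes by single linkage: two genomes share a group if a chain
    of pairwise ANI values >= threshold connects them.

    `ani` maps unordered genome pairs (as tuples) to percent identity.
    Illustrative helper only; not taken from the paper.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    # Largest groups first, ties broken alphabetically.
    return sorted((sorted(m) for m in groups.values()),
                  key=lambda m: (-len(m), m))


# Hypothetical toy data: two tight pairs and one unplaced strain.
genomes = ["s1", "s2", "s3", "s4", "s5"]
ani = {
    ("s1", "s2"): 97.8, ("s3", "s4"): 98.2,
    ("s1", "s3"): 94.1, ("s1", "s4"): 94.0,
    ("s2", "s3"): 94.3, ("s2", "s4"): 93.9,
    ("s1", "s5"): 92.5, ("s2", "s5"): 92.7,
    ("s3", "s5"): 93.0, ("s4", "s5"): 92.8,
}
clusters = cluster_by_ani(genomes, ani)
print(clusters)
```

With these toy values the two pairs above 96% form separate groups and `s5` remains a singleton, loosely analogous to the unique strains not placed in any genospecies. Single linkage is a simplification: it can chain together genomes that are not all mutually above the threshold, so real data would need the ANI gap checked directly.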

Subject Areas

Rhizobium; species complex; bacterial taxonomy; core genes; housekeeping genes; average nucleotide identity; speciation; genospecies
