Preprint
Article

This version is not peer-reviewed.

Molecular Markers Specific for the Pseudomonadaceae Genera Provide Novel and Reliable Means for the Identification of Other Pseudomonas Strains/spp. Related to These Genera

A peer-reviewed article of this preprint also exists.

Submitted: 31 December 2024

Posted: 03 January 2025


Abstract
Conserved Signature Indels (CSIs) in protein sequences that are specific for species from different genera show strong predictive potential for being found in other members of these genera. For several recently described Pseudomonadaceae genera (viz. Aquipseudomonas, Atopomonas, Caenipseudomonas, Chryseomonas, Ectopseudomonas, Geopseudomonas, Halopseudomonas, Metapseudomonas, Phytopseudomonas, Serpens, Stutzerimonas, Thiopseudomonas, and Zestomonas), multiple taxon-specific CSIs have been identified. This study examines the potential applications of these CSIs for identifying unclassified Pseudomonas spp. (strains) related to these genera. This was done using the AppIndels.com server, which uses information on the presence of known taxon-specific CSIs in a genome sequence to predict its taxonomic affiliation. For these studies, sequence information for different CSIs specific for the Pseudomonadaceae genera, and specific for P. aeruginosa and P. paraeruginosa, was added to the server’s database. The server was then used to analyze the genomes of 1972 Pseudomonas spp. (strains/isolates) of unknown taxonomic affiliation. Based on the presence of a significant number of taxon-specific CSIs in the analyzed genomes, the server made taxonomic predictions for 299 of them. These genomes were assigned to the following clades/genera: Pseudomonas sensu stricto clade (46 strains), Ectopseudomonas (46 strains), Chryseomonas (32 strains), Stutzerimonas (31 strains), Metapseudomonas (22 strains), Aquipseudomonas (21 strains), Phytopseudomonas (17 strains), Halopseudomonas (9 strains), Geopseudomonas (4 strains), Thiopseudomonas (3 strains), Serpens (2 strains), Caenipseudomonas (1 strain) and Zestomonas (1 strain); the remaining 64 genomes were identified as P. aeruginosa. Phylogenetic studies performed here show that the taxonomic predictions made by the server were 100% accurate. Thus, based upon genome sequence information, the identified taxon-specific CSIs provide a novel and useful means for identifying other species/strains affiliated with the specific genera. The results of phylogenetic studies also suggest that many unclassified Pseudomonas strains whose taxonomic affiliations were predicted would likely constitute novel species in the indicated genera.

1. Introduction

The family Pseudomonadaceae harbors several genera, of which Pseudomonas is one of the largest and earliest known prokaryotic genera [1,2]. The genus Pseudomonas encompasses >300 species, representing more than two-thirds of the Pseudomonadaceae species. Extensive earlier work on Pseudomonas species, using phylogenetic trees constructed from multiple different sets of genes/proteins including core genomic proteins, has reliably established that the species from this genus do not form a monophyletic lineage. In phylogenetic trees, Pseudomonas species generally form three main groupings or lineages, referred to as the Pertucinogena, the Aeruginosa, and the Fluorescens lineages [3,4,5,6,7,8,9,10]. Additionally, species from both the Aeruginosa and Fluorescens lineages form multiple distinct genus-level clades, which are not specifically (i.e., evolutionarily) related to each other [5,9,11]. Species from other genera, including Azomonas, Azotobacter, Chryseomonas, Entomomonas and Thiopseudomonas, branch in between these clades/lineages, demonstrating the polyphyletic nature of Pseudomonas species [3,4,5,6,7]. It is now generally accepted that, in accordance with the International Code of Nomenclature of Prokaryotes [12], of the observed Pseudomonas species clades, only the clade referred to as the “Aeruginosa clade”, which contains the type species of the genus, P. aeruginosa, should be recognized as corresponding to the genus Pseudomonas [11].
With the availability of genome sequences, extensive work has been carried out in the past few years to clarify the evolutionary relationships and classification of Pseudomonas species using multiple genome sequence-based approaches. The approaches used include the construction of phylogenetic trees based upon different large datasets of core genomic proteins [4,5,6,7,11,13,14,15], and the assessment of overall relatedness of species from different clades based on genomic similarity metrics such as average nucleotide identity (ANIb) [4,14], average amino acid identity (AAI) [4,11] and percentage of conserved proteins (POCP) [4,11,15]. In addition, analyses of genome sequences have also proven instrumental in the identification of highly specific molecular markers, such as conserved signature indels (CSIs) in genes/proteins, which are uniquely shared characteristics of species from different clades and afford unambiguous means for both distinguishing and demarcating different specific clades [5,11,16,17,18,19,20]. Based upon the consistent evidence acquired using different genomic approaches, most of the Pseudomonas species from the Pertucinogena and Aeruginosa lineages have now been reclassified into several novel genera (viz. Aquipseudomonas, Atopomonas, Caenipseudomonas, Ectopseudomonas, Geopseudomonas, Halopseudomonas, Metapseudomonas, Phytopseudomonas, Stutzerimonas, and Zestomonas) [4,5,6,11], and some preexisting genera (Chryseomonas, Paraburkholderia, Serpens, Stenotrophomonas, Thiopseudomonas and Xanthomonas) [21,22]. Importantly, these studies have led to the identification of multiple highly specific molecular markers (i.e., CSIs) that are uniquely shared characteristics of the species from the genera noted above. Additionally, several molecular markers have also been identified that are exclusively found in the species from the genera Pseudomonas sensu stricto, Azotobacter, and Azomonas, and in the species P. aeruginosa.
Pseudomonas-related species are present in diverse niches and environments, including soil, water, plants and animal tissues [10,23], and the type species of the genus, P. aeruginosa, is an important human pathogen [24,25]. Species related to this genus are therefore the subject of extensive study, and novel species and strains related to it are continually being discovered at a rapid pace [26]. Since 2022 alone, more than 100 novel species related to Pseudomonas have been listed in the List of Prokaryotic Names with Standing in Nomenclature (LPSN) server [26]. In addition to the species with validly published names, the NCBI server holds genome sequences for >2000 uncharacterized Pseudomonas spp. (strains or isolates). Several of these uncharacterized strains/isolates will likely be identified as novel species, but no information is available at present regarding their taxonomic affiliation. In our earlier work on Bacillus-related and other genera, we have provided convincing evidence that the CSIs specific for different genera exhibit a high degree of predictive ability to be found in other members of these genera, and that the presence of known taxon-specific CSIs in a genome sequence can be used to predict its taxonomic affiliation. This predictive ability of the CSIs forms the basis of the recently developed AppIndels.com server, which, based upon the presence of known taxon-specific CSIs in a submitted genome sequence, can predict its taxonomic affiliation [27].
In this study, we have used the AppIndels.com server to determine whether based upon the information for the CSIs specific to different Pseudomonadaceae genera, it can predict the phylogenetic/taxonomic affiliations of several of the unclassified Pseudomonas spp. (strains). Results of these studies presented here show that based upon the information for identified Pseudomonadaceae CSIs, the server was able to predict the taxonomic affiliation of 299 unclassified Pseudomonas strains/isolates into 14 Pseudomonadaceae clades/genera. Phylogenetic studies conducted on these strains show that the predictions made by the server regarding the taxonomic affiliations of these 299 strains were 100% accurate. Thus, the identified CSIs specific for the Pseudomonadaceae genera provide a novel and useful means for the identification of other novel or unclassified Pseudomonas species/strains related to these genera.

2. Materials and Methods

2.1. Analysis of Pseudomonas spp. Using the AppIndels Server

Sequence information for the CSIs specific to different Pseudomonadaceae clades/genera was added to the database of the AppIndels server (https://appindels.com/) [27]. Genome sequences for 2000 unclassified strains/isolates of Pseudomonas spp. were downloaded (in .faa format) from the NCBI database. Of these, 28 genomes that either contained <100 Kb of sequence information or were indicated as contaminated were excluded from the analyses. The remaining 1972 genomes were analyzed using the AppIndels server, one at a time, as indicated in earlier work and on the server’s main page. For each submitted sequence, the taxonomic affiliation predicted by the server and the number of CSIs specific for the predicted genus that were identified in it were recorded.
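The exclusion step described above can be illustrated with a minimal sketch. It assumes each downloaded genome has been summarized as a record with a size and a contamination flag; the record fields, the accessions, and the reading of "100 Kb" as 100,000 units of sequence are illustrative assumptions, not the NCBI metadata schema or the procedure actually used.

```python
from dataclasses import dataclass

MIN_SIZE = 100_000  # assumed reading of the "<100 Kb" cutoff stated in the text


@dataclass
class GenomeRecord:
    # Hypothetical fields chosen for illustration only.
    accession: str
    size: int
    contaminated: bool


def keep_for_analysis(rec: GenomeRecord) -> bool:
    """Retain a genome unless it is below the size cutoff or flagged as contaminated."""
    return rec.size >= MIN_SIZE and not rec.contaminated


def filter_genomes(records):
    """Return only the records that pass both exclusion criteria."""
    return [r for r in records if keep_for_analysis(r)]


# Invented example records (not real accessions).
records = [
    GenomeRecord("EXAMPLE_0001", 6_200_000, False),
    GenomeRecord("EXAMPLE_0002", 85_000, False),    # below the size cutoff
    GenomeRecord("EXAMPLE_0003", 5_900_000, True),  # flagged as contaminated
]
kept = filter_genomes(records)
print([r.accession for r in kept])
```

Applied to the 2000 downloaded genomes, a filter of this kind would be expected to leave the 1972 genomes that were submitted to the server.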
A maximum-likelihood phylogenetic tree for the Pseudomonas spp. strains for which taxonomic assignments were made by the server, along with sequences of representative species from different examined Pseudomonadaceae genera, was constructed based on the concatenated sequences for 118 conserved proteins comprising the phyloeco set for the class Gammaproteobacteria [28]. The tree was constructed using an internally developed pipeline, as described in our recent work [5,16,20,29]. The tree was labeled and formatted using MEGA X [30].

3. Results

3.1. Predictive Ability of a CSI Specific for the Genus Halopseudomonas

Earlier work on CSIs specific for multiple prokaryotic taxa provides compelling evidence that these molecular characteristics exhibit a high degree of predictive ability to be found in other species related to a specific taxon. To illustrate, in Figure 1 we show the results for a CSI specific for the genus Halopseudomonas [5]. This genus was created in 2021 by the reclassification of Pseudomonas species which corresponded to the Pertucinogena lineage. More than 20 CSIs specific for the genus Halopseudomonas were identified in this earlier study. The example depicted in Figure 1 shows one of these CSIs, a 2 aa insert in a conserved region of the flagellar protein FlgN, which was present exclusively in all 19 Pseudomonas species that corresponded to the genus Halopseudomonas. Two of these species are listed as “Pseudomonas” because they have not yet been reclassified as Halopseudomonas, as their type strains are not available in two different culture collections. Since the publication of this work, six other species related to Halopseudomonas have been described [4,6,31,32]. Some of these species are at present either not validly published (indicated by their placement within “ ”) or misclassified into the genus Neopseudomonas [6], which is a homotypic synonym of Halopseudomonas [26]. Nonetheless, as shown in Figure 1, the 2 aa CSI specific for Halopseudomonas is commonly and uniquely shared by all six newly described species related to Halopseudomonas, but it is not found in any other Pseudomonadaceae species. In a phylogenetic tree that we have constructed, all species sharing this CSI group reliably within a clade corresponding to the genus Halopseudomonas (Figure S1). These results provide further evidence supporting the predictive ability of taxon-specific CSIs to be found in other species/strains which are related to them.
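As a rough illustration of what "sharing a CSI" means at the sequence level, the sketch below checks whether an aligned sequence carries residues, rather than gap characters, at the alignment columns of a known insert. The toy sequences, the column position, and the function name are invented for illustration; they are not the FlgN sequences of Figure 1, and the check ignores the requirement that the flanking region be conserved.

```python
def has_insert(aligned_seq: str, insert_start: int, insert_len: int, gap: str = "-") -> bool:
    """Return True if every alignment column of the insert holds a residue, not a gap.

    insert_start is a 0-based column index into the aligned sequence.
    """
    segment = aligned_seq[insert_start:insert_start + insert_len]
    return len(segment) == insert_len and gap not in segment


# Invented toy alignment: columns 5-6 represent a 2 aa insert.
with_insert = "MKLTAQGEELLR"     # residues present in the insert columns
without_insert = "MKLTA--EELLR"  # gaps in the insert columns

print(has_insert(with_insert, 5, 2))     # True
print(has_insert(without_insert, 5, 2))  # False
```

In this simplified picture, a taxon-specific CSI is one for which the check returns True for every member of the taxon and False for all species outside it.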

3.2. Examining the Usefulness of the CSIs Specific for the Pseudomonadaceae Genera for Determining the Taxonomic Affiliation of Unclassified Pseudomonas spp. Using the AppIndels.com Server

As indicated earlier, in addition to the genomes for >300 Pseudomonas species with validly published names, the NCBI database also holds genome sequences for >2000 unclassified strains/isolates of Pseudomonas spp. Earlier work by Hesse et al. [7] provides evidence that these unclassified strains encompass enormous genetic diversity which remains to be understood. Thus, it is important to develop novel means or tools by which the genetic diversity and taxonomic affiliation of these unclassified strains could be assessed. In this work, we have investigated whether the identified CSIs specific to several Pseudomonadaceae genera can be used for identifying unclassified Pseudomonas spp./strains that are related to these genera. These analyses were carried out using the AppIndels.com server, which has been specifically created to take advantage of the predictive abilities of the known taxon-specific CSIs to identify other species/strains related to them. The working of the AppIndels.com server, which is described in detail elsewhere [27], is briefly explained below.
The core of the AppIndels.com server is a database of sequence information for diverse previously identified CSIs specific to different (>100) prokaryotic genera. To this database, we have now added sequence information for the CSIs that have been identified for the Pseudomonadaceae genera. Table 1 lists the Pseudomonadaceae genera/taxa for which CSIs have been identified and the number of CSIs specific for each of these genera or taxa. This list includes several CSIs which are specific for the species Pseudomonas aeruginosa and Pseudomonas paraeruginosa. The last column in this Table indicates the weight values given to individual CSIs from different taxa. The rationale for weight assignment to the CSIs is discussed in detail in earlier work [27]. Its main purpose is to increase the specificity of taxon prediction by requiring that multiple CSIs specific for a given taxon be present before a positive identification is made. When a genome sequence is uploaded or submitted to the AppIndels.com server, it conducts BLASTp searches on the submitted genome against the sequences of all CSIs in its database to identify matching sequences in which the indels are present at exactly the same location in the protein sequences as in the CSI database. The server then gathers information regarding the taxon specificities of the different matching CSIs. If the total weight of all identified CSIs specific for a particular taxon exceeds the threshold value of 1.0, the server makes a positive identification that the submitted genome is affiliated to the indicated taxon. As all CSIs specific for the Pseudomonadaceae genera/clades have a weight value of 0.4 or less, the server will make a positive identification for any Pseudomonadaceae genus/clade only when three or more CSIs matching that taxon are found in the submitted genome (two CSIs contribute at most 0.8, whereas three can contribute up to 1.2). As all described CSIs for the Pseudomonadaceae exhibit a high degree of specificity for the indicated taxon (barring an isolated exception) [5,11], finding three CSIs matching a specific genus/taxon in the genome of an unrelated species/strain is highly unlikely.
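The weight-and-threshold rule described above can be sketched as follows. This is an illustrative reconstruction from the description in the text, not the server's actual code: the function names are hypothetical, a uniform weight of 0.4 is assumed for the example CSIs, and a total exactly equal to 1.0 is treated as not exceeding the threshold.

```python
from collections import defaultdict

THRESHOLD = 1.0  # threshold value stated in the text


def predict_taxa(matched_csis, threshold=THRESHOLD):
    """Sum CSI weights per taxon and keep taxa whose total exceeds the threshold.

    matched_csis: iterable of (taxon, weight) pairs, one per CSI found in the genome.
    Returns {taxon: (number_of_csis, total_weight)}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for taxon, weight in matched_csis:
        totals[taxon] += weight
        counts[taxon] += 1
    return {t: (counts[t], totals[t]) for t in totals if totals[t] > threshold}


def report(matched_csis):
    """Return the predicted taxon name(s), or "None" when no taxon passes the threshold."""
    hits = predict_taxa(matched_csis)
    return ", ".join(sorted(hits)) if hits else "None"


# Three matching CSIs at weight 0.4 exceed 1.0; two do not.
three_hits = [("Stutzerimonas", 0.4)] * 3 + [("Halopseudomonas", 0.4)] * 2
two_hits = [("Halopseudomonas", 0.4)] * 2

print(report(three_hits))  # Stutzerimonas
print(report(two_hits))    # None
```

Requiring the summed weight to pass a threshold, rather than accepting any single match, is what guards against a chance match to one CSI in an unrelated genome.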
To test the usefulness of the identified CSIs using the AppIndels server, genome sequences were downloaded for 2000 strains/isolates of Pseudomonas spp. from the NCBI genome database. Of these, 28 genomes in which the sequence consisted of <100 Kb, or which were indicated as contaminated, were not analyzed further. Of the remaining 1972 genomes, 266 were at the chromosome or complete assembly level, 1197 consisted of contigs, and 509 were scaffolds. Some information regarding these genomes, including their strain numbers, accession numbers, assembly stage, G-C content (mol%), and genome sizes, is provided in Supplementary Tables S1–S3. The analyses on these genomes were conducted using the AppIndels server by uploading the sequences of these genomes, one at a time, onto the server. The server checks the uploaded genome sequence for the presence of CSIs matching different taxa in its database. If the server identifies a significant number of CSIs matching any specific taxon, then the result from the server shows a positive match to that taxon. In such cases, the server also provides information regarding the number of CSIs matching the predicted taxon. However, if the submitted genome corresponds to a taxon/genus for which no CSIs are present in the server, or if the total weight of the identified CSIs is less than the threshold value of 1.0, then the server shows a negative “None” result.
In Figure 2, we show the results obtained from the server for two Pseudomonas strains/isolates. The server indicates that the strain ZM24 is related to the Pseudomonas sensu stricto clade, and its genome contained five CSIs specific for this clade (Figure 2A). On the other hand, the server predicted that the genome of strain ABC1 is related to the genus Stutzerimonas, and six CSIs specific for this genus were identified in its genome (Figure 2B). In addition to indicating the numbers of CSIs specific to the predicted taxon, the server also provides sequence information for all matching CSIs, which can be viewed upon clicking the down arrow beside the number of CSIs. In Figure 2, we have also shown sequence information for three CSIs matching the two clades/genera. Sequence information for the other matching CSIs for these two genomes is provided in Figure S2.
Based on the analysis of genome sequences for 1972 examined Pseudomonas strains/isolates, the server made specific predictions regarding taxonomic affiliations of 299 of these genomes to specific Pseudomonadaceae genera. Results from the server for the genomes of all 299 Pseudomonas strains/isolates for which specific predictions were made are shown in Table S4, and a summary of these results is presented in Table 2. In Table 2, we have organized the results from the server for different strains according to their predicted affiliation to Pseudomonadaceae genera. Table 2 also shows the numbers of CSIs (range) specific for the indicated genus, which were identified in the analyzed genomes. As seen from Table 1, in all cases, the predicted affiliation of any genome to specific Pseudomonadaceae genera is based on the shared presence of a minimum of three CSIs specific for that genus. The numbers of CSIs identified for different genera in the analyzed genomes varied from a low of three for the genus Serpens to more than 20 for Halopseudomonas. This variation is solely due to the differences in the number of identified CSIs for different genera [5,11]. Results presented in Table 2 show that of the genomes for which the server made specific predictions, about 20% corresponded to P. aeruginosa. Other Pseudomonadaceae genera/clades to which large numbers of the analyzed strains (genomes) belonged included the Pseudomonas sensu stricto clade (46 strains), Ectopseudomonas (46 strains), Chryseomonas (32 strains), Stutzerimonas (31 strains), Metapseudomonas (22 strains), Aquipseudomonas (21 strains), Phytopseudomonas (17 strains), Halopseudomonas (9 strains) and Geopseudomonas (4 strains). The server also predicted that a limited number of strains are affiliated to the genera Caenipseudomonas, Serpens, Thiopseudomonas, and Zestomonas, which consist of only a few species [11].
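The per-taxon counts can be cross-checked with a short tally. The figures below are the ones reported in the text (the counts for the four smallest genera are taken from the Abstract and Discussion); the sketch assumes that the 46 strains assigned to the Pseudomonas sensu stricto clade do not include the 64 genomes identified as P. aeruginosa, which is the reading under which the counts sum to 299.

```python
# Number of genomes assigned to each taxon, as reported in the text.
predicted = {
    "Pseudomonas aeruginosa": 64,
    "Pseudomonas sensu stricto clade": 46,
    "Ectopseudomonas": 46,
    "Chryseomonas": 32,
    "Stutzerimonas": 31,
    "Metapseudomonas": 22,
    "Aquipseudomonas": 21,
    "Phytopseudomonas": 17,
    "Halopseudomonas": 9,
    "Geopseudomonas": 4,
    "Thiopseudomonas": 3,
    "Serpens": 2,
    "Caenipseudomonas": 1,
    "Zestomonas": 1,
}

total = sum(predicted.values())
aeruginosa_share = predicted["Pseudomonas aeruginosa"] / total

print(total)                      # 299
print(f"{aeruginosa_share:.1%}")  # 21.4%
```

The 64 P. aeruginosa genomes thus account for roughly one fifth of all predictions, in line with the "about 20%" stated above.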
We have examined the reliability of taxon predictions by the server by constructing a phylogenomic tree based on genome sequences of the different Pseudomonas strains for which the server made taxonomic predictions. This tree was constructed based on concatenated sequences of 118 conserved proteins (corresponding to the phyloeco set for the class Gammaproteobacteria), and it also included the sequences of representative species from relevant Pseudomonadaceae genera. We show the results from this tree in Figure 3. Due to the considerable number of strains in this tree, we have compressed the clades for some Pseudomonadaceae genera in Figure 3; the uncompressed results for these clades are presented in Figure 4. In the phylogenetic trees shown in Figure 3 and Figure 4, all Pseudomonas strains/isolates for which the server made taxonomic predictions grouped reliably (100% concordance) with the other species from the indicated genera (Figure 3, Figure 4 and Figure S3). In Figure 3 and Figure 4, many unclassified strains appear closely related to known species, whereas many others branch distinctly from the known species. Thus, several of these strains may constitute novel species within the indicated genera.
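The concordance figure quoted above can be thought of as the fraction of strains whose server-predicted taxon matches the clade in which they fall on the tree. The sketch below assumes that clade membership has already been read off the tree and recorded as a simple mapping; the function name and data layout are hypothetical. The two example entries use the strains shown in Figure 2, whose tree placement agrees with the server's prediction.

```python
def concordance(predictions, tree_clades):
    """Fraction of shared strains whose predicted taxon equals their clade on the tree.

    predictions: {strain: taxon predicted by the server}
    tree_clades: {strain: clade in which the strain branches on the tree}
    """
    shared = predictions.keys() & tree_clades.keys()
    if not shared:
        raise ValueError("no strains in common between predictions and tree")
    agree = sum(predictions[s] == tree_clades[s] for s in shared)
    return agree / len(shared)


predictions = {
    "ZM24": "Pseudomonas sensu stricto clade",
    "ABC1": "Stutzerimonas",
}
tree_clades = {
    "ZM24": "Pseudomonas sensu stricto clade",
    "ABC1": "Stutzerimonas",
}

print(f"{concordance(predictions, tree_clades):.0%}")  # 100%
```

For the full dataset, the same calculation over all 299 strains corresponds to the 100% concordance reported here.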

4. Discussion

Members of the genus Pseudomonas, which are genetically and evolutionarily highly diverse, are widely distributed in different environments. This group includes species which are opportunistic pathogens of humans, animals, and plants, and other species of economic and ecological significance [24,25,33,34,35]. Specifically, the type species of Pseudomonas, P. aeruginosa, is an important human pathogen capable of causing a wide array of life-threatening acute and chronic diseases [36,37]. In view of the importance of these species from clinical and other perspectives, this group of species is extensively studied, and they constitute one of the fastest growing groups of bacteria [4]. In recent years, extensive work using genomic approaches has been carried out to more reliably delineate the evolutionary relationships and classification scheme for Pseudomonas and related species. These studies have led to the reclassification of >150 Pseudomonas species into 14 novel genera [5,11,38]. Members of all these newly described genera can be reliably distinguished from each other based upon multiple highly specific molecular markers (i.e., CSIs) that are uniquely shared characteristics of the species from these genera. Similarly, the clade corresponding to the genus Pseudomonas sensu stricto, which harbors P. aeruginosa, can also be reliably distinguished from all other Pseudomonas based on multiple exclusively shared CSIs [11]. However, the genetic diversity of Pseudomonas extends far beyond the known species (>300) with validly published names. The NCBI [39] harbors genomes for >2000 uncharacterized strains/isolates of Pseudomonas species, for which no information is available regarding their phylogenetic affiliation. As these uncharacterized strains are likely to harbor many novel species related to the known Pseudomonadaceae genera, as well as other novel taxa related to these bacteria [6], it is important to characterize them. However, there are no easy-to-use methods available for dependably identifying strains that are related to the existing Pseudomonadaceae genera.
With this in mind, the objective of this study was to determine whether the CSIs specific for different Pseudomonadaceae genera, due to their known predictive ability to be found in other group members, can be used to identify other unclassified Pseudomonas (spp.) strains that are related to these genera. These investigations were greatly facilitated by the recent development of the AppIndels.com server, which, based upon the presence of known taxon-specific CSIs in a genome sequence, can predict its taxonomic affiliation [27]. In this work, after supplementing its database with sequence information for different CSIs specific for the Pseudomonadaceae genera, the AppIndels server was used to predict the taxonomic affiliations of genome sequences for 1972 unclassified Pseudomonas strains/isolates. Results presented here show that based upon the CSIs that have been identified for Pseudomonadaceae genera, the server was able to predict the taxonomic affiliation of 299 of these unclassified Pseudomonas strains into 14 distinct Pseudomonadaceae genera. The genera or species groups into which these unclassified Pseudomonas strains/isolates were assigned included the Pseudomonas sensu stricto clade (46 strains), Ectopseudomonas (46 strains), Chryseomonas (32 strains), Stutzerimonas (31 strains), Metapseudomonas (22 strains), Aquipseudomonas (21 strains), Phytopseudomonas (17 strains), Halopseudomonas (9 strains), Geopseudomonas (4 strains), Thiopseudomonas (3 strains), Serpens (2 strains), Caenipseudomonas (1 strain) and Zestomonas (1 strain). In addition, 64 Pseudomonas strains/isolates were identified as P. aeruginosa. In all cases, the assignment of Pseudomonas strains to different Pseudomonadaceae genera (or to P. aeruginosa) was based on the shared presence of multiple (minimum 3) CSIs, which are exclusive characteristics of the indicated genera. The results of phylogenetic studies conducted here confirm that the taxonomic predictions made by the server were 100% in agreement with the branching of these strains with the species from the indicated genera. These results provide further strong evidence that (i) taxon-specific CSIs have the predictive ability to be found in other (unclassified) members of these taxa, and (ii) the use of these molecular markers provides a novel and trustworthy means for the identification of other species/strains related to these genera [27].
Although the AppIndels server accurately predicted the taxonomic affiliations of 299 Pseudomonas strains, it provided no results for the remainder of the strains. This is not surprising because the AppIndels server can make taxonomic predictions for only those strains that are related to the taxa for which CSIs are known and present in the server’s database [27]. As noted previously, the genus Pseudomonas is a very large and diverse grouping of microorganisms, harboring >300 validly named species that form multiple distinct clades/lineages [4,6,7,8,9,11,14]. Thus far, CSIs have been identified for only a limited number of these groupings, consisting mainly of the genus Halopseudomonas and some clades/genera within the Aeruginosa lineage. However, a vast majority of the Pseudomonas species, representing more than two-thirds of named species, are part of the Fluorescens lineage, which is composed of multiple distinct genus-level clades and subclades [7,9,13,14,15,23]. No CSIs are known at present for the species from different clades and subclades of the Fluorescens lineage. In addition, no CSIs have been identified for the Anguilliseptica clade of species and many other species within the Aeruginosa lineage (viz. P. benzenivorans, P. cuatrocienegasensis, P. indica, P. kuykendallii, P. lalucatii, P. mangiferae, P. mangrovi, P. matsuisoli and P. pohangensis), which branch distinctly from the described clades/genera. In view of the paucity of CSIs for these other groups/clades of Pseudomonas species, if an examined strain (genome) is affiliated to one of these species clades/genera, the server will not be able to make any taxonomic prediction for that strain. Therefore, as indicated on the server’s website, while the absence of any taxonomic prediction by the server is not very informative, a specific prediction by the server regarding taxonomic affiliation is a highly trustworthy result.
It should be noted that the genomes of Pseudomonas spp./strains for which the server was able to make correct taxonomic predictions were of different assembly stages, ranging from chromosome and complete to contigs and scaffolds (see Tables S1–S3). Previously, we have also shown that based on genome sequence information, the server can predict the taxonomic affiliation of uncultured strains/isolates [39]. These results and observations indicate that the AppIndels server provides a valuable and easy-to-use tool for the identification and taxonomic characterization of cultured and uncultured strains/isolates for the genera for which CSIs are known. Several of the Pseudomonas spp./strains for which the server made taxonomic predictions branched distinctly from the other known species within the predicted genera (Figure 3 and Figure 4). Thus, upon further characterization, a number of these strains will likely constitute novel species within these genera. This should lead to a considerable increase in the known genetic diversity of species within these genera, advancing our understanding of the Pseudomonas-related species/genera. It is also noteworthy that based upon the identified CSIs, the server can reliably distinguish P. aeruginosa from all other Pseudomonas species, including other species from the Pseudomonas sensu stricto clade. As P. aeruginosa is an important human pathogen capable of causing a wide array of life-threatening acute and chronic diseases, accurate differentiation of this species from other closely related species using genome sequence data should be very useful in clinical settings.
Lastly, earlier work on CSIs in gene/protein sequences indicates that these molecular characteristics, in addition to their specificity and predictive abilities for reliable identification of species from different clades, also play important/essential functions in the group of organisms for which they are specific [16,18,40,41,42,43,44]. Hence, genetic, biochemical, and functional studies on the CSIs specific for different genera provide a means for the identification of novel biochemical and other characteristics that are specific to these organisms.

Supplementary Materials

The following supporting information can be downloaded at the website of this paper posted on Preprints.org: Table S1. List of downloaded 266 Pseudomonas spp. genomes (Chromosome and Complete) for analysis in this study. Table S2. List of downloaded 1197 Pseudomonas spp. genomes (Contigs) for analysis in this study. Table S3. List of downloaded 510 Pseudomonas spp. genomes (Scaffold) for analysis in this study. Table S4. Information on genome sequences of 299 uncharacterized Pseudomonas spp. whose taxonomic affiliations were predicted by the AppIndels web server. Figure S1. A maximum-likelihood tree, constructed using concatenated sequences of 118 conserved proteins, depicting the branching of all newly identified species, or species with new name combinations, that share the CSIs specific to the genus Halopseudomonas. Newly described species are highlighted in bold and non-validly published species are shown within “ ”. Figure S2. Results from the AppIndels server for the genome sequences of two representative uncharacterized Pseudomonas spp./strains. Figure S3. A maximum-likelihood tree based on concatenated sequences of 118 conserved proteins showing all strains from the Aeruginosa clade (genus Pseudomonas sensu stricto).

Author Contributions

BR and RSG carried out the analyses using the AppIndels server; BR constructed the phylogenetic trees, updated the sequence information for the CSIs, and checked and formatted the Figures and Tables; RSG planned and supervised the work and obtained funding for the project; RSG and BR wrote and finalized the manuscript.

Acknowledgments

This work was supported by a research grant (RGPIN-2019-06397) from the Natural Sciences and Engineering Research Council of Canada awarded to Radhey S. Gupta.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Migula W. Uber ein neues System der Bakterien. Arb. Bakt. Inst. Kar1sruhe 1894;1(235):238.
  2. Skerman VBD, McGowan V, Sneath PHA. Approved lists of bacterial names. Int J Syst Bacteriol 1980;30:225-420.
  3. Jun SR, Wassenaar TM, Nookaew I, Hauser L, Wanchai V et al. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis. Appl Environ Microbiol 2016;82(1):375-383.
  4. Lalucat J, Gomila M, Mulet M, Zaruma A, Garcia-Valdes E. Past, present and future of the boundaries of the Pseudomonas genus: Proposal of Stutzerimonas gen. nov. Syst Appl Microbiol 2022;45(1):126289. [CrossRef]
  5. Rudra B, Gupta RS. Phylogenomic and comparative genomic analyses of species of the family Pseudomonadaceae: Proposals for the genera Halopseudomonas gen. nov. and Atopomonas gen. nov., merger of the genus Oblitimonas with the genus Thiopseudomonas, and transfer of some misclassified species of the genus Pseudomonas into other genera. Int J Syst Evol Microbiol 2021;71(9):005011. [CrossRef]
  6. Saati-Santamaria Z, Peral-Aranega E, Velazquez E, Rivas R, Garcia-Fraile P. Phylogenomic Analyses of the Genus Pseudomonas Lead to the Rearrangement of Several Species and the Definition of New Genera. Biology (Basel) 2021;10(8). [CrossRef]
  7. Hesse C, Schulz F, Bull CT, Shaffer BT, Yan Q et al. Genome-based evolutionary history of Pseudomonas spp. Environ Microbiol 2018;20(6):2142-2159.
  8. Gomila M, Pena A, Mulet M, Lalucat J, Garcia-Valdes E. Phylogenomics and systematics in Pseudomonas. Front Microbiol 2016;6:214. [CrossRef]
  9. Peix A, Ramirez-Bahena MH, Velazquez E. The current status on the taxonomy of Pseudomonas revisited: An update. Infect Genet Evol 2018;57:106-116. [CrossRef]
  10. Peix A, Ramirez-Bahena MH, Velazquez E. Historical evolution and current status of the taxonomy of genus Pseudomonas. Infect Genet Evol 2009;9:1132-1147. [CrossRef]
  11. Rudra B, Gupta RS. Phylogenomics studies and molecular markers reliably demarcate genus Pseudomonas sensu stricto and twelve other Pseudomonadaceae species clades representing novel and emended genera. Front Microbiol 2024;14:1273665. [CrossRef]
  12. Oren A, Arahal DR, Goker M, Moore ERB, Rossello-Mora R et al. International Code of Nomenclature of Prokaryotes. Prokaryotic Code (2022 Revision). Int J Syst Evol Microbiol 2023;73(5a).
  13. Lalucat J, Mulet M, Gomila M, Garcia-Valdes E. Genomics in Bacterial Taxonomy: Impact on the Genus Pseudomonas. Genes (Basel) 2020;11:139. [CrossRef]
  14. Girard L, Lood C, Hofte M, Vandamme P, Rokni-Zadeh H et al. The Ever-Expanding Pseudomonas Genus: Description of 43 New Species and Partition of the Pseudomonas putida Group. Microorganisms 2021;9(8). [CrossRef]
  15. Passarelli-Araujo H, Franco GR, Venancio TM. Network analysis of ten thousand genomes shed light on Pseudomonas diversity and classification. Microbiol Res 2022;254:126919. [CrossRef]
  16. Adeolu M, Alnajar S, Naushad S, Gupta RS. Genome-based phylogeny and taxonomy of the ‘Enterobacteriales’: proposal for Enterobacterales ord. nov. divided into the families Enterobacteriaceae, Erwiniaceae fam. nov., Pectobacteriaceae fam. nov., Yersiniaceae fam. nov., Hafniaceae fam. nov., Morganellaceae fam. nov., and Budviciaceae fam. nov. Int J Syst Evol Microbiol 2016;66(12):5575-5599.
  17. Gupta RS. Identification of conserved indels that are useful for classification and evolutionary studies. Methods Microbiol 2014;41:153-182.
  18. Gupta RS. Impact of genomics on the understanding of microbial evolution and classification: the importance of Darwin’s views on classification. FEMS Microbiol Rev 2016;40(4):520-553. [CrossRef]
  19. Gupta RS, Chander P, George S. Phylogenetic framework and molecular signatures for the class Chloroflexi and its different clades; proposal for division of the class Chloroflexia class. nov. [corrected] into the suborder Chloroflexineae subord. nov., consisting of the emended family Oscillochloridaceae and the family Chloroflexaceae fam. nov., and the suborder Roseiflexineae subord. nov., containing the family Roseiflexaceae fam. nov. Antonie Van Leeuwenhoek 2013;103(1):99-119.
  20. Gupta RS, Patel S, Saini N, Chen S. Robust demarcation of 17 distinct Bacillus species clades, proposed as novel Bacillaceae genera, by phylogenomics and comparative genomic analyses: description of Robertmurraya kyonggiensis sp. nov. and proposal for an emended genus Bacillus limiting it only to the members of the Subtilis and Cereus clades of species. Int J Syst Evol Microbiol 2020;70:5753-5798. [CrossRef]
  21. Holmes B, Steigerwalt A, Weaver R, Brenner DJ. Chryseomonas polytricha gen. nov., sp. nov., a Pseudomonas-like organism from human clinical specimens and formerly known as group Ve-1. Int J Syst Evol Microbiol 1986;36(2):161-165. [CrossRef]
  22. Hespell RB. Serpens flexibilis gen. nov., sp. nov., an unusually flexible, lactate-oxidizing bacterium. Int J Syst Evol Microbiol 1977;27(4):371-381.
  23. Palleroni NJ. Pseudomonas. John Wiley and Sons in Association with Bergey’s Manual Trust; 2015. pp. 1-105.
  24. Rossi E, La Rosa R, Bartell JA, Marvig RL, Haagensen JAJ et al. Pseudomonas aeruginosa adaptation and evolution in patients with cystic fibrosis. Nat Rev Microbiol 2021;19(5):331-342. [CrossRef]
  25. Lund-Palau H, Turnbull AR, Bush A, Bardin E, Cameron L et al. Pseudomonas aeruginosa infection in cystic fibrosis: pathophysiological mechanisms and therapeutic approaches. Expert Rev Respir Med 2016;10:685-697.
  26. Parte AC, Sarda Carbasse J, Meier-Kolthoff JP, Reimer LC, Goker M. List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. Int J Syst Evol Microbiol 2020;70(11):5607-5612. [CrossRef]
  27. Gupta RS, Eivin DA. AppIndels.com Server: A Web Based Tool for the Identification of Known Taxon-Specific Conserved Signature Indels in Genome Sequences: Validation of Its Usefulness by Predicting the Taxonomic Affiliation of >700 Unclassified Strains of Bacillus Species. Int J Syst Evol Microbiol 2023;73:005844.
  28. Wang Z, Wu M. A phylum-level bacterial phylogenetic marker database. Mol Biol Evol 2013;30(6):1258-1262. [CrossRef]
  29. Chen S, Rudra B, Gupta RS. Phylogenomics and molecular signatures support division of the order Neisseriales into emended families Neisseriaceae and Chromobacteriaceae and three new families Aquaspirillaceae fam. nov., Chitinibacteraceae fam. nov., and Leeiaceae fam. nov. Syst Appl Microbiol 2021;44(6):126251.
  30. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol Biol Evol 2018;35(6):1547-1549. [CrossRef]
  31. Girard L, Lood C, De Mot R, van Noort V, Baudart J. Genomic diversity and metabolic potential of marine Pseudomonadaceae. Front Microbiol 2023;14:1071039. [CrossRef]
  32. Kujur RRA, Ghosh M, Basak S, Das SK. Phylogeny and structural insights of lipase from Halopseudomonas maritima sp. nov., isolated from sea sand. Int Microbiol 2023;26(4):1021-1031. [CrossRef]
  33. Winsor GL, Griffiths EJ, Lo R, Dhillon BK, Shay JA et al. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database. Nucleic Acids Res 2016;44:D646-D653. [CrossRef]
  34. Xin XF, Kvitko B, He SY. Pseudomonas syringae: what it takes to be a pathogen. Nat Rev Microbiol 2018;16:316-328. [CrossRef]
  35. Palleroni NJ. Genus I. Pseudomonas Migula 1894. In Bergey’s Manual of Systematic Bacteriology (The Proteobacteria), part B (The Gammaproteobacteria), 2nd edn. Edited by D. J. Brenner, N. R. Krieg, J. T. Staley & G. M. Garrity. New York: Springer 2005;2:323-379.
  36. Stover CK, Pham XQ, Erwin A, Mizoguchi S, Warrener P et al. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 2000;406(6799):959-964. [CrossRef]
  37. Planquette B, Timsit J-F, Misset BY, Schwebel C, Azoulay E et al. Pseudomonas aeruginosa ventilator-associated pneumonia. Predictive factors of treatment failure. Am J Respir Crit Care Med 2013;188(1):69-76.
  38. Duman M, Mulet M, Altun S, Saticioglu IB, Gomila M et al. Corrigendum: Pseudomonas piscium sp. nov., Pseudomonas pisciculturae sp. nov., Pseudomonas mucoides sp. nov. and Pseudomonas neuropathica sp. nov. isolated from rainbow trout. Int J Syst Evol Microbiol 2021;71(12).
  39. Sayers EW, Agarwala R, Bolton EE, Brister JR, Canese K et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2019;47(D1):D23-D28. [CrossRef]
  40. Khadka B, Persaud D, Gupta RS. Novel Sequence Feature of SecA Translocase Protein Unique to the Thermophilic Bacteria: Bioinformatics Analyses to Investigate Their Potential Roles. Microorganisms 2020;8:59. [CrossRef]
  41. Singh B, Gupta RS. Conserved inserts in the Hsp60 (GroEL) and Hsp70 (DnaK) proteins are essential for cellular growth. Mol Genet Genomics 2009;281:361-373. [CrossRef]
  42. Hashimoto K, Panchenko AR. Mechanisms of protein oligomerization, the critical role of insertions and deletions in maintaining different oligomeric states. Proc Natl Acad Sci U S A 2010;107(47):20352-20357. [CrossRef]
  43. Miton CM, Tokuriki N. Insertions and deletions (indels): a missing piece of the protein engineering jigsaw. Biochemistry 2022;62(2):148-157. [CrossRef]
  44. Khadka B, Gupta RS. Identification of a conserved 8 aa insert in the PIP5K protein in the Saccharomycetaceae family of fungi and the molecular dynamics simulations and structural analysis to investigate its potential functional role. Proteins 2017;85(8):1454-1467. [CrossRef]
Figure 1. Partial sequence alignment showing a two-amino-acid insertion (CSI) in the flagellar protein FlgN (boxed), described in our earlier work [5], which is specific for the genus Halopseudomonas. Sequences for six new Halopseudomonas-related species have since become available, and all of them share this CSI, demonstrating its predictive ability. The species marked with an asterisk (*) have not yet been reclassified as Halopseudomonas due to the lack of availability of their type strains in two different culture collections. Some of these species are listed in the LPSN under the genus Neopseudomonas, which is a synonym of Halopseudomonas. Quotation marks (“ ”) surrounding a species name indicate that the name is not yet validly published. Dashes (-) in the alignment indicate identity with the amino acid on the top line. Accession numbers for the sequences are given in the second column, and the numbers at the top indicate the position of this sequence fragment within the protein sequence.
Figure 2. Results from the AppIndels server analysis of the genome sequences of two representative unclassified Pseudomonas spp./strains. (A) The genome of Pseudomonas strain ZM24 was predicted by the server to be affiliated with the Pseudomonas sensu stricto clade, and it contained five CSIs specific for this clade. (B) Pseudomonas strain ABC1 was identified by the server as belonging to the genus Stutzerimonas, and it contained six CSIs specific for this genus. Sequence information for only three of the identified CSIs is shown in this figure. The information for the other shared CSIs is presented in Figure S2.
Figure 3. A phylogenetic tree based on genome sequences of representative species, including the type species of different Pseudomonadaceae genera, and the genomes of the Pseudomonas spp. (strains/isolates) for which the server made positive predictions of affiliation with specific clades/genera (Table 2 and Table S4). For ease of visualization, the clades for the genera Chryseomonas, Ectopseudomonas, Metapseudomonas, Phytopseudomonas, Pseudomonas sensu stricto, and Stutzerimonas are compressed in this figure. All Pseudomonas strains for which the server made taxonomic predictions branched with the predicted genera in this tree, corresponding to 100% prediction accuracy.
Figure 4. Phylogenetic branching of the Pseudomonas spp. (strains/isolates) that, based on the results obtained from the AppIndels server (Table 2 and Table S4), were predicted to be related to the genera Chryseomonas, Ectopseudomonas, Metapseudomonas, Phytopseudomonas, Pseudomonas sensu stricto, and Stutzerimonas. All strains for which the server made taxonomic predictions branched with the predicted genera in this tree, corresponding to 100% prediction accuracy.
Table 1. List of Pseudomonadaceae Genera for which CSIs have been Identified.
Genera/Species name | No. of Identified CSIs | Weight Value for Each CSI
Aquipseudomonas | 6 | 0.4
Atopomonas | 22 | 0.2
Azomonas | 5 | 0.4
Azotobacter | 10 | 0.4
Caenipseudomonas | 8 | 0.4
Chryseomonas | 11 | 0.3
Ectopseudomonas | 5 | 0.4
Geopseudomonas | 15 | 0.3
Halopseudomonas | 24 | 0.2
Metapseudomonas | 5 | 0.4
Phytopseudomonas | 12 | 0.3
Pseudomonas sensu stricto | 6 | 0.4
Serpens | 3 | 0.5
Stutzerimonas | 7 | 0.4
Thiopseudomonas | 6 | 0.3
Zestomonas | 5 | 0.4
Pseudomonas aeruginosa | 7 | 0.3
Pseudomonas paraeruginosa | 5 | 0.4
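Table 1 assigns a weight to each CSI, but this section does not state how the server combines those weights. Purely as an illustration, the sketch below assumes a simple additive scheme (number of CSIs found multiplied by the per-CSI weight). This additive rule, the dictionary name TABLE_1, and the function additive_score are hypothetical and are not taken from the AppIndels server; only the numbers are transcribed from Table 1.

```python
# (number of identified CSIs, weight per CSI), transcribed from Table 1.
TABLE_1 = {
    "Aquipseudomonas": (6, 0.4),
    "Atopomonas": (22, 0.2),
    "Azomonas": (5, 0.4),
    "Azotobacter": (10, 0.4),
    "Caenipseudomonas": (8, 0.4),
    "Chryseomonas": (11, 0.3),
    "Ectopseudomonas": (5, 0.4),
    "Geopseudomonas": (15, 0.3),
    "Halopseudomonas": (24, 0.2),
    "Metapseudomonas": (5, 0.4),
    "Phytopseudomonas": (12, 0.3),
    "Pseudomonas sensu stricto": (6, 0.4),
    "Serpens": (3, 0.5),
    "Stutzerimonas": (7, 0.4),
    "Thiopseudomonas": (6, 0.3),
    "Zestomonas": (5, 0.4),
    "Pseudomonas aeruginosa": (7, 0.3),
    "Pseudomonas paraeruginosa": (5, 0.4),
}


def additive_score(taxon: str, csis_found: int) -> float:
    """Hypothetical score: CSIs found x per-CSI weight (ASSUMED additive rule)."""
    n_known, weight = TABLE_1[taxon]
    if not 0 <= csis_found <= n_known:
        raise ValueError(f"{taxon}: expected 0..{n_known} CSIs, got {csis_found}")
    # Round to two decimals to hide floating-point noise in the product.
    return round(csis_found * weight, 2)


if __name__ == "__main__":
    for taxon, (n_known, _) in TABLE_1.items():
        print(f"{taxon}: maximum assumed score = {additive_score(taxon, n_known)}")
```

Under this assumed scheme, a genome carrying all 24 Halopseudomonas CSIs would score 4.8, whereas one carrying all three Serpens CSIs would score 1.5.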
Table 2. Results from the AppIndels Server Regarding the Taxonomic Affiliations of Genome Sequences of 299 Unclassified Pseudomonas spp.
Genera/Species | No. of strains | Range of CSIs | Pseudomonas spp. strain Nos.
Pseudomonas sensu stricto | 46 | 5-6 | 21, 273, 30_B, AAC, ADPe, ATCC 13867, AU11447, AU12215, BJa5, EGD-AKN5, GCEP-101, GD03691, GD03903, GD04087, HMSC75E02, HS-18, LA21, M1, NBRC 111135, NBRC100443, PDM17, PDM18, PDM19, PDM20, PDM21, PDM22, PDM23, PDM33, PDNC002, PI1, PSE14, R3.Fl, RW407, SCB32, UMA601, UMA603, UMA643, UMC3103, UMC3106, UMC3129, UMC631, UMC76, UME83, ZM23, ZM24, ZM25
Pseudomonas aeruginosa | 64 | 5-7 | 203-8, 17023526, 17023671, 17033095, 17053182, 17053418, 17053703, 17063399, 17072548, 17073326, 17102422, 17103552, 17104299, 18073667, 18082547, 18081308, 18082551, 18082574, 18083194, 18083202, 18083259, 18083286, 18084127, 18092229, 18093371, 18101001-2, 18102011, 18103014, 18113298, 19062259, 19064969, 19072337-2, 19082381, 2VD, 3PA37B6, AF1, AFW1, AK6U, B111, BDPW, BIS, BIS1, CP-1, FDAARGOS_761, HMSC057H01, HMSC072F09, HMSC16B01, HMSC076A11, HMSC060F12, HMSC065H01, HMSC066A08, HMSC065H02, HMSC067G02, HMSC063H08, HMSC058C05, P179, P20, P22, PAH14, Pseudomonas_assembly, PS1(2021), RGIG3665, S33, S68
Aquipseudomonas | 21 | 4-6 | 8AS, BLCC-B13, BMS12, F(2018), GD03869, GD03875, GD03985, GD04015, GD04019, GD04042, GD04045, GOM6, J452, L-22-4S-12, ML96, PDM15, PDM16, R-28-1W-6, UBA6718, SO81, WS 5013
Caenipseudomonas | 1 | 7 | Go_SlPrim_bin_81
Chryseomonas | 32 | 6-11 | 313, AS2.8, BAV 2493, BAV 4579, GM_Psu_1, GM_Psu_2, HUK17, LTJR-52, MAG002Y, PS02302, RIT 411, S1C77_SP397, S2C3242, SP152, SP29, SP3, SP403, SP421, WAC2, HPB0071, Snoq117.2, MS15, JUb52, EpSL25, PLB05, HR1, CBMAI 2609, UBA6549, UBA7233, UBA3149, UBA4102
Ectopseudomonas | 46 | 3-5 | 297, 07-Jan, 905_Psudmo1, AA-38, ALS1131, ALS1279, AOB-7, B11D7D, BMW13, DS1.001, EGD-AK9, EggHat1, GD03721, GD03722, GD03919, GD04158, GOM7, GV_Bin_12, Gw_UH_bin_155, HS-2, KB-10, KHPS1, LPH1, Leaf83, MDMC17, MDMC216, MDMC224, MSPm1, Marseille-Q0931, NCCP-436, NFACC19-2, NFPP33, o96, OA3, P818, 8O, 8Z, REST10, RGIG627, THAF187a, THAF42, WS 5019, YY-1, Z8(2022), ZH-FAD, phDV1
Geopseudomonas | 4 | 4-15 | A-1, OF001, R2F_R2FSRR_metabat.60, Gw_Prim_bin_4
Halopseudomonas | 9 | 20-24 | 5Ae-yellow, FME51, MYb185, NORP239, NORP330, OIL-1, SSM44, WN033, gcc21
Metapseudomonas | 22 | 3-5 | 57B-090624, 1D4, A46, BN102, BN411, BN414, BN415, BN417, BN515, BN606, D(2018), DY-1, ENNP23, FeS53a, JG-B, JM0905a, LFM046, PDM13, Pc102, Q1-7, SLBN-26, TCU-HL1
Phytopseudomonas | 17 | 9-12 | AG1028, Bi70, BIGb0408, CrR14, CNPSo 3701, MEJ086, MM211, PDM11, PDM12, S2C11432_SP223, S2C78296_SP133, sia0905, SP200_1_metabat2_genome_mining.44, SP236_1_metabat2_genome_mining.8, PA1, PA15, PA27
Serpens | 2 | 3 | N24CT, RL
Stutzerimonas | 31 | 4-7 | 10B238, 9Ag, A192_concoct.bin.7, ABC1, ALOHA_A2.5_105, BAY1663, BRH_c35, C42_metabat.bin.8, Choline-3u-10, DF_1_3.23, DNDY-54, IC_126, JI-2, KSR10, M30B71, MCMED-G45, MT-1, MT4, MTM4, N17CT, NP21570, Q2-TVG4-2, RS261_metabat.bin.8, S5(2021), SCT, SST3, TTU2014-066ASC, TTU2014-096BSC, TTU2014-105ASC, WS 5018, s199
Thiopseudomonas | 3 | 4-5 | AS08sgBPME_395, C27(2019), SO_2017_LW2 bin 68
Zestomonas | 1 | 3 | LS44
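As a quick consistency check on Table 2, the per-row strain counts can be summed and compared with the 299 genomes named in the table title. The sketch below is illustrative only: the names TABLE_2_COUNTS, total_strains, and total_excluding are hypothetical, and the numbers are transcribed from the "No. of strains" column.

```python
# Strain counts transcribed from the "No. of strains" column of Table 2.
TABLE_2_COUNTS = {
    "Pseudomonas sensu stricto": 46,
    "Pseudomonas aeruginosa": 64,
    "Aquipseudomonas": 21,
    "Caenipseudomonas": 1,
    "Chryseomonas": 32,
    "Ectopseudomonas": 46,
    "Geopseudomonas": 4,
    "Halopseudomonas": 9,
    "Metapseudomonas": 22,
    "Phytopseudomonas": 17,
    "Serpens": 2,
    "Stutzerimonas": 31,
    "Thiopseudomonas": 3,
    "Zestomonas": 1,
}


def total_strains(counts: dict[str, int]) -> int:
    """Sum the strain counts over every row of the table."""
    return sum(counts.values())


def total_excluding(counts: dict[str, int], excluded: set[str]) -> int:
    """Sum the strain counts over all rows except the named ones."""
    return sum(n for taxon, n in counts.items() if taxon not in excluded)


if __name__ == "__main__":
    print(total_strains(TABLE_2_COUNTS))  # 299
    print(total_excluding(TABLE_2_COUNTS, {"Pseudomonas aeruginosa"}))  # 235
```

The rows sum to 299 when the 64 P. aeruginosa strains are included, and to 235 for the genus-level and clade-level rows alone.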
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.