Preprint
Article

This version is not peer-reviewed.

Genomic Discovery of Robust Molecular Markers Differentiating Lactobacillaceae Genera and Providing Novel Tools for Functional Insights

Submitted: 27 June 2025
Posted: 30 June 2025

Abstract
Background: Members of the family Lactobacillaceae, encompassing 23 distinct genera, play essential roles in food fermentation processes such as wine, yogurt, and cheese production and contribute significantly to human health through their probiotic properties. Despite their importance, accurately distinguishing between Lactobacillaceae genera has been challenging due to the absence of reliable biochemical or molecular markers. Currently, these genera are primarily differentiated based on phylogenetic relationships. Methods: To address this limitation, we have performed comprehensive phylogenomic and comparative analyses of protein sequences from 411 publicly available Lactobacillaceae genomes. Results: The results of these analyses have identified 171 novel conserved signature indels (CSIs), within proteins involved in diverse cellular functions, which are specific for the species from different Lactobacillaceae genera. The taxon-specificities of these CSIs make them robust molecular markers for differentiation of Lactobacillaceae genera and for functional insights. Using these taxon-specific CSIs and the AppIndels.com server, we were able to successfully predict the taxonomic affiliation of 112 uncharacterized genomes of Lactobacillus isolates, demonstrating the practical utility of these CSIs for genus-level identification and classification. Structural analyses on representative CSIs specific for Lactobacillaceae genera reported here show that all examined CSIs are located in surface-exposed loops of proteins, suggesting their potential roles in genus-specific functional traits, such as interaction with specific proteins and ligands, host interactions, or environmental adaptations. Conclusions: The CSIs identified here not only provide reliable tools for diagnostic and taxonomic studies but also open new avenues for exploring the functional diversity and biotechnological potential of species from different Lactobacillaceae genera.

1. Introduction

The family Lactobacillaceae [1,2,3] encompasses Gram-positive, non-spore-forming bacteria with diverse metabolic profiles, including obligately heterofermentative, facultatively heterofermentative, and obligately homofermentative species [4,5]. Recent taxonomic revisions, most notably the reclassification of Lactobacillus species into 23 new genera [3] and the merger of the former Leuconostocaceae family [6,7] with Lactobacillaceae, have expanded the family to 36 genera with validly published names. Many species within Lactobacillaceae, particularly those from genera such as Lactobacillus, Lacticaseibacillus, Lactiplantibacillus, Leuconostoc, Oenococcus, and Weissella, are widely utilized in food production and as probiotics due to their health-promoting properties [8,9,10,11,12]. Currently, species from different Lactobacillaceae genera are primarily distinguished based on phylogenetic clustering, genomic similarity metrics such as average amino acid identity (AAI), and, in some cases, ecological characteristics [3,5,13,14,15].
However, despite extensive research, no consistent biochemical or molecular markers have been identified that reliably differentiate species from different Lactobacillaceae genera. The discovery of such markers would not only refine taxonomic boundaries but also enable novel diagnostic and functional studies. In our previous work on the former Leuconostocaceae family [7], which is now part of Lactobacillaceae, we identified 46 conserved signature indels (CSIs) in various proteins that were specific to the genera Convivina, Fructobacillus, Leuconostoc, Oenococcus, Periweissella, and Weissella [16]. These taxon-specific indels (TAXIs), which result from rare genetic changes in a common ancestor [17,18,19], serve as molecular synapomorphies, providing strong evidence for the evolutionary relatedness of species from these lineages [20,21,22].
Given the ecological diversity and biotechnological relevance of Lactobacillaceae species [8,9,10,13,23,24,25,26], new species belonging to this family are being rapidly discovered. Since the taxonomic overhaul by Zheng et al. [3], large numbers of new Lactobacillaceae species have been described [27], with genome sequences available in the NCBI database [28]. These genomic resources offer a valuable opportunity for identifying genus-specific molecular markers [28,29,30,31,32]. In this study, we analyzed 411 Lactobacillaceae genome sequences (available in the NCBI database as of July 1, 2024) using phylogenomic and comparative protein sequence analyses. In our comparative genomic analysis, we used the INDELIBLE (Indel-based Identification of Bacterial Lineages and Evolution) approach to identify molecular markers (i.e., CSIs) that are specific to various Lactobacillaceae genera.
In this work, we initially constructed a robust phylogenomic tree for all 411 genomes of Lactobacillaceae species to clarify their evolutionary relationships. Subsequently, we performed comparative analyses of protein sequences using the INDELIBLE method to identify CSIs unique to specific genera [16,21,33]. These analyses led to the discovery of 171 novel TAXIs in diverse proteins, each uniquely present in species from specific Lactobacillaceae genera. These TAXIs serve as reliable molecular markers for distinguishing between Lactobacillaceae genera. To demonstrate the practical utility of these TAXIs, we used these markers in conjunction with the AppIndels.com server [34] to predict the taxonomic affiliation of 113 previously uncharacterized Lactobacillus strains. Based on the presence or absence of specific TAXIs, we accurately assigned 112 of these 113 strains to 11 Lactobacillaceae genera.
Structural analyses of representative TAXIs identified in this work revealed that they are consistently located in surface-exposed loops of the proteins, suggesting their functional importance. Further genetic and biochemical studies on these TAXIs could contribute to the discovery of novel genus-specific functional traits, including those involved in metabolic innovations, host interactions or environmental adaptations [5,26]. Thus, the genus-specific molecular markers identified in this study not only enhance taxonomic resolution within Lactobacillaceae but also provide a foundation for exploring functional diversity and characteristics of its different genera. These findings have important implications for advancing our understanding of the Lactobacillaceae species and their possible applications in food technology, probiotics, and health sciences.

2. Materials and Methods

2.1. Construction of Phylogenetic Trees

Genome sequences for the type strains and/or reference strains of 411 Lactobacillaceae species, whose annotated protein sequences were available in the NCBI database as of July 1, 2024, were downloaded. The genomes of two Bacillus species (Bacillus subtilis and B. cereus) were included for rooting purposes. From these genome sequences, a phylogenomic tree was constructed based on concatenated sequences of 87 conserved proteins, which constitute the phyloeco marker set for the phylum Bacillota [35]. The tree was constructed using an internally developed pipeline described in our earlier work [22,36]. Briefly, the CD-HIT program and the profile Hidden Markov Models (HMMs) [37] of the phyloeco set of proteins were used to identify homologs of these proteins that were present in at least 80% of the input genomes and shared a minimum of 50% sequence identity and length. Multiple sequence alignments of these protein families were generated using the Clustal Omega algorithm [38]. After removing the poorly aligned regions using trimAl [39], the final concatenated sequence alignment for the 87 proteins consisted of 26,527 aligned positions. A maximum-likelihood tree was constructed from this sequence alignment using the Whelan and Goldman model of protein sequence evolution [40] and formatted using MEGA X [41].
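The genome-coverage criterion described above (protein families retained only if present in at least 80% of the input genomes) can be illustrated with a minimal Python sketch. This is not the pipeline used in this work: the function name `families_passing_coverage`, the `(family_id, genome_id)` input format, and the example data are all hypothetical, and the sequence identity and length criteria are not modeled.

```python
from collections import defaultdict


def families_passing_coverage(hits, n_genomes, min_percent=80):
    """Return protein families found in at least min_percent of genomes.

    hits is an iterable of (family_id, genome_id) pairs, for example parsed
    from homology search output (the parsing step is not shown here).
    """
    genomes_per_family = defaultdict(set)
    for family_id, genome_id in hits:
        genomes_per_family[family_id].add(genome_id)
    # Integer arithmetic avoids floating-point edge cases at exactly 80%.
    return sorted(
        family
        for family, genomes in genomes_per_family.items()
        if 100 * len(genomes) >= min_percent * n_genomes
    )


# Hypothetical toy data: famA occurs in 4 of 5 genomes, famB in 3 of 5.
example_hits = [
    ("famA", "g1"), ("famA", "g2"), ("famA", "g3"), ("famA", "g4"),
    ("famB", "g1"), ("famB", "g2"), ("famB", "g3"),
]
print(families_passing_coverage(example_hits, n_genomes=5))
```

In this toy case only famA meets the 80% cutoff; lowering `min_percent` to 60 would retain both families.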

2.2. Identification of Conserved Signature Indels (CSIs)

Identification of CSIs for different Lactobacillaceae genera was carried out using the methods described in our earlier work [16,18,33]. Briefly, local BLASTp searches were carried out on protein sequences from the genomes of representative Lactobacillus species from different clades of interest and several outgroup species. Based on these BLASTp searches, sequences of high-scoring homologs (E-value < 1e-20) of different proteins were retrieved for several species (generally between 4 and 10) from the group of interest, and 10-15 outgroup species. Multiple sequence alignments of different proteins were performed using Clustal X 2.1 [42]. The alignments were visually inspected for the presence of insertions or deletions (indels) of fixed lengths that were present in conserved regions (i.e., flanked on both sides by at least 5-6 conserved amino acid (aa) residues in the neighboring 40-50 aa) and which were only found in the species from a specific Lactobacillaceae genus. Indels that were not present in conserved regions were not considered further. To confirm the specificity of these candidate CSIs, the query sequences containing the indel and 30–50 flanking amino acids on either side (typically beginning and ending with conserved residues) were used in a second, broader BLASTp search against the NCBI non-redundant (nr) database. The top 300–500 hits were examined to assess the taxonomic distribution of the indel. Indels found exclusively in species from a single Lactobacillaceae genus were identified as CSIs or TAXIs and formatted using the SIGCREATE and SIGSTYLE tools [18,33], available via the Gleans.net server (http://gleans.net/). Due to space constraints, representative sequence data are shown in the main figures for a limited number of species. However, unless otherwise noted, the CSIs (TAXIs) described are genus-specific and absent in all other bacterial homologs among the top BLASTp hits.
Additional sequence details for each CSI are provided in the Supplemental Figures.
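The screening criteria described in this section were applied by visual inspection of the alignments. Purely as an illustration of those criteria, the following Python sketch checks whether a candidate insertion is present in all ingroup sequences, absent from all outgroup sequences, and flanked on both sides by a minimum number of fully conserved alignment columns. The function names, the default window of 40 columns, the cutoff of 5 conserved columns, and the toy sequences are assumptions made for this example. Note that "-" denotes an alignment gap here, unlike in the figures, where dashes indicate identity with the top sequence.

```python
def conserved_columns(alignment, start, end):
    """Count columns in [start, end) that are identical in all sequences."""
    count = 0
    for col in range(max(start, 0), min(end, len(alignment[0]))):
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1 and "-" not in residues:
            count += 1
    return count


def is_candidate_insertion(ingroup, outgroup, start, end,
                           flank=40, min_conserved=5):
    """Check a candidate insertion spanning alignment columns [start, end).

    ingroup and outgroup are lists of aligned sequences of equal length,
    with "-" marking gaps.
    """
    all_seqs = ingroup + outgroup
    # The insertion must be gap-free in every ingroup sequence ...
    ingroup_has_insert = all("-" not in seq[start:end] for seq in ingroup)
    # ... and entirely gapped in every outgroup sequence.
    outgroup_lacks_insert = all(set(seq[start:end]) == {"-"} for seq in outgroup)
    left = conserved_columns(all_seqs, start - flank, start)
    right = conserved_columns(all_seqs, end, end + flank)
    return (ingroup_has_insert and outgroup_lacks_insert
            and left >= min_conserved and right >= min_conserved)


# Hypothetical toy alignment with a 2 aa insertion at columns 8-9.
ingroup = ["MKTAYIAKGGQRLEDVLK", "MKTAYIAKGSQRLEDVLK"]
outgroup = ["MKTAYIAK--QRLEDVLK", "MKTAYIAK--QRLEDVLK"]
print(is_candidate_insertion(ingroup, outgroup, start=8, end=10))
```

A check of this kind only flags candidates. Confirming genus specificity still requires the broader BLASTp search against the nr database described above.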

2.3. Taxonomic Predictions of Lactobacillus Strains/Isolates Using the AppIndels.com Server

The protein sequences of 113 uncharacterized Lactobacillus isolates analyzed in this study were obtained from the NCBI genome database in .faa format. These sequences were processed using the AppIndels.com server [34], which performs local BLASTp searches to identify TAXIs specific for different genera. If the number of TAXIs in the input protein file exceeds the threshold for identification of a given taxon, the server assigns the genome to that taxon. A detailed explanation of the server’s methodology is available in previous work [34].
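The threshold-based assignment described above can be sketched in a few lines of Python. This is only an illustration of the decision rule as stated in the text, not the server's implementation [34]: the function name `assign_genus`, the tie-breaking by highest count, and the threshold values in the example are hypothetical.

```python
def assign_genus(csi_counts, thresholds):
    """Assign a genome to a genus when its CSI count exceeds the threshold.

    csi_counts maps genus name to the number of genus-specific CSIs found in
    the genome; thresholds maps genus name to the minimum count that must be
    exceeded. Returns None when no genus passes (genome left unassigned).
    """
    passing = {
        genus: count
        for genus, count in csi_counts.items()
        if genus in thresholds and count > thresholds[genus]
    }
    if not passing:
        return None
    # Assumption for this sketch: if several genera pass, pick the one with
    # the most matching CSIs.
    return max(passing, key=passing.get)


# Hypothetical thresholds and counts, for illustration only.
thresholds = {"Lactiplantibacillus": 3, "Lacticaseibacillus": 3}
print(assign_genus({"Lactiplantibacillus": 8, "Lacticaseibacillus": 0}, thresholds))
print(assign_genus({"Lactiplantibacillus": 1}, thresholds))
```

The second call returns `None`, which corresponds to a genome that remains unassigned because too few CSIs for any genus were detected.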

2.4. Determination of Protein Structures Using AlphaFold Model Generation to Map the Locations of CSIs

These analyses were performed on four proteins containing CSIs specific to the genera Apilactobacillus, Lacticaseibacillus, Lactiplantibacillus, and Lactobacillus. FASTA sequences of both the CSI-containing proteins and their homologs lacking the CSIs were retrieved from the NCBI protein database and they were used as input for structural prediction. Protein structures were predicted using the AlphaFold 3 server with default parameters [43]. The top-ranked predicted models were selected for visualization and analysis using PyMOL v2.5.5 [44]. To assess the confidence of the predicted structures, we used the predicted Local Distance Difference Test (pLDDT) and predicted Template Modeling (pTM) scores [45,46]. Only models with high-confidence predictions (pLDDT > 50 and pTM > 0.8) were included in subsequent analyses [43,47]. The final models were superimposed in PyMOL using default settings to localize the CSIs within the protein structures. Structural similarity between CSI-containing and CSI-lacking proteins was evaluated using root mean square deviation (RMSD) values.
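As a small illustration of the two quantitative steps above, the following Python sketch applies the stated confidence cutoffs (pLDDT > 50 and pTM > 0.8) and computes an RMSD between two sets of paired coordinates. It rests on several assumptions: the pLDDT value is treated as a single per-model summary, the coordinates are assumed to be already superimposed and matched atom for atom (in this work the superposition was done in PyMOL), and the function names and example numbers are hypothetical.

```python
import math


def passes_confidence(plddt, ptm, min_plddt=50.0, min_ptm=0.8):
    """Return True when both confidence scores exceed their cutoffs."""
    return plddt > min_plddt and ptm > min_ptm


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) coordinates.

    The two coordinate lists must be the same length and are assumed to be
    already superimposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical values, for illustration only.
print(passes_confidence(plddt=85.2, ptm=0.91))
print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]))
```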

3. Results

3.1. Phylogenomic Tree for the Lactobacillaceae Species

To determine the evolutionary relationships and generic affiliations of all 411 Lactobacillaceae species with available genomes as of July 1, 2024, a maximum likelihood phylogenomic tree was constructed using concatenated sequences of 87 conserved proteins. The resulting tree is presented in a compressed form in Figure 1, where the species clades from different genera are coalesced; an uncompressed form of this tree is provided in Figure S1. In this tree (Figure 1 and Figure S1), which we refer to as the phyloeco tree, nearly all nodes exhibit 100% statistical support, indicating strong confidence in the inferred relationships. All examined Lactobacillaceae species clustered within clades corresponding to their respective genera, with branching patterns consistent with previous studies [3,5]. Species from several Lactobacillaceae genera that contain only a single species (viz. Acetilactobacillus, Philodulcilactobacillus, Holzapfeliella, Nicoliella, Paralactobacillus) also showed distinct branching, supporting their classification as separate genera [3].

3.2. Conserved Signature Indels Specific for Different Lactobacillaceae Genera

The phylogenetic tree shown in Figure 1 (and Figure S1) provides strong support for the monophyly of different Lactobacillaceae genera. This tree serves as the framework for the central focus of this study, which is the identification of molecular markers, referred to as taxon-specific conserved signature indels (TAXIs), that are unique to different genera. Previous research on other prokaryotic taxa has demonstrated that TAXIs provide valuable molecular markers for evolutionary and taxonomic studies [16,18,22,33,36,48,49,50]. Building on this foundation, we conducted detailed analyses of protein sequences from Lactobacillaceae species using the INDELIBLE approach to identify CSIs specific to individual genera. These analyses have led to the discovery of 171 novel CSIs in diverse proteins, each uniquely present in species from specific Lactobacillaceae genera. The results of these analyses are summarized and discussed below.

3.3. Molecular Markers Specific for the Genera Lactobacillus, Lacticaseibacillus, Lactiplantibacillus and Apilactobacillus

The genus Lactobacillus is the type genus of the family Lactobacillaceae and remains its most populous and extensively studied member [3,4]. Prior to 2020, the genus included over 260 species, which exhibited polyphyletic branching alongside species from the family Leuconostocaceae and displayed substantial phenotypic and ecological diversity [5,14,23,51,52]. However, because of the recent reclassification of Lactobacillus species, which led to their division into 23 distinct genera [3], Lactobacillus species now form a monophyletic clade in phylogenetic analyses. The composition and branching of species from the clade corresponding to genus Lactobacillus in our phylogenetic tree are shown in Figure 2A.
Despite the division of this genus into multiple genera, the genus Lactobacillus still comprises more than 46 named species [27], which exhibit considerable genetic diversity (Figure 2A). Some species from this genus are widely used as probiotics [25,53,54], and some, such as Lactobacillus delbrueckii subsp. bulgaricus, play key roles in dairy fermentation processes like yogurt production [55,56]. However, despite the industrial and scientific importance of species from this genus, no molecular characteristics have previously been identified that are specific to Lactobacillus species.
Our analyses have identified 16 novel CSIs in diverse proteins, most of which are uniquely shared by all or most Lactobacillus species. One example is shown in Figure 2B, where a 2 aa insertion within a conserved region of the 50S ribosomal protein L10 is found exclusively in all 46 genome-sequenced Lactobacillus species. The ribosomal protein L10 is a component of the L7/L12 stalk of the 50S ribosomal subunit, and it plays a central role in protein synthesis by recruiting translation factors to the ribosome and stimulating GTP hydrolysis [57]. In this and other sequence alignments, dashes (–) indicate identity with the amino acid shown on the top line. This CSI is absent in all other Lactobacillaceae species and in other examined bacteria. Due to space constraints, Figure 2B displays sequence information for only a subset of Lactobacillus and other Lactobacillaceae species. More comprehensive sequence data for this and the remaining 15 CSIs specific to Lactobacillus are provided in Figures S2–S17, with key features summarized in Table 1. Given their specificity for the Lactobacillus species, these TAXIs likely originated in a common ancestor of the genus Lactobacillus and serve as reliable molecular markers for distinguishing its species from those of other Lactobacillaceae genera.
The genus Lacticaseibacillus includes species that have been extensively studied for their probiotic properties, their role in dental caries, and their increasing association with cases of bacteremia [58,59]. In our phylogenetic analysis (Figure S1), members of this genus form a well-supported, deeply branching monophyletic clade. The species composition and branching pattern within this clade are shown in Figure 3A. Using the INDELIBLE approach, we have identified nine novel CSIs in various proteins that are uniquely present in Lacticaseibacillus species. One representative example is shown in Figure 3B, where a one amino acid insertion in a conserved region of the manganese-dependent inorganic pyrophosphatase protein is found in all 30 genome-sequenced Lacticaseibacillus species but is absent in all other Lactobacillaceae species. Detailed sequence information for this CSI, along with the other eight TAXIs for this genus, is provided in Figures S18–S26. Key characteristics of these CSIs are summarized in Table 1. Collectively, these markers offer reliable molecular tools for distinguishing Lacticaseibacillus species from those of other Lactobacillaceae genera.
The genus Apilactobacillus comprises 12 validly published species, which form a distinct clade in our phylogenomic tree (Figure 1). The species composition and branching pattern within this clade are shown in Figure 4A. Species from this genus are predominantly associated with fructose-rich environments, such as flowers and the guts of bees, reflecting their ecological link to insects [3,5,60,61]. Our analysis has identified four CSIs that are uniquely present in Apilactobacillus species. One representative example is shown in Figure 4B, where a two amino acid insertion in the cyclopropane-fatty-acyl-phospholipid synthase family protein is uniquely shared by all 12 Apilactobacillus species but absent in all other Lactobacillaceae species. Detailed sequence information for this CSI and the other three Apilactobacillus-specific CSIs is provided in Figures S27–S30, and key characteristics are summarized in Table 1. These CSIs serve as reliable molecular markers for distinguishing Apilactobacillus species from all other genera within the Lactobacillaceae family.
Species of the genus Lactiplantibacillus inhabit a wide range of environments, including fermented foods (e.g., sauerkraut, kimchi), plant material, and the human gastrointestinal tract [3]. Among them, Lactiplantibacillus plantarum has been extensively studied for its probiotic benefits, owing to its ability to ferment plant-derived and phenolic compounds, its antioxidant properties, and its antimicrobial activity through bacteriocin production [62,63]. The genus Lactiplantibacillus currently includes 20 validly published species, which form a well-supported clade in the phylogenomic tree constructed in this study. Branching of the species from this genus in our phylogenetic tree is shown in Figure 4C. Our comparative genomic analysis has identified eight CSIs in various proteins that are uniquely present in species from this genus. One representative CSI is shown in Figure 4D, where a one amino acid insertion in a highly conserved region of the protein pyridoxal phosphate-dependent aminotransferase is shared exclusively by all Lactiplantibacillus species and absent in all other Lactobacillaceae species. Detailed sequence information for this CSI, along with the other seven Lactiplantibacillus-specific CSIs, is provided in Figures S31–S38. Key characteristics of these CSIs are summarized in Table 1.
Figure 2, Figure 3 and Figure 4 present selected examples of TAXIs specific to a few Lactobacillaceae genera. However, similar to the results shown for Lactobacillus, Lacticaseibacillus, Apilactobacillus, and Lactiplantibacillus, our comprehensive protein sequence analyses across other Lactobacillaceae genera have identified an additional 134 novel CSIs. Most of these CSIs are uniquely shared by species from a specific genus, serving as reliable molecular signatures.
The numbers of CSIs that we have identified for the other Lactobacillaceae genera in the present study are as follows: Agrilactobacillus (4), Amylolactobacillus (4), Bombilactobacillus (7), Companilactobacillus (10), Dellaglioa (6), Fructilactobacillus (8), Furfurilactobacillus (19), Lapidilactobacillus (4), Latilactobacillus (8), Lentilactobacillus (3), Levilactobacillus (4), Ligilactobacillus-Liquorilactobacillus cluster (7), Limosilactobacillus (8), Loigolactobacillus (6), Paucilactobacillus (3), Pediococcus (10), Schleiferilactobacillus (15), Secundilactobacillus (4), and Xylocopilactobacillus (4).
Detailed sequence information for these CSIs is provided in Figures S39–S172, and their key characteristics are summarized in Tables S1–S4. In addition to these CSIs, our previous work [16] has identified multiple CSIs specific for several other Lactobacillaceae genera, which were formerly part of the family Leuconostocaceae [7]. The names of these genera and the numbers of identified CSIs specific for them are as follows: Fructobacillus (5), Leuconostoc (5), Oenococcus (13), Periweissella (5), and Weissella (6) [16]. A summary of the species compositions of different Lactobacillaceae genera and the numbers of identified CSIs (TAXIs) that are specific for these genera is presented in Figure 5.
Based on these findings, all Lactobacillaceae genera that include two or more species can now be reliably distinguished from one another based on multiple, genus-specific TAXIs.
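The CSI counts reported in this section can be cross-checked with a short tally. The Python sketch below simply re-adds the per-genus numbers given in the text; the variable names are arbitrary.

```python
# CSI counts for the four genera described in detail (Figures 2-4, Table 1).
featured = {
    "Lactobacillus": 16,
    "Lacticaseibacillus": 9,
    "Apilactobacillus": 4,
    "Lactiplantibacillus": 8,
}

# CSI counts for the remaining genera identified in this study.
other = {
    "Agrilactobacillus": 4, "Amylolactobacillus": 4, "Bombilactobacillus": 7,
    "Companilactobacillus": 10, "Dellaglioa": 6, "Fructilactobacillus": 8,
    "Furfurilactobacillus": 19, "Lapidilactobacillus": 4, "Latilactobacillus": 8,
    "Lentilactobacillus": 3, "Levilactobacillus": 4,
    "Ligilactobacillus-Liquorilactobacillus cluster": 7,
    "Limosilactobacillus": 8, "Loigolactobacillus": 6, "Paucilactobacillus": 3,
    "Pediococcus": 10, "Schleiferilactobacillus": 15,
    "Secundilactobacillus": 4, "Xylocopilactobacillus": 4,
}

featured_total = sum(featured.values())
other_total = sum(other.values())
grand_total = featured_total + other_total

# Number of supplementary figures in the range S39-S172 (inclusive).
supplementary_figures = 172 - 39 + 1

print(featured_total, other_total, grand_total, supplementary_figures)
```

The totals agree with the text: 37 CSIs for the four featured genera, 134 for the remaining genera (matching the 134 supplementary figures S39–S172), and 171 overall.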

3.4. Predictive Ability of the CSIs and Their Application in Predicting the Taxonomic Affiliations of other Lactobacillus spp./Isolates

Previous studies on CSIs specific to various taxa and genera have demonstrated their strong predictive value; i.e., TAXIs identified in known members of a group are often found in newly discovered or sequenced members of the same group [22,34,64,65,66]. The predictive ability of the CSIs is also demonstrated by the results presented in Figure 6 for two CSIs specific for Leuconostocaceae genera identified in our previous work [16]. Figure 6A shows a CSI consisting of an 8 aa insertion in the protein phospho-N-acetylmuramoyl-pentapeptide-transferase, which was identified as specific for the genus Weissella [16], whose members exhibit probiotic and biotechnological potential [12]. When this CSI was identified, sequence information was available for 18 Weissella species. Since then, sequence information has become available for four additional species (shown in bold in Figure 6A), and this CSI is present in all of them, demonstrating its predictive ability. Figure 6B shows another example of a CSI identified in our previous work, specific for the genus Fructobacillus [67], which consists of fructose-fermenting microorganisms [26,68]. When this CSI was identified in 2022, sequence information was available for 5 Fructobacillus species. Since then, eight new species belonging to this genus have been described [27], and this CSI is present in all of them. These results provide compelling evidence for the long-term stability and predictive ability of the CSIs (TAXIs).
Based on the predictive capability of the TAXIs, we have recently developed a web-based tool, AppIndels.com, which uses the presence of known TAXIs in genome sequences to predict taxonomic affiliations [34]. To evaluate the utility of the TAXIs for Lactobacillaceae genera identified in this study, we added the sequence information for these CSIs to the AppIndels.com server, and then used this server to analyze 113 uncharacterized Lactobacillus isolates whose genome sequence information was available in the NCBI database. In Figure 7, we show the results obtained from the AppIndels.com server for two representative Lactobacillus isolates. The server indicates that the strain Lactobacillus sp. CBA3605 is related to the genus Lactiplantibacillus, and its genome contains eight CSIs specific for this genus (Figure 7A). In contrast, the server predicted that the genome of Lactobacillus sp. UW_DM_LACCAS1_1 is related to the genus Lacticaseibacillus, as it contained nine CSIs specific for this genus (Figure 7B). In addition to indicating the numbers of CSIs specific to the predicted taxon, the server also provides the sequence information for the matching CSIs. Due to space considerations, the results of these analyses for all other analyzed strains are presented in Table S5, which includes the accession numbers of the analyzed genomes, the predictions made by the AppIndels.com server regarding the taxonomic affiliations of these genomes, and the number of TAXIs matching a specific genus identified in these genomes. As seen from Figure 7 and Table S5, the server successfully predicted the genus-level affiliation for 112 out of 113 isolates based on the presence of multiple genus-specific TAXIs in their genome sequences.
These isolates were assigned to the following 11 genera: Agrilactobacillus (1), Apilactobacillus (1), Bombilactobacillus (7), Fructilactobacillus (1), Lacticaseibacillus (8), Lactiplantibacillus (2), Lactobacillus (81), Lentilactobacillus (1), Levilactobacillus (2), Limosilactobacillus (7), and Ligilactobacillus-Liquorilactobacillus cluster (1). One genome (accession number GCA_014796685.1) was not assigned to any genus, as it likely belongs to a taxon for which no TAXIs are currently available in the server’s database.
To validate the accuracy of these taxonomic predictions, we constructed a phylogenetic tree that included most of the uncharacterized isolates and representative species from various Lactobacillaceae genera (Figure 8). This tree showed 100% concordance between the genus assignments predicted by the AppIndels.com server and the phylogenetic placements of the isolates. These results demonstrate that the TAXIs for the Lactobacillaceae genera identified in this study provide a robust and practical tool for the taxonomic classification of novel or uncharacterized isolates from this family.

3.5. Taxon-specific CSIs are Localized in Surface Exposed Loops of Proteins

Earlier studies on CSIs specific for other prokaryotic taxa have shown that these genetic changes are frequently located in surface-exposed loop regions of proteins, which are flexible, unstructured areas often involved in mediating novel protein-protein or protein-ligand interactions [69,70,71,72,73,74]. In view of these findings, we investigated the locations, within protein structures, of the representative CSIs specific for Lactobacillaceae genera for which sequence information is presented in Figure 2, Figure 3 and Figure 4.
As described in the Methods section, AlphaFold was used to predict the structures of selected proteins containing conserved signature indels (CSIs) (Figure 2, Figure 3 and Figure 4), as well as homologous proteins lacking the CSIs. To determine the structural localization of each CSI, we superimposed the predicted structures of the protein with and without the CSI. Figure 9 presents the results for four representative CSIs. In each case, the figure shows the structural alignment of the two protein models, with the CSI regions highlighted in red. These overlays clearly illustrate the position of the CSIs within the protein structures, and they show that all four examined CSIs, viz. (i) a 2 aa insertion in the 50S ribosomal protein L10 specific for Lactobacillus (Figure 2B); (ii) a 1 aa insertion in the protein manganese-dependent inorganic pyrophosphatase specific for the genus Lacticaseibacillus (Figure 3B); (iii) a 2 aa insertion in the cyclopropane-fatty-acyl-phospholipid synthase family protein specific for Apilactobacillus (Figure 4B); and (iv) a 1 aa insertion in the protein pyridoxal phosphate-dependent aminotransferase specific for Lactiplantibacillus (Figure 4D), are localized in surface-exposed loop regions of the proteins (Figure 9A–D). These findings demonstrate that, like the CSIs examined in previous studies [69,73,75,76], the identified CSIs specific for the Lactobacillaceae genera are also structurally localized to surface-exposed loops of the proteins and may contribute to genus-specific functional traits by mediating novel biologically important interactions.

4. Discussion

The family Lactobacillaceae, comprising a diverse group of lactic acid bacteria, plays a central role in food production, probiotic development, and human health due to its metabolic versatility and beneficial properties [1,2,3,7]. This study presents comprehensive phylogenomic and comparative genomic analyses of 411 Lactobacillaceae genomes, leading to the identification of 171 novel molecular markers in the form of conserved signature indels (CSIs) or TAXIs in diverse proteins, each specific to various Lactobacillaceae genera. These findings represent a significant advancement in our understanding of this family. By integrating molecular markers with a robust phylogenomic framework, this work provides powerful tools for taxonomic delineation, the development of reliable diagnostic strategies, and insights into genus-specific functional traits, thereby enhancing both scientific understanding and practical applications of Lactobacillaceae species.
Prior to 2020, the genus Lactobacillus, which encompassed the majority of the family Lactobacillaceae, was taxonomically problematic due to its genetic heterogeneity and polyphyletic placement in phylogenetic trees [5,11,14,23,51,52]. The landmark reclassification by Zheng et al. [3] addressed this issue by dividing Lactobacillus into 23 new genera and merging the former Leuconostocaceae family [7] into Lactobacillaceae, resulting in a more coherent and phylogenetically consistent taxonomy [3,13]. However, the lack of reliable molecular or biochemical markers to distinguish among the newly defined genera has limited both the precision of taxonomic assignments and further advances in understanding the unique characteristics of species from different genera.
The 171 CSIs identified in this study, each uniquely shared by species within a specific genus, offer a molecular solution to this challenge. These TAXIs, which result from rare genetic changes in conserved protein regions, serve as molecular synapomorphies providing independent evidence of evolutionary relationships [18,20,33]. The phylogenomic tree constructed from 411 genomes, based on 87 conserved proteins, further supports the monophyly of these genera and reinforces the revised taxonomic framework proposed by Zheng et al. [3].
Unlike traditional metrics such as average amino acid identity (AAI) or 16S rRNA sequence similarity, which often lack resolution at the genus level [77,78,79], CSIs provide highly specific and predictive genus-level molecular signatures [64]. Their reliability is also demonstrated in the present work by their consistent presence in newly sequenced species from genera such as Fructobacillus and Weissella. This INDELIBLE (Indel-based Identification of Bacterial Lineages and Evolution) approach, which uses taxon-specific CSIs (TAXIs), has proven effective in clarifying the taxonomy of several other microbial groups, including Bacillus [22,80], Burkholderia [81,82], Enterobacterales [36], and Pseudomonas [83,84], and now establishes a robust molecular basis for Lactobacillaceae taxonomy. The predictive power of these TAXIs is further demonstrated by their successful application in classifying 112 out of 113 uncharacterized Lactobacillus isolates into 11 Lactobacillaceae genera using the AppIndels.com server [34]. This tool, developed to predict taxonomic affiliations based on the presence of known CSIs in genome sequences, enhances diagnostic efficiency, particularly in high-throughput genomic analyses. Given the rapid pace of discovery of Lactobacillaceae species, with over 150 new species described since 2020 [27], tools such as the AppIndels.com server are invaluable for the accurate and scalable classification of novel or uncultured isolates as genomic databases continue to expand [34].
The absence of reliable diagnostic markers also hinders the identification of Lactobacillaceae species in clinical, industrial, and environmental settings. The TAXIs identified in this study, located in conserved regions of diverse proteins, offer a novel and highly specific diagnostic toolkit. The conserved flanking regions of these CSIs can be used to design PCR primers or probes for qPCR and pyrosequencing, enabling selective amplification or detection of CSI-containing organisms. Similar TAXI-based diagnostic assays have been successfully developed for Bacillus anthracis [85] and Escherichia coli O157:H7 [86], and primers targeting TAXIs have also been used to detect species from Actinobacteria [87] and Chlamydiae [88].
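As a rough illustration of how candidate flanking regions might be screened before primer or probe design, the sketch below scores the conservation of the alignment columns on either side of an indel. It is an assumption-laden example and not a method used in this study: the function names `column_identity` and `flank_conservation`, the default window `width`, and the use of simple majority-residue identity as the conservation measure are all hypothetical choices, and the input is assumed to be a gapped protein alignment of equal-length strings.

```python
from typing import List, Tuple


def column_identity(column: str) -> float:
    """Fraction of sequences that share the most common character in a column."""
    counts = {}
    for residue in column:
        counts[residue] = counts.get(residue, 0) + 1
    return max(counts.values()) / len(column)


def flank_conservation(alignment: List[str], indel_start: int,
                       indel_end: int, width: int = 8) -> Tuple[float, float]:
    """Mean column identity of up to `width` columns on each side of an indel.

    The indel occupies columns indel_start..indel_end-1 (0-based, half-open).
    Returns (left_flank_score, right_flank_score), each between 0 and 1.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    left_lo = max(0, indel_start - width)
    right_hi = min(length, indel_end + width)
    if left_lo == indel_start or indel_end == right_hi:
        raise ValueError("indel must have at least one flanking column on each side")

    def mean_identity(lo: int, hi: int) -> float:
        # Build each column as a string and average the per-column identities.
        columns = ["".join(seq[i] for seq in alignment) for i in range(lo, hi)]
        return sum(column_identity(c) for c in columns) / len(columns)

    return mean_identity(left_lo, indel_start), mean_identity(indel_end, right_hi)
```

Flanks with scores close to 1 on both sides would be the more natural starting points for primer or probe design, although the primers themselves would have to be designed at the nucleotide level and validated experimentally.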
Beyond taxonomy and diagnostics, CSIs provide a gateway to exploring genus-specific functional traits. Located in conserved regions of proteins involved in essential cellular processes, these indels are likely to be functionally significant [18,72,89]. Prior studies have shown that such genetic changes often underlie unique biochemical or phenotypic characteristics [69,75,76,89,90,91,92]. Structural predictions using AlphaFold 3 revealed that all examined CSIs are located in surface-exposed loops, suggesting roles in host interactions, environmental adaptation, or substrate specificity. Such traits are likely to be critical to the ecological and industrial relevance of Lactobacillaceae species [13,68]. For example, Apilactobacillus species contain a surface-exposed CSI in the protein cyclopropane-fatty-acyl-phospholipid synthase (Figure 9C), an enzyme that is known to facilitate acid stress resistance in gastric bacteria [93,94] and is predicted to enhance bile salt resistance through lipid synthesis [95]. Given that Apilactobacillus species thrive in highly acidic environments (pH < 3) [3], further investigation of this CSI may provide insights into their acidophilic nature and probiotic potential [60,96]. Similarly, Lactiplantibacillus species are distinguished by a CSI in a surface loop of pyridoxal phosphate-dependent aminotransferase (Figure 9D), an enzyme involved in amino acid metabolism [97], making this CSI a promising marker for studying metabolic adaptations in this genus.
In summary, biochemical and functional studies of these genus-specific CSIs are likely to uncover novel traits unique to each Lactobacillaceae genus. Given the widespread industrial use of Lactobacillaceae species in food-related applications [24,25,26,98], the insights gained from such studies could have significant implications for biotechnology, health, and microbial ecology.

Supplementary Materials

The following are available online at Preprints.org: Table S1. Summary of CSIs specific for species from the Ligilactobacillus-Liquorilactobacillus cluster, Lapidilactobacillus, Amylolactobacillus, Bombilactobacillus, and Schleiferilactobacillus. Table S2. Summary of CSIs specific for species from the genera Agrilactobacillus, Latilactobacillus, Loigolactobacillus, and Furfurilactobacillus. Table S3. Summary of CSIs specific for species from the genera Paucilactobacillus, Limosilactobacillus, Fructilactobacillus, Lentilactobacillus, and Levilactobacillus. Table S4. Summary of CSIs specific for species from the genera Pediococcus, Companilactobacillus, Xylocopilactobacillus, and Dellaglioa. Table S5. Information on the genome sequences of 112 uncharacterized Lactobacillus isolates and their taxonomic affiliations predicted by the AppIndels.com server. Figure S1. An uncompressed version of the maximum likelihood tree for 411 genome-sequenced Lactobacillaceae species shown in Figure 1. Figure S2. Partial sequence alignment of the 50S ribosomal protein L10, showing a 2 aa insert specific for the genus Lactobacillus. Figure S3. Sequence alignment of the excinuclease ABC subunit UvrC, showing a 5-6 aa insert specific for the genus Lactobacillus. Figure S4. Sequence alignment of the ribonucleoside-triphosphate reductase, showing a 2 aa insert specific for the species/strains from the genus Lactobacillus. Figure S5. Sequence alignment of the DNA-binding protein WhiA, showing a 1 aa insert specific for the genus Lactobacillus. Figure S6. Sequence alignment of the translation initiation factor IF-2, showing a 3 aa insert specific for the genus Lactobacillus. Figure S7. Sequence alignment of the 50S ribosomal protein L4, showing a 2 aa deletion specific for the genus Lactobacillus. Figure S8. Sequence alignment of the TIGR01457 family HAD-type hydrolase, showing a 1 aa insert specific for the genus Lactobacillus. Figure S9.
Sequence alignment of the C69 family dipeptidase, showing a 1 aa deletion specific for the genus Lactobacillus. Figure S10. Sequence alignment of the YfbR-like 5’-deoxynucleotidase, showing a 1 aa insert specific for the genus Lactobacillus. Figure S11. Sequence alignment of the class I SAM-dependent methyltransferase, showing a 1 aa deletion specific for the genus Lactobacillus. Figure S12. Sequence alignment of the Phosphate acyltransferase PlsX, showing a 1 aa deletion specific for the genus Lactobacillus. Figure S13. Sequence alignment of the DNA helicase PcrA, showing a 2 aa insert shared by most species from the genus Lactobacillus. Figure S14. Sequence alignment of the NADP-dependent phosphogluconate dehydrogenase, showing a 1 aa deletion specific for the genus Lactobacillus. Figure S15. Sequence alignment of the calcium-translocating P-type ATPase, showing a 1 aa deletion specific for the genus Lactobacillus. Figure S16. Sequence alignment of the ATP-binding protein, showing a 1 aa insert specific for the genus Lactobacillus. Figure S17. Sequence alignment of the 16S rRNA (cytosine(1402)-N(4))-methyltransferase RsmH, showing a 1 aa insert specific for the genus Lactobacillus. Figure S18. Sequence alignment of the manganese-dependent inorganic pyrophosphatase, showing a 1 aa insert specific for the genus Lacticaseibacillus. Figure S19. Sequence alignment of the hemolysin family protein, showing a 1 aa insert specific for the genus Lacticaseibacillus. Figure S20. Sequence alignment of the 1-acyl-sn-glycerol-3-phosphate acyltransferase, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S21. Sequence alignment of the DUF1002 domain-containing protein, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S22. Sequence alignment of the DeoR/GlpR family DNA-binding transcription regulator, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S23. 
Sequence alignment of the DNA polymerase IV, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S24. Sequence alignment of the DNA polymerase IV, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S25. Sequence alignment of the YfcE family phosphodiesterase, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S26. Sequence alignment of the methionine adenosyltransferase, showing a 1 aa deletion specific for the genus Lacticaseibacillus. Figure S27. Sequence alignment of the cyclopropane-fatty-acyl-phospholipid synthase family protein, showing a 2 aa insert specific for the genus Apilactobacillus. Figure S28. Sequence alignment of the DEAD/DEAH box helicase, showing a 1 aa indel specific for the genus Apilactobacillus. Figure S29. Sequence alignment of the phosphate acetyltransferase, showing a 1 aa deletion specific for the genus Apilactobacillus. Figure S30. Sequence alignment of the glucose-6-phosphate dehydrogenase, showing a 1 aa insert specific for the genus Apilactobacillus. Figure S31. Sequence alignment of the pyridoxal phosphate-dependent aminotransferase protein, showing a 1 aa insert specific for the genus Lactiplantibacillus. Figure S32. Sequence alignment of the ABC transporter ATPase protein, showing a 1 aa deletion specific for the genus Lactiplantibacillus. Figure S33. Sequence alignment of the acetyl-CoA carboxylase protein, showing a 1 aa insert specific for the genus Lactiplantibacillus. Figure S34. Sequence alignment of the 50S ribosomal protein L15, showing a 1 aa deletion specific for the genus Lactiplantibacillus. Figure S35. Sequence alignment of the C69 family dipeptidase protein, showing a 2 aa deletion specific for the genus Lactiplantibacillus. Figure S36. Sequence alignment of the GRP family sugar transporter protein, showing a 1 aa deletion specific for the genus Lactiplantibacillus. Figure S37.
Sequence alignment of the glycoside hydrolase family 13 protein, showing a 1 aa deletion specific for the genus Lactiplantibacillus. Figure S38. Sequence alignment of the undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase protein, showing a 1 aa indel specific for the genus Lactiplantibacillus. Figure S39. Sequence alignment of the PolC-type DNA polymerase III protein, showing a 1 aa insert specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S40. Sequence alignment of the heat-inducible transcriptional repressor HrcA protein, showing a 1 aa deletion specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S41. Sequence alignment of the transcription termination/antitermination protein NusG, showing a 1 aa insert specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S42. Sequence alignment of the UDP-N-acetylmuramoyl-L-alanine-D-glutamate ligase protein, showing a 1 aa deletion specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S43. Sequence alignment of the dihydroorotate dehydrogenase protein, showing a 1 aa insert specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S44. Sequence alignment of the tRNA dihydrouridine synthase B protein, showing a 1 aa insert specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S45. Sequence alignment of the DNA repair protein RecN, showing a 1 aa deletion specific for the genera Ligilactobacillus and Liquorilactobacillus. Figure S46. Sequence alignment of the PolC-type DNA polymerase III protein, showing a 4 aa insert specific for the genus Lapidilactobacillus. Figure S47. Sequence alignment of the type 2 isopentenyl-diphosphate Delta-isomerase protein, showing a 2 aa insert specific for the genus Lapidilactobacillus. Figure S48. Sequence alignment of the polysaccharide biosynthesis protein, showing a 2 aa insert specific for the genus Lapidilactobacillus. Figure S49.
Sequence alignment of the hydroxymethylglutaryl-CoA synthase protein, showing a 1 aa insert specific for the genus Lapidilactobacillus. Figure S50. Sequence alignment of the protein trigger factor, showing a 2 aa insert specific for the genus Amylolactobacillus. Figure S51. Sequence alignment of the protein xanthine phosphoribosyltransferase, showing a 1 aa insert specific for the genus Amylolactobacillus. Figure S52. Sequence alignment of the protein 16S rRNA (cytidine(1402)-2’-O)-methyltransferase, showing a 2 aa insert specific for the genus Amylolactobacillus. Figure S53. Sequence alignment of the tRNA uracil 4-sulfurtransferase ThiI protein, showing a 1 aa insert specific for the genus Amylolactobacillus, although two Lacticaseibacillus species also share this CSI. Figure S54. Sequence alignment of the protein SMC-Scp complex subunit ScpB, showing a 1 aa insert specific for the genus Bombilactobacillus. Figure S55. Sequence alignment of the protein response regulator transcription factor, showing a 1 aa deletion specific for the genus Bombilactobacillus, except for B. mellifer. Figure S56. Sequence alignment of the protein translation initiation factor IF-3, showing a 1 aa deletion specific for the genus Bombilactobacillus. Figure S57. Sequence alignment of the protein YqeG family HAD IIIA-type phosphatase, showing a 1 aa insert specific for the genus Bombilactobacillus. Figure S58. Sequence alignment of the protein response regulator transcription factor, showing a 1 aa insert specific for the genus Bombilactobacillus. Figure S59. Sequence alignment of the protein arginine-tRNA ligase, showing a 1 aa deletion specific for the genus Bombilactobacillus. Figure S60. Sequence alignment of the protein DNA polymerase III subunit alpha, showing a 1 aa insert specific for the genus Bombilactobacillus. Figure S61. Sequence alignment of the protein excinuclease ABC subunit UvrC, showing a 5 aa insert specific for the genus Schleiferilactobacillus. Figure S62.
Sequence alignment of the 50S ribosomal protein L15, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S63. Sequence alignment of the response regulator transcription factor protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S64. Sequence alignment of the HD domain-containing protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S65. Sequence alignment of the metallophosphoesterase protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S66. Sequence alignment of the 1,4-dihydroxy-2-naphthoate polyprenyltransferase protein, showing a 2 aa insert specific for the genus Schleiferilactobacillus. Figure S67. Sequence alignment of the DNA polymerase III subunit gamma/tau protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S68. Sequence alignment of the UDP-N-acetylmuramoyl-L-alanine-D-glutamate ligase protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S69. Sequence alignment of the ABC-F family ATP-binding cassette domain-containing protein, showing a 1 aa deletion specific for the genus Schleiferilactobacillus. Figure S70. Sequence alignment of the glutamine-hydrolyzing GMP synthase protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S71. Sequence alignment of the phosphopentomutase protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S72. Sequence alignment of the citrate lyase acyl carrier protein, showing a 1–2 aa insert specific for the genus Schleiferilactobacillus. Figure S73. Sequence alignment of the orotidine-5’-phosphate decarboxylase protein, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S74. Sequence alignment of the protein molecular chaperone DnaK, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S75.
Sequence alignment of the protein oligoendopeptidase F, showing a 1 aa insert specific for the genus Schleiferilactobacillus. Figure S76. Sequence alignment of the helix-turn-helix domain-containing protein, showing a 1 aa insert specific for the genus Agrilactobacillus. Figure S77. Sequence alignment of the protein heat-inducible transcriptional repressor HrcA, showing a 1 aa deletion specific for the genus Agrilactobacillus. Figure S78. Sequence alignment of the MBL fold metallo-hydrolase protein, showing a 1 aa insert specific for the genus Agrilactobacillus. Figure S79. Sequence alignment of the RluA family pseudouridine synthase protein, showing a 2 aa insert specific for the genus Agrilactobacillus. Figure S80. Sequence alignment of the protein MDR family MFS transporter, showing a 2 aa insert specific for the genus Latilactobacillus. Figure S81. Sequence alignment of the protein tRNA (adenine(22)-N(1))-methyltransferase TrmK, showing a 1 aa insert specific for the genus Latilactobacillus. Figure S82. Sequence alignment of the alanine racemase protein, showing a 1 aa insert specific for the genus Latilactobacillus. Figure S83. Sequence alignment of the competence protein ComEA, showing a 1 aa insert specific for the genus Latilactobacillus. Figure S84. Sequence alignment of the protein RNA polymerase recycling motor HelD, showing a 2 aa deletion specific for the genus Latilactobacillus. Figure S85. Sequence alignment of the two-component system regulatory protein YycI, showing a 1 aa insert specific for the genus Latilactobacillus. Figure S86. Sequence alignment of the nucleobase:cation symporter-2 family protein, showing a 1 aa insert specific for the genus Latilactobacillus. Figure S87. Sequence alignment of the protein calcium ABC transporter ATPase, showing a 3 aa deletion specific for the genus Latilactobacillus. Figure S88. Sequence alignment of the protein phosphoglycerate dehydrogenase, showing a 2 aa insert specific for the genus Loigolactobacillus.
Figure S89. Sequence alignment of the protein cation-translocating P-type ATPase, showing a 3 aa insert specific for the genus Loigolactobacillus. Figure S90. Sequence alignment of the preprotein translocase subunit SecA, showing a 1 aa insert specific for the genus Loigolactobacillus, except for L. bifermentans, L. backii, and L. iwatensis. Figure S91. Sequence alignment of the pyruvate carboxylase protein, showing a 2 aa insert specific for the genus Loigolactobacillus. Figure S92. Sequence alignment of the protein amino acid permease, showing a 1 aa insert specific for the genus Loigolactobacillus. Figure S93. Sequence alignment of the protein SAM-dependent methyltransferase, showing a 1 aa deletion specific for the genus Loigolactobacillus. Figure S94. Sequence alignment of the protein GTPase ObgE, showing a 6 aa insert specific for the genus Furfurilactobacillus. Figure S95. Sequence alignment of the protein DHH family phosphoesterase, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S96. Sequence alignment of the protein phosphate acyltransferase PlsX, showing a 1 aa deletion specific for the genus Furfurilactobacillus. Figure S97. Sequence alignment of the protein CtsR family transcriptional regulator, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S98. Sequence alignment of the energy-coupling factor ABC transporter ATP-binding protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S99. Sequence alignment of the phosphoglycerate kinase protein, showing a 1 aa deletion specific for the genus Furfurilactobacillus. Figure S100. Sequence alignment of the Nramp family divalent metal transporter protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S101. Sequence alignment of the polysaccharide biosynthesis protein, showing a 3 aa insert specific for the genus Furfurilactobacillus. Figure S102.
Sequence alignment of the peptidylprolyl isomerase protein, showing a 3 aa insert specific for the genus Furfurilactobacillus. Figure S103. Sequence alignment of the protein ribonuclease J, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S104. Sequence alignment of the protein DNA-formamidopyrimidine glycosylase, showing a 1 aa deletion specific for the genus Furfurilactobacillus. Figure S105. Sequence alignment of the protein class I mannose-6-phosphate isomerase, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S106. Sequence alignment of the mechanosensitive ion channel family protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S107. Sequence alignment of the dihydrolipoyl dehydrogenase protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S108. Sequence alignment of the glutamate-tRNA ligase protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S109. Sequence alignment of the polysaccharide biosynthesis protein, showing a 3 aa insert specific for the genus Furfurilactobacillus. Figure S110. Sequence alignment of the LTA synthase family protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S111. Sequence alignment of the PolC-type DNA polymerase III protein, showing a 2 aa insert specific for the genus Furfurilactobacillus. Figure S112. Sequence alignment of the peptidase M13 protein, showing a 1 aa insert specific for the genus Furfurilactobacillus. Figure S113. Sequence alignment of the iron-containing alcohol dehydrogenase protein, showing a 1 aa deletion specific for the genus Paucilactobacillus. Figure S114. Sequence alignment of the protein phosphogluconate dehydrogenase (NAD(+)-dependent, decarboxylating), showing a 2 aa insert specific for the genus Paucilactobacillus. Figure S115.
Sequence alignment of the ROK family glucokinase protein, showing a 1 aa insert specific for the genus Paucilactobacillus, except for P. oligofermentans and P. nenjiangensis. Figure S116. Sequence alignment of the protein UTP-glucose-1-phosphate uridylyltransferase GalU, showing a 3 aa insert specific for the genus Limosilactobacillus. Figure S117. Sequence alignment of the proline-tRNA ligase, showing a 1 aa deletion specific for the genus Limosilactobacillus. Figure S118. Sequence alignment of the S1-like domain-containing RNA-binding protein, showing a 1 aa insert specific for the genus Limosilactobacillus. Figure S119. Sequence alignment of the (d)CMP kinase protein, showing a 1 aa insert specific for the genus Limosilactobacillus. Figure S120. Sequence alignment of the ammonia-dependent NAD(+) synthetase protein, showing a 1 aa insert specific for the genus Limosilactobacillus. Figure S121. Sequence alignment of the tRNA uracil-4-sulfurtransferase ThiI protein, showing a 1 aa indel specific for the genus Limosilactobacillus, except for Limosilactobacillus difficile. Figure S122. Sequence alignment of the redox-regulated ATPase YchF protein, showing a 1 aa deletion specific for the genus Limosilactobacillus, except for [Lactobacillus] timonensis. Figure S123. Sequence alignment of the class I SAM-dependent RNA methyltransferase protein, showing a 1 aa insert specific for the genus Limosilactobacillus. Figure S124. Sequence alignment of the DNA repair protein RecN, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S125. Sequence alignment of the undecaprenyldiphospho-muramoylpentapeptide beta-N- protein, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S126. Sequence alignment of the ribulose-phosphate 3-epimerase protein, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S127.
Sequence alignment of the nucleoside hydrolase protein, showing a 1 aa deletion specific for the genus Fructilactobacillus. Figure S128. Sequence alignment of the zinc-dependent alcohol dehydrogenase family protein, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S129. Sequence alignment of the PBP1A family penicillin-binding protein, showing a 1 aa deletion specific for the genus Fructilactobacillus. Figure S130. Sequence alignment of the ribonuclease J protein, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S131. Sequence alignment of the DNA topoisomerase (ATP-hydrolyzing) subunit B protein, showing a 1 aa insert specific for the genus Fructilactobacillus. Figure S132. Sequence alignment of the AI-2E family transporter protein, showing a 1 aa insert specific for the genus Lentilactobacillus. Figure S133. Sequence alignment of the endonuclease MutS2 protein, showing a 1 aa insert specific for the genus Lentilactobacillus. Figure S134. Sequence alignment of the heat-inducible transcriptional repressor, showing a 1 aa deletion specific for the genus Lentilactobacillus. Figure S135. Sequence alignment of the ATP-dependent DNA helicase RecG protein, showing a 1 aa insert that is commonly and exclusively shared by all members of the genus Levilactobacillus. Figure S136. Sequence alignment of the pyridoxal phosphate-dependent aminotransferase protein, showing a 1 aa insert specific for the genus Levilactobacillus. Figure S137. Sequence alignment of the iron-sulfur cluster biosynthesis protein, showing a 1 aa deletion specific for the genus Levilactobacillus. Notably, the homolog for Levilactobacillus bambusae is missing. Figure S138. Sequence alignment of the EamA family transporter protein, showing a 1 aa insert specific for the genus Levilactobacillus. Figure S139.
Sequence alignment of the 50S ribosomal protein L15, showing a 1 aa insert that is commonly shared by all members of the genus Secundilactobacillus. Figure S140. Sequence alignment of the LacI family DNA-binding transcriptional regulator protein, showing a 1 aa deletion specific for the genus Secundilactobacillus. Figure S141. Sequence alignment of the trypsin-like peptidase domain-containing protein, showing a 1 aa insert specific for the genus Secundilactobacillus. Figure S142. Sequence alignment of the BCCT family transporter protein, showing a 1 aa deletion specific for the genus Secundilactobacillus. Figure S143. Sequence alignment of the 6-phosphofructokinase protein, showing a 3 aa indel specific for the genus Pediococcus. Figure S144. Sequence alignment of the glutamine-fructose-6-phosphate transaminase protein, showing a 1 aa insert specific for the genus Pediococcus. Figure S145. Sequence alignment of the ATP-dependent chaperone ClpB protein, showing a 2 aa insert specific for the genus Pediococcus. Figure S146. Sequence alignment of the endolytic transglycosylase MltG protein, showing a 1 aa insert specific for the genus Pediococcus. Figure S147. Sequence alignment of the PBP1A family penicillin-binding protein, showing a 2 aa insert specific for the genus Pediococcus. Figure S148. Sequence alignment of the cell division protein FtsA, showing a 1 aa deletion specific for the genus Pediococcus. Figure S149. Sequence alignment of the histidine phosphatase family protein, showing a 3-4 aa insert specific for the genus Pediococcus. Figure S150. Sequence alignment of the proline-specific peptidase family protein, showing a 1 aa insert specific for the genus Pediococcus. Figure S151. Sequence alignment of the cyclopropane-fatty-acyl-phospholipid synthase family protein, showing a 2 aa insert specific for the genus Pediococcus. Figure S152. Sequence alignment of the aminopeptidase C protein, showing a 1 aa insert specific for the genus Pediococcus. Figure S153.
Sequence alignment of the Stk1 family PASTA domain-containing Ser/Thr kinase protein, showing a 3 aa insert specific for the genus Companilactobacillus. Figure S154. Sequence alignment of the type I glyceraldehyde-3-phosphate dehydrogenase protein, showing a 1 aa insert that is shared by all species from the genus Companilactobacillus. Figure S155. Sequence alignment of the protein DNA polymerase III subunit beta, showing a 1 aa insert specific for the genus Companilactobacillus. Figure S156. Sequence alignment of the protein IMP dehydrogenase, showing a 1 aa deletion specific for the genus Companilactobacillus. Figure S157. Sequence alignment of the protein cysteine-tRNA ligase, showing a 1 aa deletion specific for the genus Companilactobacillus. Figure S158. Sequence alignment of the protein L-threonylcarbamoyladenylate synthase, showing a 1 aa deletion specific for the genus Companilactobacillus. Figure S159. Sequence alignment of the protein ribonuclease J, showing a 1 aa insert specific for the genus Companilactobacillus. Figure S160. Sequence alignment of the ABC-F family ATP-binding cassette domain-containing protein, showing a 3 aa deletion specific for the genus Companilactobacillus. Figure S161. Sequence alignment of the protein DNA-directed RNA polymerase subunit beta, showing a 1 aa insert specific for the genus Companilactobacillus. Figure S162. Sequence alignment of the protein DNA polymerase III subunit beta, showing a 1 aa insert specific for the genus Companilactobacillus. Figure S163. Sequence alignment of the DNA polymerase III subunit alpha, showing a 1 aa insert specific for the genus Xylocopilactobacillus. Figure S164. Sequence alignment of the excinuclease ABC subunit UvrC, showing a 1 aa insert specific for the genus Xylocopilactobacillus. Figure S165. Sequence alignment of the ribosome biogenesis GTP-binding protein YihA/YsxC, showing a 1 aa insert specific for the genus Xylocopilactobacillus. Figure S166.
Sequence alignment of the tRNA uracil 4-sulfurtransferase ThiI protein, showing a 2 aa insert specific for the genus Xylocopilactobacillus. Figure S167. Sequence alignment of the DEAD/DEAH box helicase protein, showing a 1 aa insert specific for the genus Dellaglioa. Figure S168. Sequence alignment of the amino acid ABC transporter substrate-binding protein/permease, showing a 1 aa insert specific for the genus Dellaglioa. Figure S169. Sequence alignment of the FtsW/RodA/SpoVE family cell cycle protein, showing a 1 aa insert specific for the genus Dellaglioa. Figure S170. Sequence alignment of the transglycosylase domain-containing protein, showing a 1 aa indel specific for the genus Dellaglioa. Figure S171. Sequence alignment of the RNA polymerase sigma factor RpoD protein, showing a 3 aa insert specific for the genus Dellaglioa. Figure S172. Sequence alignment of the amino acid ABC transporter permease protein, showing a 2-5 aa insert specific for the genus Dellaglioa.

Author Contributions

SB and RSG carried out the analyses using the AppIndels server; SB constructed the phylogenetic trees, updated the sequence information for the CSIs, and checked and formatted the figures and tables; RSG planned and supervised the work and obtained funding for the project; RSG and SB wrote and finalized the manuscript.

Acknowledgments

This work was supported by a research grant (RGPIN-2019-06397) from the Natural Sciences and Engineering Research Council of Canada awarded to Radhey S. Gupta.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Skerman, V.B.D.; McGowan, V.; Sneath, P.H.A. Approved lists of bacterial names. Int J Syst Bacteriol 1980, 30, 225–420.
  2. Winslow, C.-E.A.; Broadhurst, J.; Buchanan, R.E.; Krumwiede, C.; Rogers, L.A.; Smith, G.H. The Families and Genera of the Bacteria: Preliminary Report of the Committee of the Society of American Bacteriologists on Characterization and Classification of Bacterial Types. J. Bacteriol. 1917, 2, 505–566.
  3. Zheng, J.; Wittouck, S.; Salvetti, E.; Franz, C.M.A.P.; Harris, H.M.B.; Mattarelli, P.; O’Toole, P.W.; Pot, B.; Vandamme, P.; Walter, J.; et al. A taxonomic note on the genus Lactobacillus: Description of 23 novel genera, emended description of the genus Lactobacillus Beijerinck 1901, and union of Lactobacillaceae and Leuconostocaceae. Int J Syst Evol Microbiol 2020, 70, 2782–2858.
  4. Gänzle, M.G. Lactic metabolism revisited: metabolism of lactic acid bacteria in food fermentations and food spoilage. Curr. Opin. Food Sci. 2015, 2, 106–117.
  5. Salvetti, E.; Harris, H.M.B.; Felis, G.E.; O’Toole, P.W. Comparative Genomics of the Genus Lactobacillus Reveals Robust Phylogroups That Provide the Basis for Reclassification. Appl. Environ. Microbiol. 2018, 84, e00993-18.
  6. Schleifer, K.H. Family V. Leuconostocaceae fam. nov. Bergey’s Manual of Systematic Bacteriology (The Firmicutes), 2nd edn, 2009, 3, 624.
  7. Nieminen, T.T.; Säde, E.; Endo, A.; Johansson, P.; Björkroth, J. The family Leuconostocaceae. In The Prokaryotes: Firmicutes and Tenericutes; Springer-Verlag: 2014; pp. 215–240.
  8. Danza, A.; Lucera, A.; Lavermicocca, P.; Lonigro, S.L.; Bavaro, A.R.; Mentana, A.; Centonze, D.; Conte, A.; Del Nobile, M.A. Tuna Burgers Preserved by the Selected Lactobacillus paracasei IMPC 4.1 Strain. Food Bioproc Tech 2018, 11, 1651–1661. [Google Scholar] [CrossRef]
  9. Chen, Y.; Yu, L.; Qiao, N.; Xiao, Y.; Tian, F.; Zhao, J.; Zhang, H.; Chen, W.; Zhai, Q. Latilactobacillus curvatus: A Candidate Probiotic with Excellent Fermentation Properties and Health Benefits. Foods 2020, 9. [Google Scholar] [CrossRef]
  10. Liang, J.R.; Deng, H.; Hu, C.Y.; Zhao, P.T.; Meng, Y.H. Vitality, fermentation, aroma profile, and digestive tolerance of the newly selected Lactiplantibacillus plantarum and Lacticaseibacillus paracasei in fermented apple juice. Front. Nutr. 2022, 9. [Google Scholar] [CrossRef]
  11. Dicks, L.; Endo, A. The Family Lactobacillaceae: Genera Other than Lactobacillus. In The Prokaryotes: Firmicutes and Tenericutes, Rosenberg, E., DeLong, E.F., Lory, S., Stackebrandt, E., Thompson, F., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2014; pp. 203–212. [Google Scholar]
  12. Fusco, V.; Chieffi, D.; Fanelli, F.; Montemurro, M.; Rizzello, C.G.; Franz, C.M. The Weissella and Periweissella genera: Up-to-date taxonomy, ecology, safety, biotechnological, and probiotic potential. Frontiers in Microbiology 2023, 14, 1289937. [Google Scholar] [CrossRef]
  13. Qiao, N.; Wittouck, S.; Mattarelli, P.; Zheng, J.; Lebeer, S.; Felis, G.E.; Gänzle, M.G. After the storm-Perspectives on the taxonomy of Lactobacillaceae. JDS Commun 2022, 3, 222–227. [Google Scholar] [CrossRef]
  14. Wittouck, S.; Wuyts, S.; Meehan, C.J.; van Noort, V.; Lebeer, S. A Genome-Based Species Taxonomy of the Lactobacillus Genus Complex. mSystems 2019, 4, e00264-19. [Google Scholar] [CrossRef]
  15. Duar, R.M.; Lin, X.B.; Zheng, J.; Martino, M.E.; Grenier, T.; Pérez-Muñoz, M.E.; Leulier, F.; Gänzle, M.; Walter, J. Lifestyles in transition: evolution and natural history of the genus Lactobacillus. FEMS Microbiol Rev 2017, 41, S27–S48. [Google Scholar] [CrossRef]
  16. Bello, S.; Rudra, B.; Gupta, R.S. Phylogenomic and comparative genomic analyses of Leuconostocaceae species: identification of molecular signatures specific for the genera Leuconostoc, Fructobacillus and Oenococcus and proposal for a novel genus Periweissella gen. nov. Int J Syst Evol Microbiol 2022, 72. [Google Scholar] [CrossRef]
  17. Gupta, R.S. Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol Mol Biol Rev 1998, 62, 1435–1491. [Google Scholar] [CrossRef]
  18. Gupta, R.S. Impact of genomics on the understanding of microbial evolution and classification: the importance of Darwin’s views on classification. FEMS Microbiol Rev 2016, 40, 520–553. [Google Scholar] [CrossRef]
  19. Rokas, A.; Holland, P.W.H. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 2000, 15, 454–459. [Google Scholar] [CrossRef]
  20. Bhandari, V.; Naushad, H.S.; Gupta, R.S. Protein based molecular markers provide reliable means to understand prokaryotic phylogeny and support Darwinian mode of evolution. Front Cell Infect Microbiol 2012, 2, 98. [Google Scholar] [CrossRef]
  21. Naushad, H.S.; Lee, B.; Gupta, R.S. Conserved signature indels and signature proteins as novel tools for understanding microbial phylogeny and systematics: identification of molecular signatures that are specific for the phytopathogenic genera Dickeya, Pectobacterium and Brenneria. Int J Syst Evol Microbiol 2014, 64, 366–383. [Google Scholar] [CrossRef]
  22. Gupta, R.S.; Patel, S.; Saini, N.; Chen, S. Robust demarcation of 17 distinct Bacillus species clades, proposed as novel Bacillaceae genera, by phylogenomics and comparative genomic analyses: description of Robertmurraya kyonggiensis sp. nov. and proposal for an emended genus Bacillus limiting it only to the members of the Subtilis and Cereus clades of species. Int J Syst Evol Microbiol 2020, 70, 5753–5798. [Google Scholar] [CrossRef]
  23. Sun, Z.; Harris, H.M.; McCann, A.; Guo, C.; Argimón, S.; Zhang, W.; Yang, X.; Jeffery, I.B.; Cooney, J.C.; Kagawa, T.F.; et al. Expanding the biotechnology potential of lactobacilli through comparative genomics of 213 strains and associated genera. Nat Commun 2015, 6, 8322. [Google Scholar] [CrossRef]
  24. Giraffa, G.; Chanishvili, N.; Widyastuti, Y. Importance of lactobacilli in food and feed biotechnology. Research in Microbiology 2010, 161, 480–487. [Google Scholar] [CrossRef]
  25. Zhang, Z.; Lv, J.; Pan, L.; Zhang, Y. Roles and applications of probiotic Lactobacillus strains. Applied Microbiology and Biotechnology 2018, 102, 8135–8143. [Google Scholar] [CrossRef]
  26. Endo, A.; Maeno, S.; Tanizawa, Y.; Kneifel, W.; Arita, M.; Dicks, L.; Salminen, S. Fructophilic Lactic Acid Bacteria, a Unique Group of Fructose-Fermenting Microbes. Appl Environ Microbiol 2018, 84. [Google Scholar] [CrossRef]
  27. Parte, A.C.; Sarda Carbasse, J.; Meier-Kolthoff, J.P.; Reimer, L.C.; Goker, M. List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. Int J Syst Evol Microbiol 2020, 70, 5607–5612. [Google Scholar] [CrossRef]
  28. Sayers, E.W.; Agarwala, R.; Bolton, E.E.; Brister, J.R.; Canese, K.; Clark, K.; Connor, R.; Fiorini, N.; Funk, K.; Hefferon, T. Database resources of the national center for biotechnology information. Nucleic Acids Res 2019, 47, D23. [Google Scholar] [CrossRef]
  29. Parte, A.C. LPSN–List of Prokaryotic names with Standing in Nomenclature (bacterio.net), 20 years on. Int J Syst Evol Microbiol 2018, 68, 1825–1829. [Google Scholar] [CrossRef]
  30. Wu, L.; McCluskey, K.; Desmeth, P.; Liu, S.; Hideaki, S.; Yin, Y.; Moriya, O.; Itoh, T.; Kim, C.Y.; Lee, J.S.; et al. The global catalogue of microorganisms 10K type strain sequencing project: closing the genomic gaps for the validly published prokaryotic and fungi species. Gigascience 2018, 7. [Google Scholar] [CrossRef]
  31. Mukherjee, S.; Seshadri, R.; Varghese, N.J.; Eloe-Fadrosh, E.A.; Meier-Kolthoff, J.P.; Göker, M.; Coates, R.C.; Hadjithomas, M.; Pavlopoulos, G.A.; Paez-Espino, D.; et al. 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. Nat Biotechnol 2017, 35, 676–683. [Google Scholar] [CrossRef]
  32. Whitman, W.B. Genome sequences as the type material for taxonomic descriptions of prokaryotes. Syst Appl Microbiol 2015, 38, 217–222. [Google Scholar] [CrossRef]
  33. Gupta, R.S. Identification of Conserved Indels that are Useful for Classification and Evolutionary Studies. In Bacterial Taxonomy, Methods in Microbiology, Goodfellow, M., Sutcliffe, I., Chun, J., Eds.; Elsevier: 2014; Volume 41, pp. 153–182.
  34. Gupta, R.S.; Kanter-Eivin, D. AppIndels.com Server: A Web Based Tool for the Identification of Known Taxon-Specific Conserved Signature Indels in Genome Sequences: Validation of Its Usefulness by Predicting the Taxonomic Affiliation of >700 Unclassified strains of Bacillus Species. Int J Syst Evol Microbiol 2023, 73. [Google Scholar]
  35. Wang, Z.; Wu, M. A phylum-level bacterial phylogenetic marker database. Mol Biol Evol 2013, 30, 1258–1262. [Google Scholar] [CrossRef]
  36. Adeolu, M.; Alnajar, S.; Naushad, S.; Gupta, R.S. Genome-based phylogeny and taxonomy of the ‘Enterobacteriales’: proposal for Enterobacterales ord. nov. divided into the families Enterobacteriaceae, Erwiniaceae fam. nov., Pectobacteriaceae fam. nov., Yersiniaceae fam. nov., Hafniaceae fam. nov., Morganellaceae fam. nov., and Budviciaceae fam. nov. Int J Syst Evol Microbiol 2016, 66, 5575–5599. [Google Scholar] [CrossRef]
  37. Eddy, S.R. Profile hidden Markov models. Bioinformatics 1998, 14, 755–763. [Google Scholar] [CrossRef]
  38. Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T.J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J.; et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 2011, 7, 539. [Google Scholar] [CrossRef]
  39. Capella-Gutiérrez, S.; Silla-Martínez, J.M.; Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 2009, 25, 1972–1973. [Google Scholar] [CrossRef]
  40. Whelan, S.; Goldman, N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 2001, 18, 691–699. [Google Scholar] [CrossRef]
  41. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol Biol Evol 2018, 35, 1547–1549. [Google Scholar] [CrossRef]
  42. Jeanmougin, F.; Thompson, J.D.; Gouy, M.; Higgins, D.G.; Gibson, T.J. Multiple sequence alignment with Clustal X. Trends Biochem Sci 1998, 23, 403–405. [Google Scholar] [CrossRef]
  43. Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A.J.; Bambrick, J.; et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 2024, 630, 493–500. [Google Scholar] [CrossRef]
  44. Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 1.8; 2015.
  45. Mariani, V.; Biasini, M.; Barbato, A.; Schwede, T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 2013, 29, 2722–2728. [Google Scholar] [CrossRef]
  46. Zhang, Y.; Skolnick, J. Scoring function for automated assessment of protein structure template quality. Proteins: Struct., Funct., Bioinf. 2004, 57, 702–710. [Google Scholar] [CrossRef]
  47. Guo, H.B.; Perminov, A.; Bekele, S.; Kedziora, G.; Farajollahi, S.; Varaljay, V.; Hinkle, K.; Molinero, V.; Meister, K.; Hung, C.; et al. AlphaFold2 models indicate that protein sequence determines both structure and dynamics. Sci Rep 2022, 12, 10696. [Google Scholar] [CrossRef]
  48. Dobritsa, A.P.; Linardopoulou, E.V.; Samadpour, M. Transfer of 13 species of the genus Burkholderia to the genus Caballeronia and reclassification of Burkholderia jirisanensis as Paraburkholderia jirisanensis comb. nov. Int J Syst Evol Microbiol 2017, 67, 3846–3853. [Google Scholar] [CrossRef]
  49. Montecillo, J.A.V.; Bae, H. Reclassification of Brevibacterium frigoritolerans as Peribacillus frigoritolerans comb. nov. based on phylogenomics and multiple molecular synapomorphies. Int J Syst Evol Microbiol 2022, 72. [Google Scholar] [CrossRef]
  50. Jiang, L.; Wang, D.; Kim, J.-S.; Lee, J.H.; Kim, D.-H.; Kim, S.W.; Lee, J. Reclassification of genus Izhakiella into the family Erwiniaceae based on phylogenetic and genomic analyses. Int J Syst Evol Microbiol 2021, 70, 3541–3546. [Google Scholar] [CrossRef]
  51. Zheng, J.; Ruan, L.; Sun, M.; Gänzle, M. A Genomic View of Lactobacilli and Pediococci Demonstrates that Phylogeny Matches Ecology and Physiology. Appl Environ Microbiol 2015, 81, 7233–7243. [Google Scholar] [CrossRef]
  52. Collins, M.D.; Rodrigues, U.; Ash, C.; Aguirre, M.; Farrow, J.A.E.; Martinez-Murcia, A.; Phillips, B.A.; Williams, A.M.; Wallbanks, S. Phylogenetic analysis of the genus Lactobacillus and related lactic acid bacteria as determined by reverse transcriptase sequencing of 16S rRNA. FEMS Microbiol Lett 1991, 77, 5–12. [Google Scholar] [CrossRef]
  53. Salvetti, E.; Torriani, S.; Felis, G.E. The Genus Lactobacillus: A Taxonomic Update. Probiotics and Antimicrobial Proteins 2012, 4, 217–226. [Google Scholar] [CrossRef]
  54. Shah, A.B.; Baiseitova, A.; Zahoor, M.; Ahmad, I.; Ikram, M.; Bakhsh, A.; Shah, M.A.; Ali, I.; Idress, M.; Ullah, R.; et al. Probiotic significance of Lactobacillus strains: a comprehensive review on health impacts, research gaps, and future prospects. Gut Microbes 2024, 16, 2431643. [Google Scholar] [CrossRef]
  55. van de Guchte, M.; Penaud, S.; Grimaldi, C.; Barbe, V.; Bryson, K.; Nicolas, P.; Robert, C.; Oztas, S.; Mangenot, S.; Couloux, A.; et al. The complete genome sequence of Lactobacillus bulgaricus reveals extensive and ongoing reductive evolution. Proc Natl Acad Sci U S A 2006, 103, 9274–9279. [Google Scholar] [CrossRef]
  56. Dan, T.; Hu, H.; Tian, J.; He, B.; Tai, J.; He, Y. Influence of Different Ratios of Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus thermophilus on Fermentation Characteristics of Yogurt. Molecules 2023, 28, 2123. [Google Scholar] [CrossRef]
  57. Diaconu, M.; Kothe, U.; Schlünzen, F.; Fischer, N.; Harms, J.M.; Tonevitsky, A.G.; Stark, H.; Rodnina, M.V.; Wahl, M.C. Structural basis for the function of the ribosomal L7/12 stalk in factor binding and GTPase activation. Cell 2005, 121, 991–1004. [Google Scholar] [CrossRef]
  58. Lai, W.-K.; Lu, Y.-C.; Hsieh, C.-R.; Wei, C.-K.; Tsai, Y.-H.; Chang, F.-R.; Chan, Y. Developing Lactic Acid Bacteria as an Oral Healthy Food. Life 2021, 11, 268. [Google Scholar] [CrossRef]
  59. De Groote, M.A.; Frank, D.N.; Dowell, E.; Glode, M.P.; Pace, N.R. Lactobacillus rhamnosus GG bacteremia associated with probiotic use in a child with short gut syndrome. Pediatr Infect Dis J 2005, 24, 278–280. [Google Scholar] [CrossRef]
  60. Maeno, S.; Nishimura, H.; Tanizawa, Y.; Dicks, L.; Arita, M.; Endo, A. Unique niche-specific adaptation of fructophilic lactic acid bacteria and proposal of three Apilactobacillus species as novel members of the group. BMC Microbiology 2021, 21, 41. [Google Scholar] [CrossRef]
  61. Bradford, E.L.; Wax, N.; Bueren, E.K.; Walke, J.B.; Fell, R.; Belden, L.K.; Haak, D.C. Comparative genomics of Lactobacillaceae from the gut of honey bees, Apis mellifera, from the Eastern United States. G3 (Bethesda) 2022, 12. [Google Scholar] [CrossRef]
  62. Fidanza, M.; Panigrahi, P.; Kollmann, T.R. Lactiplantibacillus plantarum-Nomad and Ideal Probiotic. Front Microbiol 2021, 12, 712236. [Google Scholar] [CrossRef]
  63. Wang, Z.; Wu, J.; Tian, Z.; Si, Y.; Chen, H.; Gan, J. The Mechanisms of the Potential Probiotic Lactiplantibacillus plantarum against Cardiovascular Disease and the Recent Developments in its Fermented Foods. Foods 2022, 11. [Google Scholar] [CrossRef]
  64. Barbour, A.G.; Adeolu, M.; Gupta, R.S. Division of the genus Borrelia into two genera (corresponding to Lyme disease and relapsing fever groups) reflects their genetic and phenotypic distinctiveness and will lead to a better understanding of these two groups of microbes (Margos et al. (2016) There is inadequate evidence to support the division of the genus Borrelia. Int. J. Syst. Evol. Microbiol. https://doi.org/10.1099/ijsem.0.001717). Int J Syst Evol Microbiol 2017, 67, 2058–2067.
  65. Rudra, B.; Gupta, R.S. Molecular Markers Specific for the Pseudomonadaceae Genera Provide Novel and Reliable Means for the Identification of Other Pseudomonas strains/spp. Related to These Genera. Genes 2025, 16, 183. [Google Scholar] [CrossRef]
  66. Malhotra, M.; Bello, S.; Gupta, R.S. Phylogenomic and molecular markers based studies on clarifying the evolutionary relationships among Peptoniphilus species. Identification of several Genus-Level clades of Peptoniphilus species and transfer of some Peptoniphilus species to the genus Aedoeadaptatus. Syst Appl Microbiol 2024, 47, 126499. [Google Scholar] [CrossRef]
  67. Endo, A.; Okada, S. Reclassification of the genus Leuconostoc and proposals of Fructobacillus fructosus gen. nov., comb. nov., Fructobacillus durionis comb. nov., Fructobacillus ficulneus comb. nov. and Fructobacillus pseudoficulneus comb. nov. Int J Syst Evol Microbiol 2008, 58, 2195–2205. [Google Scholar] [CrossRef]
  68. Endo, A.; Tanizawa, Y.; Tanaka, N.; Maeno, S.; Kumar, H.; Shiwa, Y.; Okada, S.; Yoshikawa, H.; Dicks, L.; Nakagawa, J.; Arita, M. Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp. BMC Genomics, 1117. [Google Scholar] [CrossRef]
  69. Akiva, E.; Itzhaki, Z.; Margalit, H. Built-in loops allow versatility in domain-domain interactions: lessons from self-interacting domains. Proc Natl Acad Sci U S A 2008, 105, 13292–13297. [Google Scholar] [CrossRef]
  70. Hormozdiari, F.; Salari, R.; Hsing, M.; Schönhuth, A.; Chan, S.K.; Sahinalp, S.C.; Cherkasov, A. The effect of insertions and deletions on wirings in protein-protein interaction networks: a large-scale study. J Comput Biol 2009, 16, 159–167. [Google Scholar] [CrossRef]
  71. Khadka, B.; Persaud, D.; Gupta, R.S. Novel Sequence Feature of SecA Translocase Protein Unique to the Thermophilic Bacteria: Bioinformatics Analyses to Investigate Their Potential Roles. Microorganisms 2020, 8, 59. [Google Scholar] [CrossRef]
  72. Miton, C.M.; Tokuriki, N. Insertions and Deletions (Indels): A Missing Piece of the Protein Engineering Jigsaw. Biochemistry 2023, 62, 148–157. [Google Scholar] [CrossRef]
  73. Gupta, R.S.; Nanda, A.; Khadka, B. Novel molecular, structural and evolutionary characteristics of the phosphoketolases from Bifidobacteria and Coriobacteriales. PLoS One 2017, 12, e0172176. [Google Scholar] [CrossRef]
  74. Geszvain, K.; Gruber, T.M.; Mooney, R.A.; Gross, C.A.; Landick, R. A hydrophobic patch on the flap-tip helix of E. coli RNA polymerase mediates sigma(70) region 4 function. J Mol Biol 2004, 343, 569–587. [Google Scholar] [CrossRef]
  75. Khadka, B.; Gupta, R.S. Identification of a conserved 8 aa insert in the PIP5K protein in the Saccharomycetaceae family of fungi and the molecular dynamics simulations and structural analysis to investigate its potential functional role. Proteins 2017, 85, 1454–1467. [Google Scholar] [CrossRef]
  76. Hashimoto, K.; Panchenko, A.R. Mechanisms of protein oligomerization, the critical role of insertions and deletions in maintaining different oligomeric states. Proc Natl Acad Sci U S A 2010, 107, 20352–20357. [Google Scholar] [CrossRef]
  77. Gupta, R.S. Distinction between Borrelia and Borreliella is more robustly supported by molecular and phenotypic characteristics than all other neighbouring prokaryotic genera: Response to Margos’ et al. “The genus Borrelia reloaded” (PLoS ONE 13(12): e0208432). PLoS One 2019, 14, e0221397. [Google Scholar] [CrossRef]
  78. Caudill, M.T.; Brayton, K.A. The Use and Limitations of the 16S rRNA Sequence for Species Classification of Anaplasma Samples. Microorganisms 2022, 10. [Google Scholar] [CrossRef]
  79. Janda, J.M.; Abbott, S.L. 16S rRNA Gene Sequencing for Bacterial Identification in the Diagnostic Laboratory: Pluses, Perils, and Pitfalls. Journal of Clinical Microbiology 2007, 45, 2761–2764. [Google Scholar] [CrossRef]
  80. Patel, S.; Gupta, R.S. A phylogenomic and comparative genomic framework for resolving the polyphyly of the genus Bacillus: Proposal for six new genera of Bacillus species, Peribacillus gen. nov., Cytobacillus gen. nov., Mesobacillus gen. nov., Neobacillus gen. nov., Metabacillus gen. nov. and Alkalihalobacillus gen. nov. Int J Syst Evol Microbiol 2020, 70, 406–438. [Google Scholar] [CrossRef]
  81. Sawana, A.; Adeolu, M.; Gupta, R.S. Molecular signatures and phylogenomic analysis of the genus Burkholderia: proposal for division of this genus into the emended genus Burkholderia containing pathogenic organisms and a new genus Paraburkholderia gen. nov. harboring environmental species. Front Genet 2014, 5, 429. [Google Scholar] [CrossRef]
  82. Dobritsa, A.P.; Samadpour, M. Reclassification of Burkholderia insecticola as Caballeronia insecticola comb. nov. and reliability of conserved signature indels as molecular synapomorphies. Int J Syst Evol Microbiol 2019, 69, 2057–2063. [Google Scholar] [CrossRef]
  83. Rudra, B.; Gupta, R.S. Phylogenomic and comparative genomic analyses of species of the family Pseudomonadaceae: Proposals for the genera Halopseudomonas gen. nov. and Atopomonas gen. nov., merger of the genus Oblitimonas with the genus Thiopseudomonas, and transfer of some misclassified species of the genus Pseudomonas into other genera. International Journal of Systematic and Evolutionary Microbiology 2021, 71. [Google Scholar] [CrossRef]
  84. Rudra, B.; Gupta, R.S. Phylogenomics studies and molecular markers reliably demarcate genus Pseudomonas sensu stricto and twelve other Pseudomonadaceae species clades representing novel and emended genera. Front Microbiol 2024, 14, 1273665. [Google Scholar] [CrossRef]
  85. Ahmod, N.Z.; Gupta, R.S.; Shah, H.N. Identification of a Bacillus anthracis specific indel in the yeaC gene and development of a rapid pyrosequencing assay for distinguishing B. anthracis from the B. cereus group. J Microbiol Methods 2011, 87, 278–285. [Google Scholar] [CrossRef]
  86. Wong, S.Y.; Paschos, A.; Gupta, R.S.; Schellhorn, H.E. Insertion/deletion-based approach for the detection of Escherichia coli O157:H7 in freshwater environments. Environ Sci Technol 2014, 48, 11462–11470. [Google Scholar] [CrossRef]
  87. Gao, B.; Gupta, R.S. Conserved indels in protein sequences that are characteristic of the phylum Actinobacteria. Int. J. Syst. Evol. Microbiol 2005, 55, 2401–2412. [Google Scholar] [CrossRef]
  88. Griffiths, E.; Petrich, A.; Gupta, R.S. Conserved Indels in Essential Proteins that are Distinctive Characteristics of Chlamydiales and Provide Novel Means for Their Identification. Microbiology 2005, 151, 2647–2657. [Google Scholar] [CrossRef]
  89. Singh, B.; Gupta, R.S. Conserved inserts in the Hsp60 (GroEL) and Hsp70 (DnaK) proteins are essential for cellular growth. Mol. Genet. Genomics 2009, 281, 361–373. [Google Scholar] [CrossRef]
  90. Clarke, J.H.; Irvine, R.F. Evolutionarily conserved structural changes in phosphatidylinositol 5-phosphate 4-kinase (PI5P4K) isoforms are responsible for differences in enzyme activity and localization. Biochem J 2013, 454, 49–57. [Google Scholar] [CrossRef]
  91. Kuznedelov, K.; Minakhin, L.; Niedziela-Majka, A.; Dove, S.L.; Rogulja, D.; Nickels, B.E.; Hochschild, A.; Heyduk, T.; Severinov, K. A role for interaction of the RNA polymerase flap domain with the sigma subunit in promoter recognition. Science 2002, 295, 855–857. [Google Scholar] [CrossRef]
  92. Nandan, D.; Lopez, M.; Ban, F.; Huang, M.; Li, Y.; Reiner, N.E.; Cherkasov, A. Indel-based targeting of essential proteins in human pathogens that have close host orthologue(s): discovery of selective inhibitors for Leishmania donovani elongation factor-1alpha. Proteins 2007, 67, 53–64. [Google Scholar] [CrossRef]
  93. Chang, Y.Y.; Cronan, J.E., Jr. Membrane cyclopropane fatty acid content is a major factor in acid resistance of Escherichia coli. Mol Microbiol 1999, 33, 249–259. [Google Scholar] [CrossRef]
  94. Jiang, X.; Duan, Y.; Zhou, B.; Guo, Q.; Wang, H.; Hang, X.; Zeng, L.; Jia, J.; Bi, H. The Cyclopropane Fatty Acid Synthase Mediates Antibiotic Resistance and Gastric Colonization of Helicobacter pylori. J Bacteriol 2019, 201. [Google Scholar] [CrossRef]
  95. Kandasamy, S.; Lee, K.H.; Yoo, J.; Yun, J.; Kang, H.B.; Kim, J.E.; Oh, M.H.; Ham, J.S. Whole genome sequencing of Lacticaseibacillus casei KACC92338 strain with strong antioxidant activity, reveals genes and gene clusters of probiotic and antimicrobial potential. Front Microbiol 2024, 15, 1458221. [Google Scholar] [CrossRef]
  96. Vergalito, F.; Testa, B.; Cozzolino, A.; Letizia, F.; Succi, M.; Lombardi, S.J.; Tremonte, P.; Pannella, G.; Di Marco, R.; Sorrentino, E.; et al. Potential Application of Apilactobacillus kunkeei for Human Use: Evaluation of Probiotic and Functional Properties. Foods 2020, 9. [Google Scholar] [CrossRef]
  97. Percudani, R.; Peracchi, A. A genomic overview of pyridoxal-phosphate-dependent enzymes. EMBO Rep 2003, 4, 850–854. [Google Scholar] [CrossRef]
  98. Wood, B.J.B.; Holzapfel, W.H.N.; Hammes, W.P.; Vogel, R.F. The Genera of Lactic Acid Bacteria, 1 ed.; Wood, B.J.B., Ed.; Springer US: 1995; Volume 2, p. 398.
Figure 1. A maximum-likelihood tree for 411 genome-sequenced Lactobacillaceae species based on concatenated sequences of 87 conserved proteins. The statistical support values for different branches are indicated on the nodes. This tree was rooted by using Bacillus species as an outgroup (see Methods). Different main species clades observed in the tree are identified by the names of the genera and are compressed. The uncompressed version of this tree is presented in Figure S1.
Figure 2. (A) Branching pattern of Lactobacillus species in the maximum-likelihood tree; (B) Partial sequence alignment showing a 2 aa insertion (highlighted) in the 50S ribosomal protein L10 that is exclusively shared by species/strains from the genus Lactobacillus. The dashes (-) in this and all other sequence alignments indicate identity with the amino acids on the top line. Gaps in sequence alignment indicate that no amino acid is present in that position. Accession numbers for different sequences are indicated in the second column and the position of this sequence fragment within the protein is indicated above the sequences. Detailed sequence information for this CSI as well as 15 other CSIs specific for Lactobacillus are presented in Figures S2-S17 and some of their characteristics are summarized in Table 1.
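The display convention described in this caption (a dash marks identity with the top sequence, a blank marks a position where no amino acid is present) can be illustrated with a minimal Python sketch. The function name `render_identity_alignment` and the toy sequences are hypothetical and are not taken from this study; the input is assumed to use the common convention in which "-" denotes an alignment gap.

```python
def render_identity_alignment(rows):
    """Render aligned sequences in the figure's display convention.

    rows: list of (label, aligned_sequence) pairs of equal length, where
    '-' in the INPUT marks an alignment gap (assumed input convention).
    The first row is the reference ("top line") and is shown in full.
    In every other row, a residue identical to the reference is shown
    as '-', a differing residue is shown as itself, and a gap as ' '.
    """
    ref_label, ref_seq = rows[0]
    rendered = [(ref_label, ref_seq.replace("-", " "))]
    for label, seq in rows[1:]:
        if len(seq) != len(ref_seq):
            raise ValueError(f"{label}: aligned sequences must have equal length")
        chars = []
        for ref_char, char in zip(ref_seq, seq):
            if char == "-":
                chars.append(" ")   # gap: no amino acid at this position
            elif char == ref_char:
                chars.append("-")   # identical to the top line
            else:
                chars.append(char)  # substitution: show the residue
        rendered.append((label, "".join(chars)))
    return rendered


# Toy example (hypothetical sequences): the reference carries a 2 aa insert "GS".
toy_rows = [
    ("with_insert", "MKTAGSLV"),
    ("no_insert", "MKTA--LV"),
    ("variant", "MRTAGSLV"),
]
for label, line in render_identity_alignment(toy_rows):
    print(f"{label:<12} {line}")
```

In this toy case the sequence lacking the insert is rendered with two blanks at the insert position, which is how the absence of a CSI appears in alignments drawn with this convention.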
Figure 3. (A) Branching pattern of the species from the genus Lacticaseibacillus in the constructed maximum-likelihood tree; (B) Excerpts from the sequence alignment of the manganese-dependent inorganic pyrophosphatase protein showing a 1 aa insertion shared by species/strains from the genus Lacticaseibacillus. Detailed sequence information for this CSI, as well as 8 other CSIs specific for Lacticaseibacillus, is presented in Figures S18-S26 and their characteristics are summarized in Table 1.
Figure 4. Branching topology of Apilactobacillus (A) and Lactiplantibacillus (C) species in the constructed phylogenetic tree and examples of molecular signatures specific for these genera. (B) Partial sequence alignment showing a 2 aa insertion in the cyclopropane-fatty-acyl-phospholipid synthase family protein that is exclusively found in the species/strains from genus Apilactobacillus. Detailed sequence information for this CSI as well as 4 other CSIs specific for Apilactobacillus are presented in Figures S27-S30 and some of their characteristics are summarized in Table 1. (D) Excerpts from the sequence alignment of the protein pyridoxal phosphate-dependent aminotransferase depicting a 1 aa insertion that is specific for the species from genus Lactiplantibacillus. Detailed sequence information for this CSI and 7 other CSIs specific for Lactiplantibacillus species, are presented in Figures S31-S38 and their characteristics are summarized in Table 1.
Figure 5. A summary diagram showing the species composition of different Lactobacillaceae genera and the numbers of taxon-specific CSIs identified for them. * indicates that these CSIs were identified in our previous work [16].
Figure 6. Updated sequence information for two CSIs specific for the genera Weissella and Fructobacillus described in our earlier work [16]. (A) Excerpts from the sequence alignment of the protein phospho-N-acetylmuramoyl-pentapeptide-transferase showing an eight aa insertion specific for the species from the genus Weissella. (B) Partial sequence alignment of the protein Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit (GatB) showing a four aa insertion in a conserved region that is specific for all species from the genus Fructobacillus. * denotes an extra insert in Fructobacillus apis replacing the residues “DA”; # indicates an extra insert in Fructobacillus broussonetiae replacing the residues “LPD”. Newly identified species are shown in bold in these figures.
Figure 7. Results from the AppIndels server showing the predicted taxonomic affiliations for the genomes of two representative unclassified Lactobacillus isolates. (A) The Lactobacillus strain CBA3605 was identified by the server as belonging to the genus Lactiplantibacillus, and it contained eight CSIs specific for this genus. (B) The genome of Lactobacillus strain UW_DM_LACCAS1_1 was predicted by the server to be affiliated with the genus Lacticaseibacillus, and it shared nine CSIs specific for this genus. Due to space constraints, not all identified CSIs in the genomes of these strains are shown here.
Figure 8. A bootstrapped maximum-likelihood tree including the type species of different Lactobacillaceae genera and uncharacterized Lactobacillus isolates for which identification to specific genera was made by the AppIndels.com webserver. Information for some closely related strains is not included in this figure due to space constraints. The clades corresponding to different Lactobacillaceae genera and the uncharacterized Lactobacillus isolates branching with them are marked in the tree.
Figure 9. Superimposed cartoon and surface representations of AlphaFold-predicted protein structures showing CSIs specific for the genera (A) Lactobacillus in the 50S ribosomal protein L10 (RMSD = 5.4 Å), (B) Lacticaseibacillus in the protein manganese-dependent inorganic pyrophosphatase (RMSD = 1.1 Å), (C) Apilactobacillus in the protein cyclopropane-fatty-acyl-phospholipid synthase family protein (RMSD = 0.3 Å), and (D) Lactiplantibacillus in the protein pyridoxal phosphate-dependent aminotransferase (RMSD = 1.0 Å). In each panel, the CSI-containing homolog is shown in dark blue, the CSI-lacking homolog in cyan, and the position of the CSI in red. Further information on the protein structure prediction analyses is provided in the Methods section.
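The RMSD values quoted in Figure 9 measure how far apart paired atoms lie after two structures are superimposed. As a hedged illustration of the quantity itself, the sketch below computes RMSD for coordinates that are assumed to be already superimposed and paired one-to-one; the coordinates are invented, and the superposition tool and atom selection actually used by the authors are described in their Methods and are not reproduced here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å)
    between two equally sized lists of pre-superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha coordinates in Å, for illustration only.
with_csi = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
without_csi = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 0.0)]

print(round(rmsd(with_csi, without_csi), 3))  # → 0.577
```

Only one of the three paired atoms is displaced (by 1 Å), so the value is sqrt(1/3). A low RMSD, as in panels (B) to (D), indicates that the overall fold is largely unchanged and that the structural difference is confined to a small region.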
Table 1. Summary of CSIs specific for the genera Lactobacillus, Lacticaseibacillus, Apilactobacillus and Lactiplantibacillus.
| Protein Name | Accession No. | Indel Size | Indel Position | Figure No. | Specificity |
|---|---|---|---|---|---|
| 50S ribosomal protein L10 | WP_046332409 | 2 aa Ins | 57-112 | Figure 2, Figure S2 | Lactobacillus |
| excinuclease ABC subunit UvrC | WP_003619779 | 5-6 aa Ins | 480-531 | Figure S3 | Lactobacillus |
| Anaerobic ribonucleoside-triphosphate reductase§ | WP_011161356 | 2 aa Ins | 517-562 | Figure S4 | Lactobacillus |
| DNA-binding protein WhiA | WP_004893933 | 1 aa Ins | 140-194 | Figure S5 | Lactobacillus |
| Translation initiation factor IF-2 | WP_011544002 | 3 aa Ins | 285-336 | Figure S6 | Lactobacillus |
| 50S ribosomal protein L4 | WP_046332456 | 2 aa Del | 120-280 | Figure S7 | Lactobacillus |
| TIGR01457 family HAD-type hydrolase | WP_046331702 | 1 aa Ins | 98-130 | Figure S8 | Lactobacillus |
| C69 family dipeptidase | WP_003647856 | 1 aa Del | 345-389 | Figure S9 | Lactobacillus |
| YfbR-like 5’-deoxynucleotidase§ | WP_057718391 | 1 aa Ins | 23-79 | Figure S10 | Lactobacillus |
| class I SAM-dependent methyltransferase | WP_003619061 | 1 aa Del | 269-326 | Figure S11 | Lactobacillus |
| Phosphate acyltransferase PlsX§ | WP_011162257 | 1 aa Del | 176-227 | Figure S12 | Lactobacillus |
| DNA helicase PcrA*§ | WP_011162397 | 2 aa Ins | 248-301 | Figure S13 | Lactobacillus |
| NADP-dependent phosphogluconate dehydrogenase | WP_011162624 | 1 aa Del | 5-57 | Figure S14 | Lactobacillus |
| calcium-translocating P-type ATPase§ | WP_044025971 | 1 aa Del | 814-864 | Figure S15 | Lactobacillus |
| ATP-binding protein*§ | WP_046332316 | 1 aa Ins | 347-399 | Figure S16 | Lactobacillus |
| 16S rRNA (cytosine(1402)-N(4))-methyltransferase RsmH*§ | WP_044496740 | 1 aa Ins | 76-113 | Figure S17 | Lactobacillus |
| manganese-dependent inorganic pyrophosphatase* | WP_003579130 | 1 aa Ins | 9-59 | Figure 3, Figure S18 | Lacticaseibacillus |
| hemolysin family protein | WP_138426554 | 1 aa Ins | 345-382 | Figure S19 | Lacticaseibacillus |
| 1-acyl-sn-glycerol-3-phosphate acyltransferase | WP_049169464 | 1 aa Del | 142-191 | Figure S20 | Lacticaseibacillus |
| DUF1002 domain-containing protein | WP_049172803 | 1 aa Del | 85-129 | Figure S21 | Lacticaseibacillus |
| DeoR/GlpR family DNA-binding transcription regulator* | WP_191995078 | 1 aa Del | 85-128 | Figure S22 | Lacticaseibacillus |
| DNA polymerase IV | WP_138131441 | 1 aa Del | 110-155 | Figure S23 | Lacticaseibacillus |
| DNA polymerase IV* | WP_138131441 | 1 aa Del | 227-263 | Figure S24 | Lacticaseibacillus |
| YfcE family phosphodiesterase* | WP_129319710 | 1 aa Del | 1-36 | Figure S25 | Lacticaseibacillus |
| methionine adenosyltransferase | WP_138426285 | 1 aa Del | 58-102 | Figure S26 | Lacticaseibacillus |
| cyclopropane-fatty-acyl-phospholipid synthase family protein | WP_138741898 | 2 aa Ins | 315-362 | Figure 4(B), Figure S27 | Apilactobacillus |
| DEAD/DEAH box helicase | WP_053791914 | 1 aa Ins | 168-209 | Figure S28 | Apilactobacillus |
| Phosphate acetyltransferase* | WP_053791569 | 1 aa Del | 200-239 | Figure S29 | Apilactobacillus |
| glucose-6-phosphate dehydrogenase | WP_053796109 | 1 aa Ins | 12-48 | Figure S30 | Apilactobacillus |
| pyridoxal phosphate-dependent aminotransferase | WP_208215537 | 1 aa Ins | 30-65 | Figure 4(D), Figure S31 | Lactiplantibacillus |
| ABC transporter ATPase§ | KLD61660 | 1 aa Del | 44-98 | Figure S32 | Lactiplantibacillus |
| acetyl-CoA carboxylase§ | KLD60369 | 1 aa Ins | 32-83 | Figure S33 | Lactiplantibacillus |
| 50S ribosomal protein L15§ | WP_021337917 | 1 aa Del | 83-126 | Figure S34 | Lactiplantibacillus |
| C69 family dipeptidase | WP_134144186 | 2 aa Del | 289-325 | Figure S35 | Lactiplantibacillus |
| GRP family sugar transporter§ | WP_222843328 | 1 aa Del | 83-128 | Figure S36 | Lactiplantibacillus |
| glycoside hydrolase family 13 protein* | WP_064619115 | 1 aa Del | 377-430 | Figure S37 | Lactiplantibacillus |
| undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase§ | OAX76783 | 1 aa Del | 158-208 | Figure S38 | Lactiplantibacillus |
* Isolated exceptions present in some species. § Protein homolog missing in some species.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.