Comprehensive Analysis of subtilase Gene Family and their Responses to nematode in Trichoderma harzianum

The subtilase family is the second largest family of serine proteases. Some fungi including Trichoderma species can capture and kill nematodes by secreting hydrolytic enzymes or toxins, among which serine proteases are important enzymes that allow fungi to infect nematodes. Subtilase can degrade nematode and insect body walls. In this study, subtilase family genes were identified from the Trichoderma harzianum genome database, and bioinformatics analysis of the characteristics and evolutionary status of these genes, along with structural and functional analyses of their proteins, was performed. Gene structure analysis revealed that all the 41 subtilase genes contained introns, while some did not have upstream or downstream regions. Chromosome localisation showed that subtilase family members were unevenly distributed in 22 Trichoderma chromosomes, and 3 clusters were present, indicating that they may be hot spots of subtilase genes. Conserved motif analyses showed these proteins contained a commonly conserved motif, and motifs belonging to the same subfamily remained highly similar. The upstream region of the subtilase genes were enriched with different type and numbers of cis-elements, indicating that subtilase genes are likely to play a role in the response to diverse stresses. Transcription of 31 genes was increased after 5 days of infection with nematodes, whereas that of 10 genes decreased. In these subtilase genes, ThSBT4, ThSBT5, ThSBT12, ThSBT27, ThSBT34, ThSBT35, ThSBT38, and ThSBT40 showed significantly upregulated expression with a log2 fold change value of more than 4, and ThSBT35 showed the highest peak. These results laid a theoretical foundation for further research on the function of the subtilase genes and the mechanism of the resistance response.

including serine proteases, chitinases, and collagenases are important virulence factors that can degrade the main chemical constituents of the nematode cuticle and eggshell [4][5] [6]. In 1990, Lopez-Llorca LV identified the first pathogenic-associated serine protease P32 from Pochonia suchlasporia [7]. Subsequently,serine proteases PII were isolated and identified from the nematode-trapping fungus Arthrobotrys oligospora [8]and the nematode-parasitic fungus P. chlamydosporia [9]. Currently, more than 20 pathogenic serine proteases (such as pSP-3 [10], Ver112 [11], PrC [12][13] [14], Csp1 [15], Hnsp [16]) have been isolated from different nematode fungi. RB Wang et al. [17] showed that the crude form of a protease obtained from the nematode fungus Dactylella shizishanna killed nematodes more efficiently than the purified form. This observation indicates the osmotic dissolution process requires the synergy of several different enzymes including collagenase, chitinase, and serine protease to act against the nematode epidermis and eggshell that are made up of complex components. M. K. Jashni reported that the synergistic action of a metalloprotease and a serine protease obtained from Fusarium oxysporum enhances fungal virulence [18].
Subtilases are a family of subtilisin-like serine proteases, which is the second-largest serine protease family characterised to date. Over 200 subtilases are presently known and the complete amino acid sequences of more than 170 have been documented [19]. The vast majority of these proteases are endopeptidases, exopeptidases, and tripeptidases [20] [21]. The subtilase superfamily includes 3 families of MEROPS peptidase family S8 (subfamily S8A, subtilase), S8B (Kexin), and S53 (sedolisin). The subtilase family Peptidase_S8 (PF00082) is characterised by Asp/His/Ser catalytic triplets and α/β-sheet catalytic centres containing seven-chain parallel β-sheets [22]. Subtilase is an important virulence factor and pathogenic factor found in different host fungi involved in biological stimulation [23]. The wide distribution of the serine cuticle-degrading protease found in nematophagous fungi suggested that this enzyme exhibits considerable nematicidal activity for balancing nematode populations [24]. Subtilase can destroy nematode and insect body walls and also plays an important role in invading plant walls [25]. Guo Qiannan performed omics analysis to determine the infection mechanism of the nematode fungus Drechmeria coniospora and found that subtilase, metalloproteinase, and acid phosphatase played an important role in the process of degrading the nematode body wall and invading it. Mass spectrometry results revealed that merimycin is a secondary metabolite of non-ribosomal polypeptides containing unconventional amino acid analogues (AIB, AHMOD, and AMD) that exhibits nematicidal activity. D. coniospora has 26 subtilases exhibiting nematicidal activity, 13 of which belong to the subtilase Pr1C subfamily [26]. Clonostachys rosea can parasitise nematodes and insects [27] [28]. The low-molecular-weight degradation products of the nematode cuticle could significantly induce the expression of a subtilase extracellular protease prC [12]. Z Hao et al. amplified the gene sequence of the P. oligospora serine protease P186 and successfully expressed it in Pichia pastoris. The purified protease reP186 exhibited nematicidal activity [29]. In addition, subtilase genes obtained from the nematode-parasitic fungus Purpureocillium lilacinum formed a unique cluster, putatively indicating that the fungus might have developed distinctive mechanisms for nematode pathogenesis [30]. Bacillus sp. B16 exhibits significant nematode activity against the nematode Panagrellus redivivus. The crude extracellular protein extract of the culture supernatant killed approximately 80% of the tested nematodes within 24 h, indicating the involvement of extracellular proteases. The deduced protein sequence shows significant similarity to the sequence of subtilase BPN', and the purified protease can hydrolyse collagen and nematode epidermis [31]. L.H. Lian reported that the Bacillus sp. strain RH219 exhibits remarkable nematicidal activity, and an extracellular cuticle-degrading protease termed Apr219 served as an important nematicidal factor [32]. By performing whole-genome sequencing of Bacillus firmus DS-1, Geng, Ce et al. reported that this species exhibited high toxicity to root nematode and soybean cyst nematode. Sep1, a novel serine protease isolated from B. firmus, exhibits nematicidal activity and capability of degrading intestinal tissues of nematodes [33].Yang, Jinkui performed genomic sequencing on the nematode predatory fungus Aspergillus oligogenome (ATCC24927) and found that this species expresses numerous genes similar to the ones encoding subtilase, cellulase, cellobiohydrolase and pectinesterase families. By establishing a nematode-trapping device model, various fungal signal transduction pathways were found to be activated and downstream genes related to various cellular processes, such as energy metabolism, cell wall production, adhesion protein biosynthesis, cell division, and glycerol accumulation, and biogenesis using peroxisomes, were upregulated [34]. Subsequently, the crystal structure of proteases (Ver112 and PL646) from the nematode fungus and the chitinase CrChi1 were extensively studied, the active site residues were identified, and the catalytic mechanism of these enzymes in causing host infection was elucidated [35]. Muszewska, A. et al. [36] analysed the phylogenetic distribution of more than 600 fungal protein groups, and their results showed that of the 54 serine protease families described in the MEROPS peptidase database, 21 are found in fungi and most eukaryotic organisms encode members of the 13-16 serine protease family. Among the analysed taxa, the number of serine proteases in each taxon was different, and the most abundant of them was S8 protease. Jinkui Yang studied a cuticle-degrading serine protease that was cloned from three isolates of Lecanicillium psalliotae (syn. Verticillium psalliotae) by using the 3′ and 5′ rapid amplification of cDNA ends method. The deduced protease sequence exhibited a high degree of similarity to other cuticle-degrading proteases isolated from other nematophagous fungi [37].
Trichoderma species are used as biological control agents in many plant diseases and play an important role in plant protection by conferring resistance to nematodes and insects. They also maintain the dynamic balance of the microbial in the soil ecological environment. Several studies have shown that Trichoderma exerts a certain inhibitory effect on root-knot nematodes [38] [39]. Studies conducted by Szabó et al. showed that genes encoding chitinase, serine protease, and aspartic protease in Trichoderma harzianum are expressed synergistically and play an important role in destroying nematode eggs [40]. T. virens has also been reported to produce growth-promoting agents and act as a biocontrol agent against Meloidogyne incognita [41]. The culture filtrate of T. virens G1-3 can inhibit the hatching of eggs of M. incognita and the activity of second-instar larvae (J2s) [42]. The crude extract from solid fermentation using the strain SMF2 (T. pseudokoningii) exhibits strong nematicidal activity against root-knot nematodes. The new serine protease SprT purified from this species can significantly shrink the epidermis of nematodes, kill nematode larvae, and inhibit egg hatching. Sequence analysis showed that SprT is a single-domain subtilase containing 284 amino acids [43]. Shuwu Zhang reported that T. longibrachiatum is a promising and highly effective antinematode biological control agent, and its conidia have a strong lethal and parasitic effect on M. incognita. It can attach to and parasitise the surface of second-instar larvae and penetrate the membrane to proliferate on the epidermis, leading to the partial dissolution of the epidermis by the metabolites of T. longibranch. In addition, T. longibranch can increase plant height, root length, and fresh weight of branches and roots of cucumber [44].
Recently, genome-wide identification and characterisation of the subtilase genes have been completed in some plants such as grapevine [45] [46], tomato [47] and barley [48], which exhibit stress tolerance and disease resistance. However, the subtilase genes of Trichoderma that are involved in growth, development, and defence-related activities have not been extensively studied. Here we systematically characterised all 41 putative subtilase family genes of Trichoderma and performed an phylogenetic relationship analysis. In addition, gene structures, protein motifs, chromosomal location, and promoter were analysed. Furthermore, the expression patterns of the subtilase family genes in mycelial tissues were determined, whereas the differential expression of the subtilase genes was determined after nematode treatment by using RNA-seq analysis.

Identification of the subtilase gene in T. harzianum
A total of 41 candidate subtilase genes were screened from the T. harzianum genome. On the basis of the conserved domain (Peptidase_S8peptidase-S8 domain), the genes were divided into 6 clusters and named ThSBT1 to ThSBT41. The parameter of the gene characteristics is summarised in Table 1 ThSBT2 and ThSBT28 were expressed on the cell membrane. ThSBT16, ThSBT25, and ThSBT27-ThSBT30 (except ThSBT28) were expressed in the nucleus. ThSBT24 was expressed in the membrane, cytoplasm, and nucleus.

Phylogenetic tree construction of subtilase family genes
Phylogenetic analysis was performed based on the relationship and length of all branches of the family members in the tree, wherein the proximity of different sequence associations showed the evolution origin process. We constructed the phylogenetic tree using 287 protein sequences of subtilase family members. The accession number of the gene identifier is listed in Table S1. The gene subfamily was divided by the phylogenetic tree branches and the annotation of the Peptidase_S8 domain in Pfam, NCBI, and SMART databases. By using the maximum likelihood algorithm, the bootstrap value of the evolutionary tree branch was found to be relatively high (95%). According to the phylogenetic tree shown in Figure 1, these sequences belong to 6 clusters. The annotation database information showed that they belong to the Peptidases_S8_S53 superfamily (cl10459), S8_S53 family (cd00306), and 5 subfamilies.

Gene structure analysis of T. harzianum subtilase family genes
GSDS was used to display the intron/exon structure of the subtilase family genes in T.
harzianum ( Figure 2). Supplementary ThSBT37-ThSBT41 showed the same pattern in the figure 2. Furthermore, the genetic structure of ThSBT27 and ThSBT28, and ThSBT39 and ThSBT40 were almost identical.  Multiple sequence alignment revealed that the amino acid sequence identity among members of the same subfamily was high( Figure 4). The amino acid sequence identity alignment of ThSBT1-ThSBT5 was 44.56%, that of ThSBT6-ThSBT10 was 40.18%, that of ThSBT11-ThSBT19 was 12.44%, that of ThSBT20-ThSBT28 was 15.95%, and that of ThSBT29-ThSBT33 was 28.08%. The ThSBT34-ThSBT41 alignment was 43.64%.

Identification of protein motifs of T. harzianum subtilase family genes
The MEME programme was used to analyse the conserved motifs in the Peptidase_S8 domain of T. harzianum subtilase genes, and the result is shown in Figure 5. There were 19 different motifs obtained from all the sequences, and the motif sequence and length information are shown in Supplementary Table S3. By using the Pfam database, these motifs were annotated and related functional information was queried. The figure shows the motif distribution of the conserved domains in different colours and positions. The distribution of motifs among subfamilies varied, and interestingly, the motifs belonging to the same subfamily were highly similar. Except for ThSBT18 and ThSBT28, all the proteins contained the commonly conserved motif1 and motif6, which proved that these two are highly conserved motifs. Moreover, motif3 and motif5 existed in ThSBT6-ThSBT41, which are also highly conserved motifs.
Moreover, ThSBT15-ThSBT41 had unique motif4, while ThSBT19 and ThSBT34-ThSBT41 contained motif8 and motif9. According to the Pfam annotation query, motif1 and motif4 are encoded by the subtilase family, whereas the other motifs have no specific description.

Chromosomal distribution of subtilase genes
Overall, 41 subtilase genes were unevenly distributed on 22 Trichoderma chromosomes.

Promoter cis-elements analysis of T. harzianum subtilase genes
To understand the function and regulatory mechanism of the subtilase genes, the PlantCARE programme was used to analyse the upstream sequence of all the genes, and a series of cis-elements related to a stress response were identified( Table 2). The results showed that the subtilase gene contained various types of cis-acting elements, and the same element had multiple action sites in the promoter region of a gene (Figure 7). This study identified elements closely related to the stress response, namely abscisic acid-responsive element  (14), while ThSBT2 and ThSBT6 had the lowest number of elements (4). The TGACG element had the widest distribution (ranging between 1 and 9) in all the gene promoter sequences, except ThSBT1. The second most was the ABRE, except for that in ThSBT7, ThSBT19, ThSBT20, and ThSBT39, and it was found in all the remaining sequences (ranging between 1 and 5). The distribution of MBS in LTR was relatively even, with most of them having 0 or 1 and a few having 2 or 3 MBS elements. The GARE motif and TCA-element were less distributed; 1 or 2 of these elements were present in a few sequences, whereas most sequences did not have any of these 2 elements. TC-rich repeats, WUN-motif, and ERE had the least distribution, with only 1 element existing in a few sequences. These results indicated that the subtilase genes may play a role in stress tolerance.

Expression of subtilase genes in the nematode resistance response
The current understanding of the function of the subtilase genes in Trichoderma is limited.
We performed an expression analysis of the nematode resistance subtilase genes to determine its possible role. Many genes showed substantial expression differences between the treatment and control groups, which indicated that they may be involved in resistance response.
Moreover, the subtilase genes have been previously reported to be differentially expressed in nematocidal fungi in response to nematode infection, indicating its potential role in nematode resistance interactions. Therefore, these genes are functional candidates for future research. The development and improvement of next-generation sequencing technology have allowed the wide use of RNA-seq in various studies of non-model organisms. In this study, we used the transcriptome data from a previous study to verify gene expression (Figure 9) Peptidases_S8_ProteinKivertases, was upregulated. Among ThSBT34-ThSBT41, which belonged to subfamily 5 of the peptidase S8 family, ThSBT39 was downregulated and the rest were upregulated. The cluster heat map showed the overall changes in the trend of these genes.
About 75% of the expression of subtilase genes during the resistance period of mycelia showed an upward trend, and 25% showed a downward trend. These identified genes showed differential expression under the given nematode resistance, which provided a reference for similar studies and further functional analysis.

Discussion
Nematodes cause considerable destruction of agricultural produce worldwide. At present, although many natural nematophagous fungi or bacteria have been used in biocontrol methods, disadvantages such as culture technologies and host limitations still exist [56] [57]. In the recent years, the omics sequencing technology of fungi has been rapidly developed, and the publication of genome and transcriptome data has allowed researchers to conduct an in-depth analysis of nematodes [1][53] [59]. Serine proteases are widely found in fungi, and their nematicidal effects have been studied[60] [61]. Most of these proteases are substrate enzymes with similar biochemical properties and similar sequences. They are one of the important enzymes involved in the recognition of nematodes [34]. Among them, subtilase is essential in the evolution of fungal pathogenicity to nematodes. Subtilase was initially discovered as an extracellular secreted alkaline protease from B. subtilis [62]. Subsequently, researchers globally conducted more in-depth and extensive research using various species and isolated similar proteases. The present study mainly focused on parasitic insects and pathogenic fungi. The subtilase proteases of pathogenic fungi play the role of virulence factors, mainly involved in the processing and assembly of various proteins such as membrane surface glycoproteins, growth factors, cytotoxins, complement system-related enzymes, and blood coagulation-related enzymes [63].
Subtilase is a member of the serine protease family and has evolved independently to form the polymerised Asp/Ser/His catalytic triad. The structure is an alpha/beta sheet containing a 7-chain parallel beta plate fold, and the order is 2314567 [20]. All subtilase family members have a Peptidase-S8 domain with high homology as determined by Pfam and HMMER analysis, indicating that the domain is relatively conservative and is also the premise for constructing the evolutionary tree. Based on the phylogenetic analysis, these subtilase genes were grouped into 6 clusters. The tree showed that proteases in the same subfamily have higher homology.
According to the predicted subcellular localisation data (Table 1), some genes are expressed in the cytoplasm, whereas most genes encode secreted proteins that produce toxins that act outside the cell. The pI value of these proteins ranged between 4.61 and 8.17. Some studies have found that a higher pI value is conducive to the hydrolytic activity of the protease itself and plays an important role in the binding process of the enzyme and the nematode epidermis [5].
We constructed the phylogenetic tree with 287 protein sequences of subtilase family members.
arundinaceum, which is related with trichothecenes, a family of terpenoid toxins. Multiple sequence comparisons showed high homology and their catalytic motifs were highly conserved. We may assume that they may have a similar function. Moreover, ThSBT20 was clustered with PlSBT21 [67], PlSBT22, and PlSBT23 [64], its protease-inhibited hatching of the root-knot nematode M. incognita [68]. Furthermore, IC Paterson reported that Metarhizium anisopliae produces several extracellular cuticle-degrading proteases and this evidence is consistent with the finding of a previous study reporting that PR1, a chymoelastase, is a determinant of pathogenicity [69]. Previous studies have shown that parasitic fungi can compensate for the loss of protease activity because of the production of serine protease inhibitors by the host by expressing other proteases [34]. Some proteases were not very similar but they all had the same domain and conservative catalytic motifs, which indicated that their presence could compensate for the enzymatic activity of the invasive protease.
When the strain is stimulated by external stress factors, it transduces transcription factors through a series of signals. These activated transcription factors combined with the cis-acting elements of the target gene promoters then activate the transcriptional expression of stress resistance genes and respond to external regulatory stress signals [70]. The cis-acting element in the promoter region of the gene plays a key role in the biological stress response [71]. To explore the function of the subtilase gene regulatory region in the process of resistance to the stress, we collected the sequence information of the 2000-bp region upstream of the promoter of the subtilase gene, and the element name and site information were also obtained. We focused on the cis-acting elements related to antiadversity, defence response, and detoxification effects. All these motifs were randomly distributed in the positive and negative chains of the promoter sequence. Analysis of the cis-acting element genes in the promoter region showed that each subtilase gene carried multiple elements related to stress response in its promoter region, indicating that the genes may be involved in stress resistance and defence response. First, the cis-acting element TGACG motif, which is involved in the jasmonic acid response, had the highest distribution in all the genes, and ThSBT10 had the most number of TGACG motifs. We assumed that this gene is strongly associated with the defence response. Second, the ABRE, which is involved in the abscisic acid reaction, had high distribution in most genes, especially in other substrates such as vitellin and chitin [72]. Furthermore, ThSBT20 showed upregulated expression with a log2 fold change of more than 2, and it was clustered with genes from Purpureocillium lilacinum. In addition, ThSBT32 was clustered with FoSBT32 in Fusarium oxysporum. A previous study reported that the extracts of Fusarium oxysporum exhibit inhibitory activities against M. incognita and Radopholus similis [73]. This study provides the required data for selecting candidate genes in the subtilase gene family and for further research.

Gene identification and sequence retrieval
T. harzianum whole-genome sequence data were downloaded from the NCBI database.
Bioedit software was used to extract the gene sequence of the subtilase family of T. harzianum from the genome file, after which the full-length amino acid sequence was queried and downloaded from the NCBI website. By using the Hidden Markov Model (HMMER) Blast was used to screen and obtain the submember sequences of closely related subtilase family genes.

Phylogenetic tree construction of subtilase genes
The Muscle software programme was used to perform multisequence alignment on the 287 protein-encoding genes of the subtilase family. Evolutionary history was inferred by using the maximum likelihood method based on the JTT matrix-based model [77] in Mega7 software [78].
The test parameter (Bootstrap) was set to 100 to generate the molecular phylogenetic evolution tree. iTOL was used to draw the evolutionary tree [79].

Gene structure analysis of T. harzianum subtilase family genes
The web server Gene Structure Display Server (GSDS) [ [82] was used to identify the conserved motifs of the 41 protein domain sequences from T. harzianum subtilase genes, and the parameter settings were as follows: number of repetitions = zoops, maximum number of patterns = 20, and optimal motif length = 6-200 residues. Then, we obtained the different motif recognition regions, and the corresponding function and sequence message were analyzed using the Pfam database.

Chromosome location and cis-acting elements analysis of T. harzianum subtilase genes
The chromosomal location information of 41 subtilase genes was identified from the NCBI database, and their distribution on the chromosome was analysed using MapGene2Chrom web v2 software (http://mg2c.iask.in/mg2c_v2.1/). TBtools was used to extract and screen the upstream sequence (2 kb) of subtilase genes from the whole genome sequence of T. harzianum.

Fungal materials and RNA extraction
Trichoderma mycelium was cultured at 25°C for 5 days in the Potato Dextrose Agar medium.
The nematode was cultured in the presence of Botrytis cinerea using the corn kernel medium.
The mature nematodes were extracted by using the Bellman funnel method [84]. The volume of the extracted liquid containing pinewood nematodes was adjusted to 6000/mL, and then added to the mycelia and covered uniformly. The mycelium was collected on the fifth day and considered the treatment group, while the mycelium collected at 0 h was considered the control group. The samples were sent to Novogene company for RNA extraction and detection, cDNA library establishment, and quality inspection. Total extracted RNA was used to synthesise cDNA using the RNALAPCR kit (TaKaRa, Shiga, Japan) according to the manufacturer's instructions.

Expression analysis of subtilase genes in Trichoderma in response to nematodes
To determine the expression profiles of subtilase genes in Trichoderma in response to nematodes, RNA-seq reads were retrieved from previous experiments performed using t-test values (p < 0.05) after Bonferroni correction served as significantly differential expressed genes. On the basis of the FPKM values, a heat map was generated using the R studio [85].

Conclusions
In the recent years, the research on protease genes and metabolites has increased considerably, with a focus on hydrolytic proteases and their metabolites. Identifying effective methods for controlling the populations of various insects, pathogenic bacteria, and nematodes; reducing the crisis of chemical control; and promoting the development of biological control are the outstanding points of the new generation of biological agents. The subtilase family plays an important physiological role in biological organisms and has extensive research and application value. In this study, 41 subtilase family genes were screened based on the T. harzianum genome database, and bioinformatics analysis was performed to determine gene characteristics, protein structure, evolutionary status, and function. According to their expression profile, we speculated their role in the process of resisting stress. The roles of subtilase gene expression differences in response to nematode stresses provide insight into the roles of defence and resistance in the mycelia of Trichoderma. These results lay the theoretical foundation for further research on the function of the subtilase gene and nematicidal mechanism.  Table S4. Detailed information of the 19 motifs in the subtilase proteins of Trichoderma harzianum. Figure S1. Detailed information of subtilase conserved domains.

Conflicts of Interest:
The authors declare no conflict of interest.