Genome-wide Investigation and Expression Profiles of Heat Shock Transcription Factor (HSF) Gene Family in Maize (Zea mays L.) under Different Growth Stages and Abiotic Stress Conditions

Heat shock transcription factors (HSFs) participate in regulating many environmental stress responses and biological processes in plants. Maize (Zea mays L.) is a major cash crop that is grown worldwide. However, the growth and yield of maize are affected by several adverse environmental inputs. Therefore, investigating the factors that regulate maize growth and development and resistance to abiotic stress is an essential task for developing stress-resilient maize varieties. Thus, a comprehensive genome-wide identification analysis was performed to identify HSFs in the maize genome. The current study identified 25 ZmHSFs, randomly distributed throughout the maize genome. Phylogenetic analysis revealed that ZmHSFs are divided into three classes and 13 sub-classes. Gene structure and protein motif analysis supported the results obtained through the phylogenetic analysis. Domain analysis showed the DNA-binding domain to be the most conserved region of ZmHSFs. Segmental duplication is shown to be responsible for the expansion of ZmHSFs. Most of the ZmHSFs are localized inside the nucleus, and the ZmHSFs which belong to the same group show similar physio-chemical properties. The 3D structures revealed comparable conserved ZmHSFs protein structures. RNA-seq analysis revealed a major role of class A HSFs including, ZmHSFA-1a and ZmHSFA-2a in all the maize growth stages, i.e., seed, vegetative, and reproductive development. Furthermore, ZmHSFs displayed an obvious spatiotemporal expression. Under abiotic stress conditions (heat, drought, cold, UV, and salinity), members of class A and B ZmHSFs are induced. Gene ontology (GO) annotation analysis indicated a major role of ZmHSFs in resistance to environmental stress and regulation of primary metabolism. Further, the protein-protein interaction analysis showed that ZmHSFs interact with several molecular chaperons and major stress-responsive proteins. To summarize, this study provides novel insights for functional studies on the ZmHSFs in maize breeding programs.

under HS, while the expression of several genes was down-regulated even under normal growth conditions [29]. HSFA2 is responsible for the extension of acquired thermotolerance (AT) in Arabidopsis [31]. In hsfa2 mutants, the duration of AT was compromised, and the expression of HS-inducible genes was down-regulated. Lämke et al. reported that HSFA2 promotes the sustained activation of several HS-memory genes through methylation of the target genomic region [32]. The transcripts of HSFA2 are undetectable under normal conditions. However, after HS, the HSFA2 becomes the most strongly induced HSF in plants [18,33]. Yoshida et al. reported that the overexpression of HSFA3 improves plant thermotolerance, while the T-DNA mutants showed reduced thermotolerance [34]. Lin et al. reported that HSFA2, HSFA4a, and HSFA7a are essential for HSR and cytoplasmic protein response. HSFs have also been characterized in several crop plants [35]. It was reported that OsHSFA4a and its homolog in wheat TaHSFA4a, confers cadmium tolerance to plants [36]. The expression of TaHSFA2-10 is induced in response to HS, oxidative stress, salicylic acid, and its overexpression improve plant thermotolerance [37]. In addition, HSFs are also involved in the regulation of growth and development in plants [16]. For example, HSFA9 is expressed specifically during embryogenesis and maturation in the seeds of Arabidopsis [38]. Albihlal et al. reported that in Arabidopsis, at least 85 development-associated genes are controlled by HSFA1b [39]. Authors proposed that the HSFA1b allows plants to adjust growth and development under continuously varying environments by transducing external stimuli to stress-associated and development-related genes.
HSF gene family has been characterized in several plant species, including Arabidopsis thaliana [20], Oryza sativa [40], Zea mays [41], Glycine max [42], Populus trichocarpa [43], Solanum lycopersicum [44], Brachypodium distachyon [45], and Triticum aestivum [46]. However, the role of HSFs in plant growth and development and in responses to stresses other than HS, is not very well understood in maize. Computational biology approaches provide a convenient and reliable platform upon which further wet-lab studies could be carried out. Here, we perform an extensive in-silico analysis of maize HSFs to gain better insights into the functional role of maize HSFs in growth and development and tolerance to multiple abiotic stresses.

Sequence Retrieval
The protein sequences of Zea mays HSFs were extracted from PLAZA 4.5 (https://bioinformatics.psb.ugent.be/plaza/versions/plaza_v4_5_monocots/) and Ensembl Plants (https://plants.ensembl.org/index.html) databases. For this, the BLASTP search was performed using the Arabidopsis (AT4G17750, AT2G26150, AT5G03720, AT4G36990, and AT3G24520) and Zea mays (GRMZM2G115456, GRMZM2G002131, GRMZM2G086880) HSFs against maize genome Zm-B73-REFERENCE-NAM-5.0, as a query to obtain a putative list of HSFs in the maize genome. These sequences were checked for the presence of DNA binding domain (DBD) and oligomerization domain (OD) through EMBL-EBI employing the hidden Markov model (HMM) (https://www.ebi.ac.uk/Tools/hmmer/search/phmmer) and SMART (http://smart.emblheidelberg.de/) tools [47]. The coiled-coil structure, which is a property of OD were predicted using MARCOIL [48]. After carefully analyzing the sequence, the proteins that lacked DBD and/or OD (HR-A/B region) were removed. In addition, the redundant proteins that were the product of a single gene were also discarded from further analysis. Finally, a total of 25 maize HSFs genes were selected for further analysis.

Sequence Alignment and Phylogenetic Analysis
The protein sequences of Arabidopsis thaliana, Oryza sativa, Brachypodium distachyon, and Sorghum bicolor were aligned with Zea mays HSFs through Clustal W [53]. An unrooted phylogenetic tree was constructed using the Molecular Evolution Genetic Analysis X (MEGA X) tool [54]. The tree was constructed via the Neighbor-joining method with 1000 bootstrap values.

Gene Duplication and Evolutionary Analysis
Gene duplication events were investigated by following two parameters: (1) the length of alignable sequence cover > 80% of the longer gene; and (2) the similarity of the aligned regions > 70% [55,56]. To analyze the molecular evolutionary rates of duplicated gene pairs, the non-synonymous substitution (Ka) and synonymous substitution (Ks) ratio were calculated using Ka/Ks calculation tool (http://services.cbu.uib.no/tools/kaks). Based on a rate of 6.1 x 10-9 substitutions per site per year, the divergence time (T) was calculated as T= Ks/ (2 x 6.1 10 -9 ) x 10 -6 million years ago (Mya) for HSF genes [57]. The duplicated HSF genes were connected using Tbtools [58].

Chromosomal Distribution and Expression Profiling of HSF Genes
Based on their initial positions on the maize genome, the HSF genes were named, and a chromosomal graph was constructed using Tbtools. The RNA-seq data utilized in the current study was retrieved from maize MaizeGDB (https://qteller.maizegdb.org/genes_by_name_B73v4.php). For this, the comprehensive maize gene expression analysis performed by Stelpflug et al. was downloaded [59]. The present study included the analysis of the growth, tissue, and stress-specific expression of maize HSFs. The transcript per million values was normalized using the log-2 base, and a heat map was constructed using Tbtools.

Identification and Chromosomal Distribution of Maize HSFs
With the availability of the genomic sequences of the number of plant species, including maize, it is now possible to obtain the protein sequences of all the HSF members. In the present study, a total of 25 HSFs were identified from the maize genome ( Figure 1; Table 1). All the HSF proteins were surveyed for the presence of DBD and OD through EMBL-EBI, employing HMM. Furthermore, SMART was used to search the HSF-DBD to check the accuracy of the results. After discarding redundant sequences, 25 ZmHSFs were selected for analysis. These HSFs were named based on their chromosomal locations (ZmHSF-01 to ZmHSF-25) ( Table S1). The characteristics of maize HSF genes have been presented in Table 1. All the HSFs were mapped on the chromosomes of maize ( Figure 1). The maize genome was shown to possess HSF genes on all of its chromosomes, though the number of HSFs between different chromosomes varied considerably. Chromosome 1 had a maximum of 6 HSFs genes, whereas a single HSF gene copy was localized on each of chromosomes 4, 6, and 10. On the other hand, chromosomes 2, 3, 7, and 9 harbors two gene copies each, while three gene copies were recognized on each of chromosomes 5 and 8. Except for ZmHSF01, ZmHSF02, ZmHSF12, ZmHSF16, and ZmHSF23, all the other HSF genes were present on the lower arm of the chromosomes.

Phylogenetic Analysis and Classification of Maize HSFs
In present study, the evolutionary relationship among AtHSFs, OsHSFs, SbHSFs, BdHSFs, and ZmHSFs were explored. A total of 118 HSFs were divided into three classes and 15 sub-classes based on the phylogenetic tree ( Figure 2; Table S2). Variation in HSF gene number was observed among different plant species and between sub-groups (  Table 2). Results indicated that maize HSFs were closed to sorghum HSFs. Similarly, HSFs of rice were closer to Brachypodium HSFs, which is in line with the botanical classification among monocots. In contrast to Arabidopsis, sub-class A1 contains less HSFs in monocots (Table 2). On the other hand, sub-class A2 appears to have expanded in monocots. There was no ortholog of Arabidopsis HSFA9, HSFB3 in monocots that might reflect the loss of these sub-groups in family Poaceae ( Figure 2, Table 2). Contrarily, this could also signify the gain of these sub-groups in dicots. A higher number of class C HSFs was observed in monocots. The AtHSFA2, AtHSFC1 did not align with subclass A2, C1 in the present study, which is in alignment with the results reported by Lin et al. [41].

Gene Duplication Analysis and Evolutionary Rate Calculation
Gene duplications play an important role in evolution as duplications cause genes to generate gene families [64]. In fact, it has been proposed that tandem and segmental duplications have been the primary driving source of evolution as these events lead to expansion of gene families, and generation of proteins with novel functions [65]. Tandem duplication involves the duplication of two or more genes located on the same chromosome, while segmental duplication refers to the phenomenon when genes belonging to the same clade but located on different chromosomes are duplicated [66]. In the present study, a total of 18 (18/25; 72 %) maize HSF genes were shown to be duplicated (Table 3). Further, only one pair of a gene (ZmHSF-01/Zm-HSF-04) appeared to be tandemly duplicated, which was recognized on chromosome number 1 ( Figure 3). The rest of duplicated genes were all segmentally duplicated, with eight different clusters containing 16 genes. These genes were recognized on chromosomes 1-9. Moreover, the molecular evolutionary rate of tandemly and segmentally duplicated HSF genes was calculated to gain insights into the selective constraints on the duplicated HSF genes. The ratio of Ka and Ks substitution rate is an effective method to investigate the selective constraint among duplicated gene pairs [67]. Hence, in the present study, Ka, Ks, and Ka/Ks values for each pair of the paralogous gene were calculated (Table 3). In principle, the value of Ka/Ks < 1 signifies the purifying selection (negative selection), Ka/Ks > 1 signifies positive selection, and Ka/Ks = 1 means neutral selection [57]. Here, 18 ZmHSF genes were shown to be duplicated. The Ka/Ks ratio for duplicated ZmHSF genes ranged from 0.3415 to 0.7572. All the HSF genes in the present study have Ka/Ks value < 1.
The outcome suggests that these genes were under strong purifying selection pressure by natural selection during the course of evolution. Further, the divergence periods of tandemly and segmentally duplicated ZmHSF genes were estimated to range from 5.96 to 38.04 with an average of 20.89 million years ago (MYA). Some paralogous gene pairs (ZmHSF-02/ZmHSF-24, ZmHSF-09/ZmHSF21, and ZmHSF-16/ZmHSF20) appeared to be recently duplicated ( Table 3). The grass species are estimated to have diverged around 56-73 MYA [68,69]. In the present analysis, all the HSF genes in maize appeared to have duplicated after the divergence of grass species. Further, most of the HSF genes in maize are paralogs, and it can be concluded that duplication events (primarily segmental) played a significant role in the expansion of the HSF gene family in maize. Table 3. Duplicated gene pairs, non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), non-synonymous to synonymous substitution rate ratio (Ka/Ks), estimated time of duplication event in a million years ago (MYA), and mode of gene duplication.

Gene Structure and Protein Motif Analysis
To investigate the structural relationship among the HSF genes, the intron-exon organization of all the targeted HSFs was analyzed using GSDS software. The intron-exon structure and number play a key role in the evolution of gene families [70,71]. The gene structure analysis was in line with the phylogenetic relationship among maize HSFs (Figure 4). In general, the intron and exon numbers were shown to be highly consistent. Particularly, 92% (23/25) HSFs contain only one intron except for HSF-02 and HSF-24 ( Figure  4). Similarly, HSF-02 and HSF-24 were shown to contain 3 and 5 exons. In contrast, the rest of the HSF genes contained two exons. Further, 17 HSFs contained 5′ UTR and 3′ UTR. The HSF genes belonging to the same class and sub-class showed a similar intron-exon pattern in terms of intron number, exon length, intron phase, and overall gene length (Figure 4). MEME was used to identify the conserved motifs/regions responsible for DNA-binding, oligomerization, nuclear localization, nuclear export, and biological activation of HSFs ( Figure 5). In total, 20 motifs designated as motifs 1-20 were identified among maize HSF proteins (Table 4). The highly conserved DBD is represented by motif 1, 2, and 4. Motif 3 corresponds to OD of class A and C HSFs. The HR-A core region of OD is represented motif 5 and is present in all HSFs. OD of class B HSFs is depicted by motif 7. Motif 6 constitutes the NLS of class A HSFs and is present in eight members. The AHA motif is shown by motif 8 and is present in 11 class A HSFs. The NES of class A HSFs is represented by motif 16. Motif 15 constitutes the NLS of Class B2 HSFs. Certain HSFs also possess class-specific conserved motifs i.e., ZmHSF-02 and ZmHSF-24 contain motif 10 and 13, HSFs of subclass B1 harbor motif 14, sub-class C1 HSFs motif 19, the sub-class B2 HSFs motif 11, and sub-class A4 members (ZmHSF-16, ZmHSF-20) contain motif 18. Motif 9 and 12 are present in the same members of class A HSFs. Four members of class A HSFs contain motif 17. Motif18 is found in two members of class A HSFs. Finally, motif 20 is found in members of sub-class A2 and A6 HSFs.

Domain Analysis and Physio-chemical Properties
The modular structure and the functional domains of HSFs have been studied and described extensively [22]. The HSF-type DBD was highly conserved and consisted of approximately 100 amino acids ( Figure 6). The locations of DBD and OD were predicted using SMART and MARCOIL ( Table 5). The DBD of most maize HSFs was located at the beginning of the N-terminal. Few exceptions were ZmHSF-03, ZmHSF-10, ZmHSF-14, and ZmHSF-15. As expected, the linker length between DBD and OD of HSBs was larger than HSFAs and HSFCs.
The physio-chemical properties of HSFs such as amino-acid length, Mw, and pI were investigated using Expasy (Table 5). In addition, the amino acid composition of each group was analyzed using the online tool CoPId (Table S3). The amino acid length of class A HSF ranged from 350 (ZmHSF-23) to 528 (ZmHSF-14), for class B, 298 (ZmHSF-08) to 414 (ZmHSF-03) and in class C, the amino acid length ranged from 257 (ZmHSF-13) to 348 (ZmHSF-21  . In general, the pI of class A HSFs was in acidic ranges except for ZmHSF-10, which was shown to be in a slightly basic range. The pI of class B was in slightly acidic and basic ranges. Finally, for class C the pI was in similar ranges as class B HSFs. The average amino acid composition of ZmHSFs ranged from 1.1 (cysteine) to 10.0 (alanine) (Figure 7). The average amino acid composition of class A HSFs ranged from 0.8 (cysteine) to 8.7 (alanine). In contrast, the average amino acid composition of class B ranged from 1.3 (tryptophan) to12.6 (alanine). Finally, the class C amino acid composition ranged from 1.1 (tryptophan) to 11.2 (alanine).

Proteins Structure and Sub-cellular Localization of Maize HSFs
The protein structures of maize HSFs were predicted through Alphafold. The predicted models were downloaded to view their 3D structure (Figure 8). The highly conserved DBD is represented by α-helices and β-sheets. The OD can be seen to be linked with DBD through linker residues. Most of maize HSFs were predicted to be localized inside the nucleus (Table 5). Exceptions were ZmHSF-10 and ZmHSF-23 (HSFA7a and HSFA7b). This indicates that class 7 HSFs might possess distinct properties.  Table 5.

Expression Profiles of Zea mays HSFs during Different Developmental Stages
The expression patterns of Zea mays HSF genes were investigated during different growth stages and across different time scales. Although gene expression pattern is not always directly related to protein abundance, the transcriptome profiles can still provide insights into the probable role of genes in particular biological processes [72]. The transcriptome data utilized in the present study was downloaded from the maize genome database [59]. We analyzed HSF profiles in 3 different growth stages (seed, vegetative, and reproductive) and across 20 different tissues (embryo, endosperm, whole seed, primary root, tap root, whole root, stem and shoot apical meristem, immature leaves, tip of stage 2 leaf, mature leaf tissue, pooled leaves, topmost leaves, vegetative meristem and surrounding tissues, immature tassel, meiotic tassel, anthers, mature pollen, mature female spikelet, pre-pollination cob, immature cob, and silks) (Table S4). Furthermore, the expression patterns were investigated at different timescales in a particular developmental stage to get an overview of the spatiotemporal expression of HSFs.
The ZmHSF-04, ZmHSF-05, and ZmHSF-06 are shown to be the most highly induced HSFs across all the tissues during different growth stages (Figure 9 A,

Expression Pattern of Zea mays HSFs under Abiotic Stresses
Maize growth, development, and yield is adversely affected by several abiotic stresses [73]. Therefore, to examine maize HSFs expression under different abiotic stress events, RNA-seq data was analyzed and a heat map was constructed ( Figure 10;  Figure 10). However, some HSFs were only induced by a particular stress. For example, higher transcripts of ZmHSF-24 are only detected after HS treatment. Similarly, ZmHSF-16 is slightly overexpressed after cold treatment. While some HSFs showed almost no expression under any stress condition. These include ZmHSF-03, ZmHSF-08,  and ZmHSF-22 ( Figure 10).

Functional Annotation of Maize HSFs
HSFs have been reported to play a major role not only under stressful conditions but also in plant growth and development [16,72]. Therefore, the regulatory functions of maize HSFs were predicted through GO annotation investigation based on the biological process (BP), molecular function (MF), and cellular component (CC) classes ( Figure 11; Table S6). The BP annotation analysis indicated that maize HSFs are mainly involved in cellular response to heat (GO:0034605), response to heat (GO:0009408), response to temperature stimulus (GO:0009266), Regulation of transcription by RNA polymerase II (GO:0006357), response to abiotic stimulus (GO:0009628), cellular response to stress (GO:0033554), regulation of RNA biosynthetic process (GO:2001141), etc. (Figure 11). With regards to MF annotation analysis, it was revealed that maize HSFs are mostly involved in RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), cis-regulatory region sequence-specific DNA binding (GO:0000987), RNA polymerase II transcription regulatory region sequence-specific DNA binding (GO:0000977), transcription regulatory region nucleic acid binding (GO:0001067), transcription cis-regulatory region binding (GO:0000976), sequence-specific double-stranded DNA binding (GO:1990837), double-stranded DNA binding (GO:0003690), etc. ( Figure  11). The CC annotation study showed that ZmHSFs are majorly involves in the nucleus (GO:0005634), intracellular membrane-bounded organelle (GO:0043231), membranebounded organelle (GO:0043227), intracellular organelle (GO:0043229), organelle (GO:0043226), etc. (Figure 11). To conclude, the GO annotation study confirms the role of maize HSFs in regulating abiotic stresses and plant metabolism. Figure 11. Gene ontology enrichment analysis. The figure depicts the predicted role of HSFs in biological processes, molecular functions, and cellular components.

Protein-Protein Interaction Network Analysis
The protein network interaction analysis can help understand the biological functions and mechanisms of protein function [74]. Since both the RNA-seq data and GO annotation analysis suggested the role of HSFs in stress conditions and normal growth, we performed network analysis to predict the interacting partners of ZmHSFs (Table S7). The results showed that maize HSFs interact not only with themselves but also with a range of proteins with well know functions in cellular growth and stress responses ( Figure 12). For example, HSFs were shown to interact with molecular chaperons HSP101, HSP82 (belongs to HSP90 family), HSBP-2, and DnaJ-like protein (belongs to the HSP40 family). It was reported that HSP101 and HSA32 interact with each other and promote acquired thermotolerance in Arabidopsis [75]. The HSP82 was reported to be induced by higher temperatures. A higher concentration of HSP82 is required for normal cellular growth in yeast at higher temperatures [76]. Gu et al. reported that maize HSBP-2 and HSFA2 interact with each other and modulate raffinose biosynthesis [77]. HSFA2 was shown to bind to the promoter sequence of HSBP-2 and activate its expression. Higher raffinose synthesis improved HS tolerance of Arabidopsis thaliana. The DnaJ-like proteins are molecular cochaperones that interact with HSP70s and control protein homeostasis [78]. DnaJ proteins have been reported to play a critical role in plant growth, development, and HS tolerance [78][79][80]. ZmHSFs also interact with two major proteins, i.e., multi-protein bridging factor 1c (MBF1c) and DREB2A. Both these proteins have been shown to accumulate under diverse abiotic stress conditions. DREB2A is a major protein, and its overexpression improves plant HS, drought stress, cold stress, etc., tolerance [81]. MBF1c is a transcriptional co-activator that modulates the expression of DREB2A, some HSFs, and phytohormones [3]. Interestingly, MBF1c is necessary for basal thermotolerance but not for acquired thermotolerance [82]. In addition, MBF1c is also shown to be required for plant developmental responses [83].
Maize HSFs are also predicted to interact with SUMO proteins. SUMOylation is a post-translational phenomenon where SUMO proteins are covalently attached and detached to target proteins [84]. This process affects several biological processes inside the cell, including transcriptional regulation of gene expression, apoptosis, programmed cell death, cellular response to stress, stability of proteins, etc. [84]. Rytz et al. reported that SIZI, a SUMO protein, targets multiple TFs, chromatin remodelers, transcriptional co-activators/repressors connected to abiotic and biotic stress responses [85]. This suggests maize HSFs may also be SUMOylated under diverse biological conditions and stress responses. To conclude, PPI analysis was in alignment with the RNA-seq and GO annotation analysis which indicated that HSFs of Zea mays play an important role not only in abiotic stress conditions but also in maize growth and metabolism.

Discussion
Maize (Zea mays) is a major cereal crop that is widely cultivated worldwide for food, feed, fiber, and fuel. Maize is also considered a model plant for basic and applied research in plant science [86]. Unraveling the factors regulating the growth and stress resistance would contribute significantly to the development of climate-smart, stress-resilient maize cultivars with higher agricultural productivity. The sequencing of maize genome (B73 inbred line) in 2009 opened the plethora of opportunities to identify, analyze, and characterize stress-associated genes in maize [87]. To provide food security in the scenario of climate change and ever-growing world population, it is imperative to understand the molecular mechanisms behind plant stress resistance and explore genetic resources associated with the higher crop yield [3,15]. HSFs have been identified in several plant species, including important crops. The Arabidopsis thaliana, Oryza sativa, Zea mays, Glycine max, Populus trichocarpa, Solanum lycopersicum, Brachypodium distachyon, Sorghum bicolor, and Triticum aestivum contain 21, 25, 25, 38, 28, 26, 24, 23, and 61 HSFs in their genomes [20,[40][41][42][43][44][45]74,88]. Following the sequencing of several plants, it is found out that the number of HSFs may be independent of the genome size [74]. For example, Arabidopsis thaliana (135 Mb) contains 21 HSFs [20,89], while Medicago truncatula (375 Mb) harbors 15 HSFs [43,90]. Similarly, 25 HSFs are found in Oryza sativa (430 Mb) [40,91] and an equal number of HSFs are also present in Zea mays (2.4 Gb) [41,87]. Even though the HSF gene family was previously characterized by Lin et al. [41], our work differs from theirs in multiple aspects. Their research was mostly restricted to the identification and classification of ZmHSFs. On the other hand, this comprehensive study particularly focused on the evolutionary analysis, expression profiling, GO, and PPI networks to explore the probable regulatory role played by ZmHSFs under benign and stress conditions.
The distribution of the HSF gene family in maize was analyzed by constructing a chromosomal map (Figure 1). The fact that all the chromosomes harbor at least one HSF gene suggests that the most recent common ancestor of Zea mays has HSF genes distributed widely in its genome. Phylogenetic analysis indicated maize HSFs are divided into three classes and further into 13 sub-classes which is consistent with the HSF class number observed in other monocots. For example, the HSFs of Oryza sativa, Brachypodium distachyon, Sorghum bicolor, and Triticum aestivum are also divided into three classes and 13 subclasses [40,45,46,88]. Despite that, differences among HSF numbers were observed in different sub-classes between monocots ( Figure 2, Table 2). For example, compared to rice, Brachypodium, and Sorghum, the sub-family B4, A2, and C2 contain less HSF members in maize. On the other hand, the sub-class A1, A4, A8, B1, and B2 is expanded in Zea mays ( Figure 2, Table 2). Gene duplications generate new genes and provide novel possibilities for evolutionary success [92]. In the present analysis, 9 pair of ZmHSFs were shown to be paralogs (Table 3, Figure 3). The results indicate that segmental duplication events have played a major role in the expansion of the HSF gene family in maize. An increase in gene regulatory repertoire such as transcriptional regulators, developmental regulators, signal transducers, etc., is a prerequisite for the evolution of complex systems in different organisms [22,93]. Since the gene duplication events result doubling of a single gene which cannot account for such large expansions, it has been suggested that whole-genome duplication (WGD) events have been instrumental in expanding the regulatory repertoire of plants [92]. It is assumed that the Arabidopsis genome went under two rounds of WGD in the past 60-70 million years [94,95] and that more than 90% increase in regulatory genes have been caused by duplication events in Arabidopsis in the past 150 million years [94]. This suggests that an increase in the HSF gene members in plants accounts for WGDs. Additionally, segmental duplications occur in gene families, which evolve at a slower rate [65]. It is thought that the increase or decrease in exon number play an important role in the evolution of a gene family [96]. Therefore, we investigated the number and distribution of introns and exons in ZmHSF genes. Our results showed that the ZmHSF genes contain 2 exons and 1 intron except for ZmHSF-02 and ZmHSF-24 ( Figure 4; Table 1). Moreover, the length and position of exons and introns were well conserved in the same sub-classes but varied considerably between different sub-classes.
The previous investigations showed that HSFs play an important role in plant growth [39]. Therefore, we investigated the tissue-specific expression of ZmHSFs in twenty different developmental tissues using RNA-seq data ( Figure 9A, B, C). Several genes showed an enhanced expression that reflects their role under various developmental stages. In particular, ZmHSF-05 (A-2a) and ZmHSF-06 (A-1a) were highly expressed almost across all the growth phases. The hsfa1abde quadruple mutants displayed abnormal phenotype and growth retardation, implying HSFA1s is also involved in developmental processes [29]. Interestingly, HSFA2 could rescue the developmental defects of hsfa1abde quadruple mutants [97]. This further supports the result obtained from our analysis and provides a strong base for further wet-lab studies to characterize the function of ZmHSF-05 and ZmHSF-06 in plant growth and development. Similarly, HSFs have been reported to play a key role in plant acclimation to abiotic stress conditions. Kumar et al. reported that TaHSFA6e modulates tolerance of wheat to HS and drought stress during pollination and grain filling stages [98]. Yokotani et al. reported that OsHSFA2e improves Arabidopsis tolerance to HS and salinity stress by activating the expression of HSPs [99]. Thus, the expression patterns of ZmHSFs were evaluated under abiotic stress conditions. Most of ZmHSFs displayed stress-specific expression with some HSFs showing up-regulation only under particular stress events ( Figure 10). Jiang et al. reported that ZmHSF-04 improves plant tolerance to HS, salinity stress and increases the sensitivity to abscisic acid [100]. Similarly, ZmHSF-12 overexpression improves plant basal thermotolerance and AT [101]. These results are in line with our analysis which showed a higher transcript accumulation of this TFs under respective stress conditions ( Figure 10).
The PPI analysis indicated that maize HSFs interact with molecular chaperones and stress-associated proteins ( Figure 12). Molecular chaperones are present inside the cells and are constitutively expressed under normal conditions or are induced under specific developmental stages or stress conditions [102]. These chaperons perform various functions under physiological conditions inside the cells, such as signaling, folding and stabilization, translocation, and degradation of proteins [11,102]. Under harsh environmental conditions, molecular chaperones act as powerful buffers to limit protein misfolding/unfolding and prevent protein aggregate formation that might be otherwise toxic to plant cells [25]. Here, the ZmHSFs was shown to interact with chaperons belonging to different families (HSP101, HSP90, HSP40) and genes with a well-known role in thermotolerance (HSA32, HSP82, HSBP-2). DREB2A is a major transcriptional activator that functions downstream of HSFA1s dependent transcriptional cascade in Arabidopsis thaliana [3,12,15,29]. Similarly, MBF1c is a major protein that is characterized by its role in the regulation of abiotic stress responses and growth in plants [3,82,83]. SUMO proteins are attached to their target proteins and modify their biological activities under various physiological and stress conditions [84]. In Arabidopsis, SUMOylation has been proposed as one of the molecular mechanisms that are responsible for the activation of HSFA1s [12]. Many HSFs in Arabidopsis such as HSFA1d, HSFA2, and HSFB2B have the potential to be SUMOylated [103]. In tomato, the knockout of SIZI (a SUMO ligase) reduces plant thermotolerance [104]. Here, ZmHSFs was shown to interact with all these proteins (DREB2A, MBF1c, and SUMO proteins), which further confirms their role in the regulation of abiotic stress response.
Taken together, our present analysis provides strong support for the positive role of HSFs in the growth and development of maize by the regulation of primary metabolism. Furthermore, HSFs of maize interact with the major stress-responsive proteins and confer abiotic stress resistance.

Conclusion
Here, we identified a total of 25 HSFs from the maize genome through genome-wide investigation analysis. To gain a better understanding of phylogenetic analysis, gene structure and conserved protein motif analysis, gene duplication and evolutionary analysis, domain analysis and physio-chemical properties, protein 3-D structure, GO and PPI network analysis, and expression profiles of ZmHSFs under various developmental stages and stress conditions were studied. The results indicate ZmHSFs play a major role in plant growth and stress responses. Therefore, additional studies are recommended to confirm the proposed role of HSFs in maize growth, development, and response to numerous environmental cues.

Supplementary Materials:
The following supplementary material is available online.  Table S5: TPM values of Zea mays HSFs under multiple abiotic stress conditions;  Table S6: GO annotation file of Zea mays HSFs; Table S7