Combined Linkage Analysis and Genome-wide Association Study Reveal QTLs and Candidate Genes Conferring Genetic Control of Prolificacy Trait in Maize

1 Affiliation 1; College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China. 2 Affiliation 2; College of Life Sciences, Synergetic Innovation Center of Henan Grain Crops and National Key Laboratory of Wheat and Maize Crop Science, Henan Agricultural University, Zhengzhou 450002, China. # These two authors contributed equally to this work. * Correspondence: Zijian Zhou (zhouzijian19900601@163.com); Jianyu Wu (wujianyu40@126.com).


Introduction
Maize (Zea mays L.) is one of the world's staple food and industrial raw materials and energy crops [1]. In maize genetics and breeding, the number of ears on a plant is scored as prolificacy [2]. Yield which consists of the ear number per plant and the yield per ear is the goal of the ordinary maize. Prolificacy makes dominant ears smaller and produces invalid ears, which leads to lower yield [3]. However, the economic benefits of special maize such as silage corn, baby corn and sweet corn are closely related to the number of ears [4,5]. In addition, for genetic breeding, maize inbred lines with prolificacy can amplify the events of hybrid combinations. Therefore, it is very important to study the prolificacy trait in maize.
Typically, maize has 1-2 economic benefits ears, with only one single ear at each node. The prolificacy is that maize has 3 or more ears, including multi-node prolificacy, single-node prolificacy and multi-tiller prolificacy. Maize originated from a single domestication event of teosinte in the Balsas River Basin, Mexico, as early as 9000 years ago [6,7]. During the process of maize evolution and artificial selection, the tiller and prolificacy characters of maize gradually degenerate [7,8]. But under certain environmental conditions, the tillering and prolificacy phenomenon of maize still occurs. Besides, axillary buds can potentially develop at every node except the top 5-9 nodes below the tassel [9]. Interestingly, the ear of maize belongs to an abnormal stem, and axillary meristem in underground will develop into tillers, while that in aboveground may develop into ears. The phenomenon of apical dominance plays an important role in regulating, in which the growing apex meristem produces a long-distance inhibitory to maintain axillary buds in a dormant state axillary shoots [10]. Dormancy is tightly regulated by a complex interaction between internal and external stimuli [11]. When maize suffers high-temperature stress, diseases and insect pests, high density, or not timely pollination, the development of dominant ears is hindered, leading to the reduction of apex dominance [12]. Then, lower axillary buds develop further to form prolificacy. Besides, the upper internode of some tropical lines which grow in temperate regions, elongate slowly, and the lower axillary buds develop preferentially, which can form prolificacy. Adequate nutrients, such as increased use of nitrogen fertilizer, will also lead to the prolificacy by promoting lower axillary meristems growth [13].
Phytohormone plays an important role in the growth and development of the female panicle, among which auxin is the main plant hormone to maintain and remove apical dominance. The polar transport of auxin will affect the development of ears and lower axillary buds. Studies have shown that the regulation of auxin responses began at the initial stage of ear development [14]. Arabidopsis regulates the development of apical and basal boundaries in octagonal tissues through positive regulation of 'mid-level' auxin responses [15]. Various studies revealed antagonistic interactions between the plant hormones auxin and cytokinin (CK) in regulating bud outgrowth [16,17]. Studies in other plant species also showed that auxin and abscisic acid (ABA) inhibited bud growth, while cytokinin promoted it [18,19]. Strigolactones (SLs) and gibberellin (GA) can repress bud outgrowth while brassinosteroids (BRs) promote bud outgrowth in rice [19]. Ethylene participates in ear development and plays a negative regulatory role in the early stage of ear development [14]. Because of a strong carbohydrate requirement in dividing and differentiating cells, it makes sense that meristem maintenance, identity, and organogenesis should be carefully coordinated with nutrient status [20]. There is also increasing evidence that sugars can regulate specific developmental programs and transitions via genes that control meristem maintenance and identity [21,22]. Plasticity in plant development is controlled by environmental signals with largely unknown signalling networks. In particular, the maize G protein signaling plays a key role in ear development and prolificacy [23].
Prolificacy is a result of several factors such as initiation of axillary primordia, development of axillary bud, branch elongation, ear development and female floret development, which in turn are controlled by a large number of genes [24]. The quantitative inheritance pattern of prolificacy with the prevalence of non-allelic interactions of duplicate epistasis type has been observed. Dominance × dominance effect was predominant over additive × additive and additive × dominance effect [24]. Since the multi-ear traits of maize are derived from teosinte, QTLs and related genes controlling the multi-ear traits can be found through genetic research on the hybrid offspring of teosinte and maize inbred line. Some QTLs associated with maize prolificacy traits have been identified. Eight QTLs were identified which were associated with maize prolificacy, one QTL (prol1.1) is located on the short arm of chromosome 1 and accounts for 36.7% of the phenotypic variance, which has a much larger effect than the other seven. Fine-mapped prol1.1 to a 2.7 kb ''causative region'' upstream of the grassy tillers1 (gt1) gene, which encodes a homeodomain leucine zipper transcription factor [2]. The grassy tillers1 (gt1) is named for excessive tillering and plays a role in maize development [25]. Loss of the gt1 function alleles also cause trait changes including prolificacy in maize [2]. Teosinte branched1 (tb1) is responsible for the shortening and feminization of the axillary branches in domesticated maize [26,27]. Since both tillers (basal lateral branches) and upper lateral branches arise from axillary meristems, in a general sense, tb1 governs the fate of the axillary meristems [28].
Although the genetic mechanism of the prolificacy trait has been elucidated by the genetic populations of maize and teosinte, the genetic basis of the prolificacy among modern maize inbred lines has not been revealed in detail. In this study, two RIL popu-lations and GWAS population used to identify the QTLs and candidate genes associated with prolificacy trait in maize. Specifically, the three aims of this investigation were to (1) identify the stable QTLs of prolificacy trait, (2) identify key candidate genes, (3) analyze the genetic mechanism of prolificacy in maize.

Phenotypic analysis
Through the investigation of prolificacy traits in different environments of three populations, we analyzed the phenotypic frequency of each environment. The phenotypic distribution of each environment is similar to the normal distribution (Figure 1), indicating that the prolificacy of maize is a quantitative trait controlled by multiple genes. Besides, phenotypes in different environments in the three populations were correlated, indicating that phenotypic data structures in different environments were relatively similar, and joint analysis of phenotypic data in multiple environments in the population could be conducted.
In order to further dissect the phenotypic variation of prolificacy traits, we implemented heritability analysis in three populations respectively ( Table 1). The combined heritability of each population was 0.87, 0.83, and 0.85, respectively. This indicates that the prolificacy traits are mainly controlled by genetics.

Linkage analysis
The genetic maps of two RIL populations (BT×XI502 RIL and N6×CML170 RIL) were constructed (figure S1). There were 7,528 polymorphic SNP markers in the BT-1×XI502 RIL population, with a total length of 2541cM and an average genetic distance of 0.34cM between the markers. In the N6×CML170 RIL population, there were 3998 polymorphic SNP markers with a total length of 1450.13cM and an average genetic distance of 0.36cM between the markers.
The linkage analysis of the RIL population was carried out by composite interval mapping method. Firstly, QTL analysis was performed on prolific phenotypes in different environments. 30 and 11 QTLs were identified by the two populations respectively (Table S2 and Table S3). In addition, we performed combined analysis of multiple environments for RIL populations with the best linear unbiased estimate (BLUE) ( Table 2). A total of 13 QTLs were identified on chromosomes 1, 2, 5, 6, 7, 9 and 10 by the two RIL populations. One of these QTLs qP6-1 on chromosome 6 in the BT-1×XI502 RIL explained the highest phenotypic variation, 7.65%, and the QTL qP9-3 on chromosome 9 in the N6×CML170 RIL explained the highest, 7.69%. It is noteworthy that both two populations detected the common QTL on chromosome 9, qP9-2 and qP9-3. The overlapping interval of two QTLs will be an important object of our study. A total of 15 genes were identified in the overlapping interval 9_135605559 to 9_136357845 ( Table 3). Two of fifteen genes have been reported associated with tissue differentiation. The gene GRMZM2G317262 encoding the F-box family protein, and GRMZM2G317584 encoding Ethylene insensitive 3 family protein.  Table 3. Candidate genes and their functional annotations in qP9-2.

Genome-wide association study
In short, the GWAS population was constructed by three subgroups. The largest subgroup 1 contained tropical germplasm available at CIMMYT. Tangsipingtou heterotic subpopulation generated from Chinese very important elite materials belonged to subgroup 2, and most of the lines in Subgroup 3 were from the Chinese Reid and Lancaster heterotic subpopulation. Genome-wide association study is carried out for a single environment (figure S2 and S3). A total of 106 Significant SNPs were detected, which correspond to 186 genes, (Table S4).
Through the joint analysis of the phenotypes in 8 environments of the GWAS population ( Figure 2). A total of 8 significant SNPs were detected, which were located on chromosome 1, 5, 6 and 7, respectively. The most significant SNP site was S6_31649694 on chromosome 6. According to the LD value (10-20kb) of the GWAS population, we identified 7 candidate genes (Table 4). GRMZM2G025959, GRMZM2G088736, GRMZM5G882364, GRMZM2G141679, GRMZM2G109126, GRMZM2G046297 and GRMZM5G801004. By comparing to the QTL, the associated SNPs of S5_178621021 were localized in the interval of QTL qP5-1. The two corresponding genes GRMZM2G088736 and GRMZM5G882364 were identified based on the MaizeGDB database (http://www.maizegdb.org/), which encodes an ycf45 protein and Copper transport protein, respectively. The SNP S7_100241083 was localized in the interval of QTL qP7-1. The corresponding gene GRMZM2G141679 encodes an ethylene-responsive transcription factor. So, the genes GRMZM2G088736, GRMZM5G882364 and GRMZM2G141679, were considered important candidate genes.

Quantitative analysis
In summary, five important candidate genes were selected, namely, GRMZM2G317262, GRMZM2G317584, GRMZM2G088736, GRMZM5G882364 and GRMZM2G141679. In order to screen and verify the potential candidate genes, the relative expression levels of the above five genes were measured (Figure 3). The gene GRMZM2G088736 was found not to be expressed in different parents and different internodes. The expression level of N1 ear in BT-1 was used as a calibrator to analyze the gene expression of candidate genes. The expression level of candidate gene GRMZM2G141679 was different between different nodes, and showed an upward trend, except for the XI502 inbred line. The expression of gene GRMZM2G317584 in the four inbred lines showed an increasing trend with the decrease of ear position. The expression of these two genes was related to the differentiation of the young ear in maize and positively regulated the development of the ear. The expression level of GRMZM2G317262 in different internodes of the four inbred lines were relatively stable, and it was obvious that the gene GRMZM2G141679 and GRMZM2G317262 expression levels of multi-ear inbred lines XI502 and CML170 were lower than those in the low-ear inbred lines BT-1 and N6. The expression level of gene GRMZM5G882364 was not significantly different in the female panicle between different nodes, and between the four inbred lines.

Discussion
The development of maize ear is divided into four stages, namely, growth cone extension stage, spikelet differentiation stage, floret differentiation stage and sex organ formation stage [29]. Usually, the axillary buds of the middle and lower part of maize stay in the early stage of ear development, and only 1-2 axillary buds of the middle and upper part of maize will develop to ear. The differentiation process of the ear in maize is similar to the tassel, but the ear differentiates is late and quickly. Finally, the silk of the ear receives the pollen and develops the mature ear. The silking of the ear with different development of axillary buds can be taken as the criterion for the formation of multiple ears of maize, which had a long period and easily recognized. There is apical dominance in ear development of maize. When the dominant ear of maize is subjected to biological or abiotic stress, growth and development are hindered and apical dominance is weakened, which promotes the development of lower female ear and finally effects prolificacy trait. In abiotic stress, high temperature is a key factor affecting plant growth and development. Prolificacy can indirectly reflect the high temperature tolerance of dominant ear in maize [30]. Therefore, it is very important to analyze the genetic mechanism of maize high temperature tolerance from the perspective of prolificacy.
Through QTL analysis, 13 QTLs affecting prolificacy trait were identified, among which the range of qP1-3 (V4: 22,965,625-24,963,234) includes the prol1.1 (V4: 23,005,166-23,462,365) and gt1. QTL qP9-2 and qP9-3 located on chromosome 9 were multi-environment and multi-population stable QTLs, of which including 15 genes in the overlapping interval from 9_135605559 to 9_136357845. Combined with functional annotation, two candidate genes, GRMZM2G317262 and GRMZM2G317584 were selected. Through the Genome-wide association study, 8 significant SNPs were identified, corresponding to 7 candidate genes. Three of seven candidate genes, GRMZM2G141679, GRMZM2G088736 and GRMZM5G882364, were selected combined linkage analysis results. Further expression analysis initially studied the functions of the above five candidate genes, and the results showed that GRMZM2G317262, GRMZM2G317584, GRMZM2G141679 and GRMZM5G882364 may be associated with maize prolificacy traits.
Hormones play an important role in the growth and development of plants, among which auxin, gibberellin, cytokinin and BRs promote growth, while abscisic acid and ethylene play the opposite role. Candidate gene GRMZM2G317584 identified by linkage analysis encodes an ethylene-insensitive3 protein (EIN3/EIL). Ethylene-insensitive3 (EIN3/EIL) is a key regulator for initiation of ethylene-mediated downstream transcriptional cascades by binding to primary ethylene response elements (PEREs) and EIL conserved binding sequences (ECBSs) in the promoters of downstream genes involved in the ethylene reaction [31,32]. EIL is an important gene family in plants and plays key roles in the ethylene signaling pathway which regulates a broad spectrum of plant growth and development, as well as defenses to various biological and abiotic stresses [33]. Previous studies have shown that ethylene has a negative regulatory effect on the early ear development of maize [14]. Ethylene regulates the responses of plants to various abiotic stresses, especially in low temperature stress [34]. Ethylene biosynthesis and signaling negatively regulate plant freezing tolerance by repressing the cold-inducible CBFs and type-A ARR genes in Arabidopsis [35].
GRMZM2G141679 identified by Genome Wide Association Study encodes an ethylene-responsive transcription factor. Quantitative analysis showed that the expression level of GRMZM2G141679 was higher in the early stage of ear development and lower in the prolific parents, indicating that this gene may negatively regulate the occurrence of the prolificacy trait. The expression level of GRMZM2G317584 is higher in early ear development, indicating that the gene plays an important role in early ear development.
The gene GRMZM2G317262 encodes an F-box family protein. The F-box family protein is involved in the regulation of various developmental processes in plants, such as photomorphogenesis, circadian clock regulation, self-incompatibility, anther meristem and floral organ identity determination, etc. It is reported that the F-box gene (GenBank ID: AJ577363) was expressed during all the spike developmental stages in wheat [36]. F-box proteins contain a conserved F-box domain, which interacts with Skp1, CULLIN, and RBX1 to form the SCF complex [37]. The SCF complexes confer the specificity of selective protein ubiquitination and subsequent degradation by the 26S proteasome [38]. Meanwhile, ethylene -induced stability of EIN3/EIL1 is mediated by proteasomal degradation of two F-box proteins [35,39]. Based on the quantitative analysis of gene GRMZM2G317262, it can be seen that the expression level of this gene is relatively low in the early ear development and in the high prolificacy parent, which indicated that the gene GRMZM2G317262 negatively regulated prolificacy trait.
The gene GRMZM5G882364 encoding the Copper transport protein family. Copper is an essential metal for plants. It plays key roles in photosynthetic and respiratory electron transport chains, in ethylene sensing, cell wall metabolism, oxidative stress protection and biogenesis of molybdenum cofactor. Cu homeostasis is also receiving a growing interest in plant research, since it is implicated in adaptive responses to the oxidative damage produced by environmental stress [40]. The ethylene receptor ETR1, which localizes in the endoplasmic reticulum (ER), high affinity binding of ethylene is mediated by a copper co-factor [41].
Since the functions of the four candidate genes are related to ethylene metabolic pathways, the model was speculated, (Figure 4). Under normal circumstances, the expression level of ethylene transcription factor is low in-ear (N1 and N2), the key of the ethylene downstream cascade control factor of EIN3 expression level is low, at the same time, the expression of F-box makes EIN3 unstable, causing the low ethylene content in the ear (N1 and N2). It weakens ethylene to continue the metabolism of the binding copper ions ethylene receptor ETR1 in the endoplasmic reticulum, causing ear (N1 and N2) normal growth. After the high-temperature stress, the influx of copper ions increased, and the ethylene receptor ETR1 increased in the endoplasmic reticulum. It may be that the expression of F-box protein decreased, leading to the structural stability of EIN3, which in turn led to the enhancement of ethylene metabolism pathway, and finally inhibited the development of ear (N1 and N2), promoted the development of ear (N3 and N4), and formed prolificacy trait. In the process of maize domestication, to facilitate field work and grain harvest, traditional maize tends to have fewer ears per plant to ensure maize yield. With the development of agricultural modernization, both field work and harvest can be done by machines, and maize kernels are no longer the only harvest target. This made it impossible for the currently domesticated maize to meet modern needs. With the development of science and technology, genetics, molecular breeding and other means can accelerate the breeding process of modern maize. For different harvest targets and facing different environmental stresses, the demand for ear prolificacy traits of maize is also different, so how to regulate the number of the ear is very important. The fundamental way to excavate the genes controlling the prolificacy trait and to cultivate new varieties. In previous studies, most genetic studies were carried out based on the hybrid offspring of teosinte and maize, and the gene that can affect panicle number was also cloned, such as gt1 and tb1. However, allelic variation between teosinte and maize may be too rare to be used. Combined linkage analysis and Genome-wide association study to find allelic variation that can control prolificacy traits in maize, to develop functional markers and serve molecular breeding.

Material and field management
For QTL mapping analysis, two bi-parental RIL populations are generated using the method of single seed descent. One RIL population containing 276 lines was derived by crossing BT-1(low prolificacy) with XI502 (high prolificacy). The other RIL population containing 250 lines was generated by crossing N6 (low prolificacy) with CML170 (high prolificacy). The GWAS population consisted of 298 maize inbred lines, which have also been used in previous studies [42,43]. Three genetic populations were planted in Zhengzhou (

Phenotypic investigation and data analysis
After the pollen-shedding of tassel was finished, prolificacy investigation of maize was carried out, the ear with silks was considered effective, and the axillary bud which was not further developed was considered invalid. Here, we investigate the prolificacy phenomenon at different internodes to highlight the development ability of axillary buds, so the multiple ears (baby ears) at the same node were recorded as one ear. Investigate 10 plants in each row and take the average value of each material for further analysis.
Correlation analysis and frequency distribution visualization of phenotypic data in different environments of three populations were carried out with R package "Hmisc" and "PerformanceAnalytics", and AOV analysis. Genetic map construction, heritability analysis and linkage analysis were implemented with QTL IciMapping software from http://www.isbreeding.net/ [44]. The R package "LinkageMapView" is used to visualize the genetic map. Mixed linear model (MLM) of Tassal5 was used for the Genome-wide association study, in which BLUPs, markers, Kinship and PCA are included. The R package "CMplot" was used for visualization of QQ-plot and Manhattan plot. Candidate genes associated with the significant SNPs were searched by websites (www.maizeGDB.com) and screened out according to function annotations and previous research. The expression levels of candidate genes were analyzed by quantitative PCR.

RNA extraction and Quantitative Real-time PCR
The inbred lines BT-1, XI502, N6 and CML170 were planted in Zhengzhou in 2020, and the female ears of maize were sampled at different internodes during the silking period, from top to bottom namely N1, N2, N3, N4, N5 and N6, respectively. Three biological replicates were performed, and each biological replicate consisted of three samples. The Column Plant RNAout 2.0 (TIANDZ, China) was used to extract RNA from the samples. HIScript® Ⅲ RT SuperMix (VazymE, China) was used for the synthesis of cDNA from the RNA samples. The primers were designed with software Primer5 (Table  S1) and synthesized by the company (SunYa, China). Quantitative real-time PCR was carried out with UItraSYBR Mixture (CWBIO, China) and instrument QuantStudio5. The relative expression of candidate genes with reference to 2 -Δ Δ CT method to calculate.    Gene ID Primer