Different distribution of soil core microbiota in two places of Huabei plain, central China

Soils harbor diverse bacteria, which play important roles in soil nutrient cycling and carbon storage. Numerous investigations of soil microbiota have been performed, and the core microbiota of different soil and vegetation types have been described. However, the complexity of soil environments and the limited information available for many geographic areas call for a more comprehensive exploration of soil microbes across soil types. To reveal the core soil microbiota of the Huabei plain, soil samples from metropolitan and countryside regions of the plain were investigated using a high-throughput sequencing strategy. The most dominant bacterial phyla were Proteobacteria (38.34%), Actinobacteria (20.56%), and Acidobacteria (15.18%). At the genus level, the most abundant known genera were Gaiella (3.66%), Sphingomonas (3.6%), Acidobacteria Gp6 (2.1%), and Arthrobacter (2%). Moreover, several dominant operational taxonomic units (OTUs), such as OTU_3 and OTU_17, were found to be associated with the soil environment. The microbial distribution of the metropolis samples differed from that of the countryside samples, which may reflect that environments in the countryside were more diverse than those in the metropolis. Microbial diversity and evenness were higher in the metropolis than in the countryside, which might be due to human activity increasing microbial diversity in the metropolis. The soil core microbiota of the Huabei plain was complex, and microbial distributions in the plain might be affected mainly by human activity and environmental factors rather than by geographic distance. Our data highlight the soil core microbiota of the Huabei plain and provide insights for future studies of soil microbiota distribution in central China.


Introduction
Soils harbor abundant microbial resources and contain high numbers of microbes [1,2]. Among these microbes, bacteria play important roles in many processes, especially carbon storage and nutrient cycling [3][4][5]. Environmental factors play important roles in microbial distribution, whereas geographic distance has little effect on soil microbial diversity [6][7][8]. A global analysis of drylands indicated that increasing aridity reduces soil microbial diversity [6]. Recent analyses of the structure and function of the global topsoil microbiota demonstrated that microbial distribution patterns are mainly associated with precipitation and soil pH [2,9]. The microbial communities that develop during corpse decomposition are similar across soils under different vegetation types, and the dominant factor driving community development is nitrogen and carbon input [10]. Moreover, deforestation affects the soil microbiota, and alpha diversity increased after slash-and-burn forest clearing in the Amazon [11].
Microbial distribution differs among biogeographical areas, and the dominant bacteria in soils worldwide are Proteobacteria, Actinobacteria, Verrucomicrobia, Acidobacteria and Firmicutes [12,13]. The most dominant bacteria in drylands are assigned to Actinobacteria, which compose 23%-29% of the total bacteria [6], and the desert soil microbiota is clearly distinct from the microbiotas of nondesert soils [14]. In relatively undisturbed soil samples collected from North America, the most dominant bacteria are Acidobacteria, Verrucomicrobia and Bacteroidetes [15,16]. An investigation of East European plain soils showed that the most abundant bacteria are Actinobacteria, Proteobacteria and Verrucomicrobia [17]. Soil microbial diversity is affected by vegetation type, and the rhizospheric microbial distributions of different plants are distinct [18,19]. Although many efforts have been made to understand global soil microbial distribution, such as the Earth Microbiome Project [20], the microbial distribution in many geographic areas is still unknown.
Here, we present a soil microbial community study that uses a high-throughput sequencing approach to assess microbial diversity in two different areas of the Huabei plain, central China. The sampling sites, which encompass a countryside area and a metropolitan area, have been used for agriculture for thousands of years. We analyze the dominant microbes in these samples and compare their microbial distributions. Moreover, the potential relationships between the soil samples and environmental factors are discussed.

Sample collection and analysis
Thirteen soil samples were collected in March 2018 from two regions of the Huabei plain, Xincai county and Zhengzhou, both in Henan province, China (Table 1). The sampled soils at both sites are moist clay (Figure S1). Seven soil samples were collected from Xincai county (countryside, named the XC group), and the other six were collected from Zhengzhou (metropolis, about 300 kilometres from Xincai, named the ZZ group). The samples were taken at a depth of 5-10 cm, transferred to the laboratory, and stored at -20 ℃ before use (Table 1). To measure pH, 0.5 g of soil from each sample was thoroughly mixed with 2 ml of water, and the pH was read with a digital pH meter (Shanghai Lei-ci Co. Ltd) [21]. Temperature and other soil parameters were obtained from the public database of the China Meteorological Administration (Table 1).

Soil DNA extraction
Soil DNA was extracted from 0.5 g of soil from each sample. The soil was first prewashed with 1 ml of 0.5 M EDTA to remove organic matter [22,23] and collected by centrifugation at 12000 rpm for 5 min. Each prewashed soil precipitate was further treated with 0.6 ml of 0.5 M CaCl2 and 1.4 ml of ddH2O and again collected by centrifugation at 12000 rpm for 5 min [22]. The pretreated soil was lysed with 1 ml of DNA extraction buffer (100 mM Tris-HCl, 100 mM EDTA, 100 mM sodium phosphate, 1.5 M NaCl, and 1% (w/v) cetyltrimethylammonium bromide, pH 8.0), 2 µl of proteinase K (20 mg/ml) and 200 µl of 20% SDS by incubation at 65 ℃ for 2 hours. The crude lysate was centrifuged at 17000 g for 10 min and the supernatant was collected. The DNA in the supernatant was purified twice with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) and once with chloroform:isoamyl alcohol (24:1). The final supernatant was precipitated with 0.6 volumes of isopropanol, and the soil DNA was collected by centrifugation at 12000 rpm for 5 min. The DNA was dissolved in 30 µl of TE buffer containing 2 µl of RNase (10 mg/ml), and RNA was removed by incubation at 37 ℃ for 30 min [24].

16S rDNA gene fragment amplification and soil microbial community analyses
The V3-V4 regions of microbial 16S rDNA genes were amplified with primers 341F (5'-CCTAYGGGRBGCASCAG-3') and 806R (5'-GGACTACNNGGGTATCTAAT-3'). The 25 µl PCR mixture contained 25 ng DNA, 1 µl of each primer (10 µM), 0.5 µl dNTPs (2.5 mM), 12.5 µl 2× Vazyme Phanta Max buffer, and 0.5 µl Vazyme polymerase (Vazyme Biotech). PCR was performed with an initial denaturation of 5 min at 95 ℃, followed by 27 cycles of 15 s at 95 ℃, 15 s at 55 ℃, and 30 s at 72 ℃, and a final extension of 5 min at 72 ℃. The PCR products were purified with KAPA Pure Beads (Roche) according to the manufacturer's instructions and sequenced on an Illumina MiSeq system (Illumina). The raw reads were processed and analyzed as described previously [25]. Operational taxonomic units (OTUs) were clustered at 97% identity. Principal coordinates analysis (PCoA) and non-metric multidimensional scaling (NMDS) based on unweighted UniFrac distances were performed with the Vegan 2.4.2 package in R. The raw reads have been submitted to the NCBI Sequence Read Archive (SRA) database under accession numbers SAMN10602944-SAMN10602956.
Table 1. Characterization of the sampling sites.
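Both primers contain IUPAC degenerate bases (Y, R, B, S, N), so each primer is in practice a pool of related sequences. As an illustration only, and not part of the pipeline used in this study, the following Python sketch counts and enumerates the sequences encoded by each primer. The function names `degeneracy` and `expand` are hypothetical helpers introduced for this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMER_341F = "CCTAYGGGRBGCASCAG"
PRIMER_806R = "GGACTACNNGGGTATCTAAT"


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in (("341F", PRIMER_341F), ("806R", PRIMER_806R)):
        print(name, len(seq), "nt,", degeneracy(seq), "variants")
```

Under these codes, 341F expands to 24 sequences (Y, R and S each contribute two options and B contributes three) and 806R expands to 16 (two N positions).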

Overall soil microbial community composition
Thirteen soil samples were collected and assigned to two groups, the XC group and the ZZ group (Table 1). From the 13 soil samples, a total of 716,285 high-quality 16S rDNA gene fragments were obtained and classified into 4838 operational taxonomic units (OTUs) based on 97% identity (Table S1). The average numbers of sequences and OTUs per sample were 55099 and 1693, respectively, showing that a large number of OTUs were shared among the 13 soil samples. The richness and Chao1 indices of the two groups were similar, indicating that most microbes in the soil samples had been revealed (Table S2). The Shannon_2 values suggested that diversity in these samples was high. The Simpson, dominance, and equitability indices and the rank abundance hinted that the microbial distribution was not completely even and that some abundant species were present in the soil samples (Table S2). The richness, Chao1, Shannon_2, dominance and equitability values of the ZZ group were higher than the corresponding values of the XC group, whereas the Simpson value of the ZZ group was lower than that of the XC group (Table 3).
At the phylum level, the most dominant bacteria in the XC soil samples were Proteobacteria, Actinobacteria, Acidobacteria, and Bacteroidetes, which composed 44.5%, 15.5%, 12.3%, and 6.1% of the community, respectively (Figure 1A and Table S3). The most dominant bacteria in the ZZ soil samples were the same four phyla, which composed 31.1%, 26.4%, 18.5%, and 2.8%, respectively (Figure 1A and Table S3). At the genus level, 312 genera were identified, and 56.6% of all sequences could not be assigned to known genera, indicating that most bacteria in these soils were unknown (Table S4).
Among the assigned genera, the most dominant species belonged to 13 genera: Gaiella, Sphingomonas, Acidobacteria Gp6, Nocardioides, Arthrobacter, Acidobacteria Gp4, Acidobacteria Gp16, Gemmatimonas, Rhodanobacter, Nitrososphaera, Acidobacteria Gp3, Pseudomonas, and Streptomyces. These dominant genera accounted for 22.7% and 26.5% of the XC group and the ZZ group, respectively. The distributions of all 13 genera differed between the two groups (Figure 1B and Table S4). In particular, the distributions of Sphingomonas, Acidobacteria Gp6, Acidobacteria Gp4, Acidobacteria Gp16 and Nitrososphaera showed obvious differences between the two groups.
Figure 1. Phylum-level (A) and genus-level (B) microbial distribution of the two soil groups. Above_phylum and Above_genus denote microbial sequences that could not be assigned to a phylum or genus, respectively.
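The alpha-diversity indices reported in this section (richness, Chao1, Shannon_2, Simpson, equitability) can all be derived from a per-sample vector of OTU read counts. The Python sketch below shows one common set of definitions and is not necessarily the exact implementation behind Tables S2 and 3. It assumes that Shannon_2 is the Shannon entropy with base-2 logarithms, that Simpson is the sum of squared relative abundances, and that Chao1 uses the bias-corrected form; the function names are hypothetical.

```python
import math


def proportions(counts):
    """Relative abundance of each OTU that was observed at least once."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    return [c / total for c in observed]


def richness(counts):
    """Number of OTUs observed in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts, base=2.0):
    """Shannon entropy; base 2 is assumed here for 'Shannon_2'."""
    return -sum(p * math.log(p, base) for p in proportions(counts))


def simpson(counts):
    """Probability that two random reads come from the same OTU."""
    return sum(p * p for p in proportions(counts))


def equitability(counts):
    """Shannon entropy divided by its maximum, log2(richness)."""
    s = richness(counts)
    return shannon(counts) / math.log(s, 2.0) if s > 1 else 0.0


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singletons and doubletons."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return richness(counts) + f1 * (f1 - 1) / (2.0 * (f2 + 1))


if __name__ == "__main__":
    toy_counts = [5, 1, 1, 2]  # hypothetical read counts for four OTUs
    print(richness(toy_counts), chao1(toy_counts))
    print(shannon(toy_counts), simpson(toy_counts), equitability(toy_counts))
```

With these definitions, a perfectly even community has an equitability of 1 and the lowest possible Simpson value for its richness, which is why a higher equitability and a lower Simpson value both point to a more even community.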

Dominant OTUs in the microbial communities
Although most microbes in the samples were uncultured, 7 of the 10 most abundant OTUs showed > 97% identity with isolated microbes, suggesting that the functions of these OTUs can be predicted from the known isolates (Table 2). OTU-4 was the most abundant identified OTU in the samples; it corresponds to Sphingomonas limnosediminicola, which is mainly distributed in wet environments [26]. OTU-1 showed 100% identity with Pseudarthrobacter phenanthrenivorans, which was isolated from a creosote-contaminated soil [27]. OTU-3 was the third most abundant OTU in the soils and composed 15.94% of the microbes in HN-S2 [28]. OTU-17 corresponds to Rhodanobacter spathiphylli, which was first isolated from a compost-amended potting mix [29]. OTU-9 corresponds to Bradyrhizobium namibiense, a nitrogen-fixing bacterium [30]. OTU-44 corresponds to Nocardioides mesophilus, which was first isolated from soil [31]. OTU-94 corresponds to Sphingomonas aquatilis, which is widely distributed in the environment.

Microbial diversity in different soil samples
Usually, the microbial communities of soil samples from the same environment are similar and cluster together at the phylum level. For example, the HN-S13, HN-S14, and HN-S15 samples, from soil covered with grass, and the HN-S18, HN-S19, and HN-S21 samples, from soil near a small stream, clustered in the phylogenetic tree (Figure 1A). However, some microbial communities from the same area differed and did not cluster at the phylum level. For example, HN-S10, HN-S11, and HN-S12 were collected from soils near a pig farm, but they fell into different clusters in the phylogenetic tree (Figure 1A). A similar pattern was seen at the genus level: HN-S18, HN-S19, and HN-S21, collected near the stream, clustered together, whereas HN-S13, HN-S14, and HN-S15 fell into different clusters (Figure 1B). PCoA based on unweighted UniFrac distances showed that the six ZZ-group samples (HN-S13, HN-S14, HN-S15, HN-S18, HN-S19 and HN-S21) clustered together (Figure 2A). Meanwhile, HN-S11 and HN-S12 in the XC group clustered together, and the other five XC-group samples formed a third cluster. NMDS based on unweighted UniFrac distances gave similar results: the six ZZ-group samples clustered together, and the seven XC-group samples clustered in two different areas (Figure 2B). Thus, PCoA and NMDS gave a consistent picture of beta diversity between the groups. In addition, the six ZZ-group samples lay close to five XC-group samples (HN-S1, HN-S2, HN-S8, HN-S9 and HN-S10) in both the PCoA and NMDS analyses.
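The PCoA and NMDS ordinations above start from pairwise unweighted UniFrac distances, which compare two samples by the fraction of phylogenetic branch length that leads to OTUs found in only one of them. The Python sketch below illustrates that definition on a small hypothetical tree; it is not the software used in this study, and the tree, the sample contents, and the function names are assumptions made for the example.

```python
def branches_covered(present_otus, parent):
    """Nodes whose branch to their parent lies on a path from a present OTU to the root."""
    covered = set()
    for otu in present_otus:
        node = otu
        # Walk towards the root; stop early once we reach an already covered node,
        # because all of its ancestors have been covered as well.
        while node in parent and node not in covered:
            covered.add(node)
            node = parent[node][0]
    return covered


def unweighted_unifrac(sample_a, sample_b, parent):
    """Unique branch length divided by total branch length spanned by both samples."""
    a = branches_covered(sample_a, parent)
    b = branches_covered(sample_b, parent)
    total = sum(parent[n][1] for n in a | b)
    if total == 0:
        return 0.0
    unique = sum(parent[n][1] for n in a ^ b)
    return unique / total


# Hypothetical toy tree ((otu1:1, otu2:1)n1:1, (otu3:1, otu4:1)n2:1)root,
# stored as child -> (parent, branch length); the root has no entry.
TOY_TREE = {
    "otu1": ("n1", 1.0), "otu2": ("n1", 1.0), "n1": ("root", 1.0),
    "otu3": ("n2", 1.0), "otu4": ("n2", 1.0), "n2": ("root", 1.0),
}

if __name__ == "__main__":
    print(unweighted_unifrac({"otu1", "otu2"}, {"otu2", "otu3"}, TOY_TREE))
```

Because only presence or absence enters the calculation, two samples that share the same OTUs at very different abundances have a distance of 0, so clusters in Figure 2 reflect differences in community membership rather than in relative abundance.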

Discussion
The dominant bacteria in these 13 soil samples are consistent with previous soil microbiota investigations, which found that the dominant bacteria in soils are Proteobacteria and Actinobacteria [2,13,32]. However, the microbial composition of these samples from the Huabei plain differs from that of the East European plain, where the most dominant bacteria are Actinobacteria (46.5%) and Proteobacteria (25.6%); this might be because the environmental factors of the two regions are distinct [17]. Moreover, samples collected from the same place, especially those from the XC group, did not cluster completely at the phylum level, hinting that even microbial communities from the same area differ slightly when local environmental factors differ (Figure 1A).
Table 3. The alpha diversity of 13 soil samples.
More than 50% of the sequences could not be assigned to known genera, suggesting that most species in these soils are uncultured and that further investigation of soil microbes is valuable [2,13]. The genus Sphingomonas, which can metabolize some pollutants, was more abundant in soils of the XC group than in soils of the ZZ group, hinting that pollutant levels are higher in XC than in ZZ; this might be due to livestock breeding and other agricultural activities in the rural area (XC group) [33,34]. Bacteria of the genus Gaiella can reduce nitrate to nitrite and were abundant in both groups [35], hinting that these samples might contain high levels of nitrate. The genus Rhodanobacter can convert nitrate to nitrogen and was more abundant in HN-S1 and HN-S11 than in the other samples [36,37]; this might be because a large amount of nitrate fertilizer had been applied at HN-S1 and because a large amount of nitrate, possibly derived from pig manure, was available at HN-S11. In addition, Nitrososphaera was more abundant in the ZZ group than in the XC group, which might be because nitrogenous fertilizer had been added to the soils sampled in the ZZ group.
Most OTUs in the soil samples showed < 97% identity with isolated bacteria (Table 2), further indicating that most species were uncultured. Among the top 10 dominant OTUs, OTU-1 is able to metabolize phenanthrene, suggesting that there might be some phenanthrene in the soil of HN-S8, which harbored a high level of OTU-1 (Figure S1a) [38]. OTU-17, which is related to compost, was very abundant in HN-S10, HN-S11 and HN-S12, which were sampled near a pig farm, suggesting that this OTU might function in the removal of pig manure pollution. Some identified OTUs, including OTU-9 and OTU-94, are associated with soil nutrient cycling and contaminant removal [30,33], and their presence might be due to small amounts of pollutants in the soil samples.

Region
The PCoA and NMDS analyses based on unweighted UniFrac distances classified the samples consistently, suggesting that the sample classification based on microbial community composition was reliable. Bacterial richness and diversity were higher in the ZZ group than in the XC group, suggesting that human activities in the metropolis increased microbial diversity [19,39]. The large differences between HN-S11 and HN-S12 and the other 11 soil samples might be attributed to the input of pig manure at HN-S11 and HN-S12, which changed soil nutrition. The microbial distribution of HN-S10 differed from that of HN-S11 and HN-S12; because the pig farm had been abandoned for a few months before HN-S10 was sampled, this suggests that the potential effects of pig manure on soil microbial distribution had disappeared [10]. As the pH and precipitation of all the samples are nearly the same (Table 1), the soil microbiota of the ZZ group and of the XC group, except HN-S11 and HN-S12, are similar, even though the two groups are 300 kilometers apart. This similarity suggests that similar pH and precipitation might result in a similar core microbiota [6,8,9,14].

Conclusions
In summary, we investigated the microbial diversity of 13 soil samples collected from the Huabei plain and found that Proteobacteria, Actinobacteria, Acidobacteria, and Bacteroidetes were the dominant bacteria. Moreover, the microbial species present across the Huabei plain were similar, but their distributions differed, indicating that different areas have different core microbiomes. Nutrient input to the soil, such as the pig manure at HN-S11 and HN-S12, changes the soil microbial distribution, showing that environmental factors are key drivers of microbial distribution.

Data Availability Statement:
The raw reads of the 16S rRNA data have been submitted to the NCBI Sequence Read Archive (SRA) database under accession numbers SAMN10602944-SAMN10602956.