Genetic Diversity of Cassava (Manihot esculenta Crantz) in Ecuador by Using SSR Markers

Cassava (Manihot esculenta Crantz), domesticated in the Amazonian region of South America, shows considerable diversity in Ecuador, where it is a main staple food; however, only a few Ecuadorian cassava accessions have been included in international molecular assessments. The purpose of this study was to apply suitable cassava microsatellites to characterize the genetic variability of the Ecuadorian cassava collection, composed mainly of local landraces from the Coast, Andes and Amazonia regions. The use of microsatellite markers allowed the determination of the genetic diversity of the collection. Seven selected SSR primers made it possible to identify homozygous and heterozygous materials within the cassava collection of 133 accessions. The loci presented an average genetic diversity value of 0.7 and an average PIC value of 0.67, which is considered high. A low number of duplicates (8.8%) was identified in the Ecuadorian collection, which is not fully duplicated at CIAT. Currently, a wide range of cassava diversity is still cultivated in multi-crop agro-ecosystems, mainly in the Coast and Amazonian regions. Especially in the Amazonian region, where cassava has important cultural uses among local ethnic communities, more in-depth studies could unveil the genetic diversity present in situ today.


Introduction
In the Euphorbiaceae family, the genus Manihot contains 110 species ranging in habit from herbs to small trees [1,2]. Within this genus, cassava (Manihot esculenta Crantz) is the sixth major crop globally [1,3]. In the XI century, cassava was taken from Brazil to the Caribbean and Central America. The Portuguese took cassava to the west coast of Africa in the XVI century and, between the XVIII and XIX centuries, to the African east coast and from there onwards to India; the Spanish took cassava to the Pacific [4]. Nowadays, cassava is an important energy source in the diet of millions of people in tropical and subtropical areas of America, Asia and Africa [5,6]. As cassava grows under marginal conditions, it has great potential to increase food security in developing countries [1,7]. This species can be exploited as a root food crop, vegetable, or feed, or for industrial uses such as starch, ethanol, or bioplastics [6-8].
The purpose of this study was to apply suitable cassava microsatellites to characterize the genetic variability of the Ecuadorian cassava collection composed mainly of local landraces from the Coast, Andes and Amazonia regions, which have long been neglected in international studies.

Plant materials of Cassava
The cassava collection evaluated in this study is a subset of the landraces collected across continental Ecuador, as explained in Monteros-Altamirano et al. [39]. This collection encompasses 136 accessions collected in the Coast, Andes and Amazonia regions of Ecuador. The Ecuadorian collection was planted at the Central Amazonian Experimental Station of INIAP, located at Via Sacha-San Carlos at 250 m a.s.l., with an average temperature of 24 °C and average precipitation of 3100 mm. Apical leaf samples, smaller than 1.5 cm in size, were then taken for molecular characterization. Approximately 50 leaves were collected per accession in Ziploc bags with silica gel, which dries the plant tissue and preserves the sample. The silica gel was changed every 24 hours for two days or until the leaf tissue was dehydrated.

Extraction and quantification of DNA
The sample was placed in an Eppendorf tube, and 1000 µl of prewarmed extraction buffer (2% CTAB, 1% PVP, 1.4 M NaCl, 0.1 M Tris-HCl pH 8, 0.2 M EDTA disodium salt pH 8, and 2 µl of β-mercaptoethanol) was added. The samples were incubated at 65 °C in a water bath for one hour with shaking every 30 minutes. Subsequently, they were centrifuged at 14,000 rpm for 10 minutes; the supernatant was then recovered, and 750 µl of CIA (chloroform-isoamyl alcohol, 4:1) was added. After shaking, the mixture was centrifuged at 4000 rpm for 15 minutes. The supernatant was recovered in a new tube, and again 750 µl of CIA was added, vortexed and centrifuged at 10,000 rpm for 5 minutes. The supernatant was transferred to a new tube, 200 µl of ice-cold ethanol was added, and the mixture was incubated at -20 °C for 30 minutes. It was centrifuged at 5000 rpm for one minute, after which the extracted DNA was recovered. The DNA obtained was washed with 70% ethanol, the ethanol was removed, and the DNA was dried at room temperature. The DNA pellet was re-suspended in 200 µl of TE buffer with 2 µl of RNase (10 µg/ml) for every 100 µl of DNA obtained. The samples were stored at -20 °C until quantification.
After DNA extraction, the concentration and quality of the DNA samples were determined using the BioTek Epoch™ microplate spectrophotometer. The Take3 microplate of the spectrophotometer consists of 16 wells; 2 µL of blank (ultrapure water) was placed in each of the first two wells and 2 µL of DNA in each of the following wells. The reading was performed using the equipment software, and the data obtained were imported into an Excel table. Subsequently, the DNA was diluted to a concentration of 5 ng/µL with ultrapure water and tartrazine and stored at -20 °C for later use.
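The dilution of each quantified sample to the 5 ng/µL working concentration follows the usual C1·V1 = C2·V2 relationship. The sketch below only illustrates that arithmetic; the function name, the 100 µL final volume and the 125 ng/µL stock reading are assumptions for the example and are not values reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=5.0, final_volume_ul=100.0):
    """Return (stock_ul, diluent_ul) from C1*V1 = C2*V2.

    final_volume_ul is a hypothetical working volume, not taken from the protocol.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical spectrophotometer reading of 125 ng/uL for one accession.
stock_ul, diluent_ul = dilution_volumes(125.0)
print(f"{stock_ul:.1f} uL DNA + {diluent_ul:.1f} uL diluent")
```

Samples that read below the target concentration raise an error here, since they cannot be brought to 5 ng/µL by dilution.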

Validation and amplification of DNA
The extracted DNA samples were validated to determine their amplification capacity using a microsatellite primer (SSRY5). The PCR reaction cocktail contained PCR buffer (5X), MgCl2 (25 mM), dNTPs (5 mM), forward and reverse primers (10 µM each), Taq polymerase (5 U/µL), and DNA (5 ng/µL). Once the reactions were prepared, 18 µL of mineral oil was placed on each sample, and the samples were amplified in the Basic Gradient thermocycler from Biometra. The amplification program included an initial denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 45 seconds, annealing at 55 °C for 1 minute, and elongation at 72 °C for 2 min; and a final elongation at 72 °C for 10 min, followed by stabilization at 10 °C for 5 min [41]. Seven primers (Table 1), taken from those proposed by Chavarriaga-Aguirre et al. [42] and Mba et al. [43], were applied. The primers were previously tested to determine workable combinations: duplex 1 (SSRY40 and SSRY153), duplex 2 (SSRY68 and SSRY31), duplex 3 (SSRY3 and SSRY151) and monoplex (SSRY100). Subsequently, to visualize the amplified DNA bands, 2 µL of the amplified DNA samples, previously mixed with loading buffer (blue juice), were loaded into the wells of a 2% agarose gel alongside a 100 bp DNA ladder (10488-058, Invitrogen). The samples were run in a horizontal electrophoresis chamber at 110 V for 30 minutes. The gels were visualized in the Dolphin View Wealtec gel documentation system.
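The amplification program described above can be written down as a small data structure, which makes it easy to check the nominal run time. This is only a sketch: the step labels and the tuple layout are illustrative choices, and ramping time between temperatures is ignored.

```python
# (step label, temperature in deg C, minutes per repetition, repetitions)
PROGRAM = [
    ("initial denaturation", 94, 5.0, 1),
    ("denaturation", 94, 0.75, 30),
    ("annealing", 55, 1.0, 30),
    ("elongation", 72, 2.0, 30),
    ("final elongation", 72, 10.0, 1),
    ("stabilization", 10, 5.0, 1),
]


def total_minutes(program):
    """Sum the hold times of all steps, excluding temperature ramps."""
    return sum(minutes * reps for _, _, minutes, reps in program)


run_time = total_minutes(PROGRAM)
print(f"Nominal program time: {run_time} min")
```

Under these assumptions the holds add up to a little over two hours per run.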

Genotyping of cassava DNA samples
In the SAGA-GT Microsatellite software, a project was created for cassava, detailing the information for each primer, such as size, the channel in which it amplifies (700 or 800 nm), the range of band sizes, and the duplexes formed; the position of each cassava sample on the gel was assigned in advance. First, the acrylamide gel was prepared with 20 ml of K.B. Gel Matrix Plus 6.5%, 150 µl of 10% APS (ammonium persulfate) and 15 µl of TEMED (tetramethylethylenediamine, 99%). The mixture was poured between the glass plates of the LI-COR 4300, and the comb was inserted. After 1 hour of polymerization, the gel was placed in the LI-COR 4300 with 1X TBE (Tris-Borate-EDTA) K.B. Plus LI-COR buffer. Subsequently, a pre-run of 25 min at 1500 V was performed to focus the laser at 700 and 800 nm. Then, 0.8 µL of the amplified products, previously diluted 1:1 with Blue Stop Solution LI-COR and denatured at 94 °C for 5 min, was loaded, and the run was carried out at 1500 V for one and a half hours. The IRDye 30-350 bp molecular weight marker was used.

Statistical analysis
The molecular characterization data were obtained using the SAGA GT-SSR version 3.3 software, a reading assistant for the images provided by the LI-COR. The data matrix obtained from SAGA was imported into Excel, where it was cleaned and saved for further analysis.

Analysis of genetic diversity
The Power Marker V3.0 program was used for the genetic diversity analysis [44]. The following parameters were obtained: number of observations, sample size, allele frequency, number of genotypes generated, number of alleles per locus, total or observed heterozygosity (Ho), genetic diversity or expected heterozygosity (He), and Polymorphism Information Content (PIC).
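For readers unfamiliar with these parameters, the sketch below computes them for a single locus from diploid genotypes. It uses the textbook definitions (He = 1 - Σp², PIC as defined by Botstein et al., Ho as the proportion of heterozygous individuals) without any small-sample correction, so the values may differ slightly from those reported by Power Marker. The function name and the example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """genotypes: list of (allele1, allele2) tuples scored at one SSR locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    freqs = [count / total for count in counts.values()]

    # Expected heterozygosity (gene diversity): He = 1 - sum(p_i^2)
    he = 1.0 - sum(p * p for p in freqs)

    # PIC = He - sum over i<j of 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = he - cross

    # Observed heterozygosity: share of individuals carrying two different alleles
    ho = sum(a != b for a, b in genotypes) / len(genotypes)
    return {"alleles": len(freqs), "He": he, "PIC": pic, "Ho": ho}


# Hypothetical allele sizes (bp) for four individuals at one locus.
example = [(100, 100), (100, 104), (104, 108), (100, 108)]
stats = locus_stats(example)
print(stats)
```

PIC is always at most He, and the two converge as the number of alleles grows, which is consistent with the pattern of the locus averages reported in this study (He 0.7, PIC 0.67).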

Cluster Analysis
For the cluster analysis, a binary data matrix (1 and 0) was built and analyzed in the Power Marker V3.0 software; a UPGMA tree was constructed to represent the relationships between individual accessions.
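As an illustration of the clustering step, the following is a minimal pure-Python UPGMA sketch on a binary allele presence/absence matrix. The distance used here (proportion of mismatching positions) is an assumption made for the example; the study does not state which distance Power Marker applied. Accession names and profiles are hypothetical.

```python
from itertools import combinations


def mismatch_distance(a, b):
    """Proportion of positions at which two binary profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """profiles: dict name -> list of 0/1. Returns merges as (left, right, height)."""
    sizes = {name: 1 for name in profiles}
    dist = {
        frozenset(pair): mismatch_distance(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(profiles, 2)
    }
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)        # closest pair of clusters
        a, b = sorted(pair, key=repr)         # fixed order for reproducibility
        new = (a, b)                          # nested tuples encode the tree
        for c in sizes:
            if c in pair:
                continue
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            # Size-weighted average keeps every original pair equally weighted.
            dist[frozenset((new, c))] = (
                sizes[a] * d_ac + sizes[b] * d_bc
            ) / (sizes[a] + sizes[b])
        height = dist.pop(pair) / 2
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges


# Hypothetical presence/absence profiles for four accessions.
profiles = {
    "A": [1, 1, 0, 0, 1, 0],
    "B": [1, 1, 0, 0, 1, 1],
    "C": [0, 0, 1, 1, 0, 1],
    "D": [0, 1, 1, 1, 0, 1],
}
merges = upgma(profiles)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

The last merge in the list is the root of the tree; in the dendrogram of this study that split corresponds to the separation between the two main groups.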

Identification of duplicates
The identification of duplicates was carried out through multilocus microsatellite genotypes, using the similarity percentages obtained with the Excel Microsatellites add-on software, in which identical individuals have a similarity of 100%.
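The idea behind the 100% similarity criterion can be sketched as grouping accessions that share exactly the same genotype at every locus. The code below is an illustrative sketch, not the add-on's implementation; the accession labels and genotypes are hypothetical, and missing data are not handled.

```python
from collections import defaultdict


def find_duplicates(genotypes):
    """genotypes: dict accession -> dict locus -> (allele1, allele2).

    Returns lists of accessions with identical multilocus genotypes.
    """
    groups = defaultdict(list)
    for accession, loci in genotypes.items():
        # Sort alleles within a locus so (100, 104) and (104, 100) match.
        key = tuple(
            (locus, tuple(sorted(alleles))) for locus, alleles in sorted(loci.items())
        )
        groups[key].append(accession)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical accessions scored at two loci.
genotypes = {
    "acc1": {"SSRY100": (210, 214), "SSRY151": (182, 182)},
    "acc2": {"SSRY100": (214, 210), "SSRY151": (182, 182)},
    "acc3": {"SSRY100": (210, 210), "SSRY151": (182, 190)},
}
duplicates = find_duplicates(genotypes)
print(duplicates)
```

Groups returned this way are candidates only: accessions identical at a handful of loci may still differ elsewhere in the genome.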

Variance analysis
To determine the number of groups with the most significant variability between them, a molecular analysis of variance testing from two to ten groups was run with the Arlequin 2.0 program.

Table 2 shows that the SSRY151 locus presented the highest genetic diversity (0.87) among the loci. The loci presented an average genetic diversity value of 0.7 for the seven primers. It is important to note that all the primers presented PIC values greater than 0.5. This parameter indicates how informative these markers are within the population. The SSRY151 locus was the most polymorphic, with a PIC of 0.85, while the SSRY40 and SSRY3 loci were the least polymorphic, with a PIC of 0.52. In addition, the SSRY3 locus presented the lowest observed heterozygosity, with a value of 0.44, while the average Ho across loci was 0.61. The average PIC value was 0.67.

Allelic frequency
The allelic frequency for each primer was determined to analyze the variability of the loci in detail. Allele frequencies and sizes at each locus are detailed in Appendix 1.

Multivariate analysis: Clusters description
The cluster analysis grouped accessions according to their genotype, allowing a better analysis of the population's genetic diversity. This analysis was performed using the UPGMA method on the principal coordinates of the individuals. The dendrogram shown in Figure 1 represents the genetic relationships of the germplasm, where two main groups were identified (G1 and G2).

Group 1
This group is formed by subgroup A and contains mostly accessions from Manabí, a few accessions from Santo Domingo de Los Tsáchilas, and a minimal number of accessions from Esmeraldas; all these provinces are on the Coast of Ecuador (Appendix 3).

Group 2
This group has subgroups B and C and contains all the accessions collected in the Amazonian region, a few accessions from Manabí and Santo Domingo de Los Tsáchilas, and one accession from Esmeraldas. Subgroup B is made up of accessions from the north-central part of the Amazonian region (Pastaza, Napo, Sucumbíos and Francisco de Orellana) and, from the coastal region, accessions from Manabí and Santo Domingo de Los Tsáchilas. In contrast, subgroup C comprises all the accessions collected in Morona Santiago and Zamora Chinchipe; it also contains accessions from Pastaza, Sucumbíos, Napo and Francisco de Orellana, all provinces of the Amazonia. In this subgroup, there are also accessions from the coastal region (Esmeraldas, Manabí, Santo Domingo de los Tsáchilas) and accessions from Tungurahua. The latter is known as an Andean province; however, the area of Tungurahua named "Baños", located in the foothills of the Andes, is known as the door of the Ecuadorian Amazonia toward the province of Pastaza (Appendix 3).

Analysis of variance
The Molecular Analysis of Variance (AMOVA) was carried out, grouping the accessions by regions of Ecuador (Coast, Sierra and Amazonia). The analysis of this grouping is detailed in Table 3, which shows a low percentage of variation among populations or regions (2%), whereas most of the variation (98%) is due to differentiation among individuals.

Nei Genetic Distance
The genetic distance of Nei [45] was determined as another parameter to analyze the population genetic structure. For details of the groups' distance, see Appendix 2, where a Nei distance of 0.401 was observed between the G1 and G2 groups.
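As a hedged illustration, the sketch below computes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx · Jy)), from per-locus allele frequencies of two groups. It is an assumption that this is the variant of Nei's distance behind the values in Appendix 2; the function name and the example frequencies are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """freqs_*: dict locus -> dict allele -> frequency, one dict per group."""
    loci = freqs_x.keys() & freqs_y.keys()
    jx = jy = jxy = 0.0
    for locus in loci:
        px, py = freqs_x[locus], freqs_y[locus]
        jx += sum(p * p for p in px.values())             # homozygosity in group X
        jy += sum(p * p for p in py.values())             # homozygosity in group Y
        jxy += sum(p * py.get(a, 0.0) for a, p in px.items())  # shared identity
    # Sums are used instead of means over loci; the number of loci cancels out.
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical allele frequencies at one locus for two groups.
group_1 = {"SSRY100": {"210": 0.5, "214": 0.5}}
group_2 = {"SSRY100": {"210": 1.0}}
distance = nei_standard_distance(group_1, group_2)
print(round(distance, 4))
```

Two groups with identical allele frequencies give a distance of zero; groups that share no alleles at any locus make the logarithm undefined, which this sketch does not guard against.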

Genetic diversity of the collection
The use of microsatellite markers allowed the determination of the genetic diversity of the collection. Furthermore, these markers generated a substantial amount of genetic information because they cover the entire genome, show no intergenic interactions, and have simple inheritance [23,46-48].
Cassava is considered a diploid species, although it is sometimes considered polyploid and possibly an allopolyploid (tetra or hexaploid) [49]. However, only homozygous and heterozygous diploid individuals were found in this study, as in Pincay [50]. Our results are consistent with Domínguez et al. [51], De Carvalho et al. [20], and Ceballos et al. [3,7], who indicate that cassava is a diploid species.
In the 133 national cassava accessions, 56 alleles were identified using seven primers. The SSRY151, SSRY68, and SSRY100 loci presented the highest polymorphism index. Overall, the seven primers presented between 6 and 11 alleles, with an average of 8 alleles per locus. Casalla [52] mentioned that a marker is highly polymorphic when more than two alleles per locus are identified. Primer SSRY100 presented the highest number of alleles, as in Arguello [53], indicating that this primer can be used in future characterizations due to its great discriminating power. Other studies reporting a similar number of alleles per locus are Beovides et al. [54], from 2 to 10; Arguello [53], from 3 to 9; and Pincay [50], from 8 to 13.
The observed heterozygosity (Ho) presented an average value of 0.61, with values that vary between 0.5 and 0.86, similar to those found by Pincay [50]. These high values confirm that cassava is heterozygous in nature, as in Arguello [53], who also mentions that the asexual reproduction mode of cassava influences the levels of inherited heterozygosity. On the other hand, according to Morillo et al. [55], this high heterozygosity may be related to the allogamous nature of the species. Furthermore, the heterozygosity of cassava was confirmed by other studies such as Alzate et al. [28]. Sosa et al. [56] stated that heterozygosity is one of the critical diversity quantification indices.

Representativeness of the Ecuadorian collection of cassava
High genetic diversity and a low number of duplicates (8.8%) were identified in the Ecuadorian collection. Accessions that were 100% identical mostly came from neighboring collection sites: e.g., the identical accessions ECU19070 and ECU19071 were collected in Tungurahua in the same place, Río Verde, and ECU18505 and ECU18524 were collected in nearby parishes of the neighboring Coastal provinces of Manabí and Esmeraldas. One triplicate (ECU19134, ECU19148, ECU19149) was collected entirely in the Amazonian province of Pastaza, and another (ECU19115, ECU19125, ECU19118) in the Amazonian province of Sucumbíos. Only the identical ECU19107 and ECU19135 were collected in distant parishes, although in the two bordering Amazonian provinces of Napo and Pastaza. It is important to mention that our collection is not wholly duplicated at CIAT, as stated by Tay [4], who indicates that 116 Ecuadorian accessions are conserved at CIAT (CGIAR), whereas 134 accessions are considered as only in situ (missing from the CGIAR centers) and not duplicated elsewhere. A cross-reference of passport data between the CIAT and INIAP collections by region determined that CIAT holds 74% of the Ecuadorian Coastal accessions and only 25% of our Amazonian accessions; however, INIAP holds only 14% of the accessions collected in the Sierra from CIAT. Therefore, this Ecuadorian collection fills the gap of missing accessions, especially from the Amazonian region of Ecuador.

Cassava in Ecuadorian farmers' fields
In the Coastal region, Manabí province is a traditional commercial production area. Mainly smallholders intercrop cassava with other cash crops such as coffee, peanut and maize, among others [57,58]. In communities closer to urban centers, cassava is essentially a cash crop, but in more isolated communities, the crop is used mainly for family consumption and animal feed [58]. Industrial cassava byproducts have been produced and exported from Manabí [59]; e.g., family farmers in this province have extracted cassava starch for over a hundred years [59,60]. Producer-processor associations were formed years ago [60] and continue today. Our SSR data indicate that accessions from the Manabí province group together, except for a few materials (Figure 1, Appendix 3). This may be because this province has been supported by several international projects collaborating with INIAP and CIAT [60], which probably increased the cassava genetic resources available to support commercial production in the area. Additionally, the genetic distance of Nei (Appendix 2) indicates that there is a difference with other materials from the Coast: Esmeraldas (0.350) and Santo Domingo de Los Tsáchilas (0.297), which are provinces that have not had commercial exploitation of cassava like Manabí and have consequently had less gene flow.
Unlike the Coastal region, where mestizo cassava producers are more inclined to commerce, in the Amazonia region the crop is grown mainly for self-consumption and has high cultural importance within indigenous communities. Although our collection did not show high genetic diversity within the Amazonian region (Appendix 2), we believe that several local landraces are still underrepresented. Currently, they are in the hands of local farmers, especially in indigenous communities of the Amazonian region; e.g., at least five landraces of cassava (lumu) have been reported growing together in one chakra or chagra (swidden garden) in Kichwa communities of Napo [61]. In the Kichwa chakra, as in other indigenous communities, cassava grows along with other species such as maize, rice, plantain, and beans [62], up to 25 species in total [63]. Between four and 13 varieties of cassava (kene) are managed by one Waorani family in the chagra de yuca or kewenkore [64]. The Jívaroan indigenous group in Pastaza province intercrops cassava, which occupies most of the household cropland [65], and in the same province, where Quichua, Shiwiar, and Zapatero indigenous people live, up to 16 varieties of cassava per household have been reported [63]. In this region, cassava has cultural importance; e.g., cassava-cropping knowledge is transmitted between generations of Kichwa women through the handing down of good seeds and practices in the chakra, accompanied by advice [66]. The Kichwa mainly use cassava for self-consumption as food or chicha (a fermented cassava beverage); any surplus might be directed to the market, especially by mestizos [67]. Therefore, in situ studies must be conducted to understand the genetic diversity among Amazonian indigenous communities and to strengthen the conservation and traditional use of local cassava.