Submitted: 12 September 2025
Posted: 15 September 2025
Abstract
Keywords:
1. Introduction
2. Results
2.1. Phenotypic Characterization of Multiparent F4 Population
2.2. Population Structure Analysis
2.3. QTN Identification for OC and PC Traits via GWAS Analysis
2.4. Effect of Model Selection on GS Prediction Accuracy
2.5. Effect of TP Size and Marker Number on GS Prediction Accuracy
2.6. Effect of Trait Specific QTNs on GS Prediction Accuracy
3. Discussion
3.1. Effect of GS Models on Prediction Accuracy
3.2. Elucidating the Effect of TP Size and Marker Number on GS
3.3. Genomic Selection for Antagonistic Traits
4. Materials and Methods
4.1. Plant Materials
4.2. Phenotype Data Collection and Analysis
4.3. Genotyping Analysis
4.4. Phylogenetic Relationship and Principal Component Analysis (PCA)
4.5. GWAS Analysis
4.6. Genomic Selection
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| OC | Oil content |
| PC | Protein content |
| QTNs | Quantitative trait nucleotides |
| GS | Genomic selection |
| TP | Training population |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).