Submitted: 11 June 2025
Posted: 12 June 2025
Abstract
Keywords:
Introduction
The Promise of the Pangenome in Rice Breeding
Breakthroughs in Trait Discovery Enabled by Rice Pangenomics
Key Barriers to Translating Pangenomics Insights into Practical Breeding
Data Volume and Complexity Limit Functional Variant Discovery
Computational and Bioinformatic Challenges in Pangenome Representation
Computational Tools and Resource Limitations for Rice Pangenome Analysis
Advancing Genotyping Platforms to Capture Complex Pangenomic Variation
Emerging AI Approaches in Pangenome Interpretation: Gaps for Breeding
Translational and Organizational Challenges in Applying Pangenomics Discoveries
Conclusions and Outlook
| Pangenome composition | Pangenome construction method | Pangenome representations | Number of accessions | Sequencing platform | Reference |
| --- | --- | --- | --- | --- | --- |
| Asian cultivated rice and African cultivated rice | Iterative mapping and assembly | 64.10 Mb PAVs and 43,232 pan-genes | 12 | PacBio | [23] |
| Asian cultivated rice | Iterative mapping and assembly | 268 Mb PAVs and 53,758 pan-genes | 3,010 | Illumina | [8] |
| Asian cultivated rice and wild progenitor of Asian cultivated rice | De novo assembly and gene annotations comparison | 10,872 gene PAVs and 42,580 pan-genes | 66 | Illumina | [6] |
| Asian cultivated rice and African cultivated rice | Graph pangenome | ~24,469 PAVs and 66,636 pan-genes | 33 | PacBio | [7] |
| Asian and African wild and cultivated rice | Graph pangenome | 1.52 Gb pan-genome and 51,359 pan-genes | 251 | Nanopore | [57] |
| Asian and African wild and cultivated rice | De novo assembly and gene annotations comparison | 604 Mb PAVs and 60,293 pan-genes | 111 | PacBio | [14] |
| Asian and African wild and cultivated rice, weedy rice | De novo assembly and gene annotations comparison | 175,528 pan-gene families | 74 | PacBio | [63] |
| Asian cultivated rice | De novo assembly and gene annotations comparison | 297 to 786 genome-specific loci | 3 | Illumina | [5] |
| Asian cultivated rice and wild progenitor of Asian cultivated rice | De novo assembly and gene annotations comparison | 3.87 Gb of non-reference sequences and 69,531 pan-genes | 129 | PacBio and Nanopore | [9] |
| African cultivated rice | De novo assembly and gene annotations comparison | Gene numbers ranging from 49,662 to 51,262 | 3 | Illumina | [64] |
| Asian cultivated rice | Iterative mapping and assembly | 38,998 pan-genes and 71.74 Mb non-reference sequences | 60 | Illumina | [25] |
| Wild and cultivated Oryza species | De novo assembly and gene annotations comparison | 101,723 pan-gene families | 13 | PacBio | [12] |
| Asian cultivated rice | De novo assembly and gene annotations comparison | 119,783 pan-gene families | 16 | PacBio | [10] |

| Category | Challenge | Potential Solution |
| --- | --- | --- |
| Data volume & variant complexity | Pangenomes encompass tens of thousands of variable genes and millions of SVs, while multi-allelic variants (e.g., tandem repeats) remain underrepresented, complicating downstream analysis. | Integrate transcriptomic, epigenomic, and phenotypic data through high-throughput pipelines, and employ gene-centric summarization workflows to extract core and dispensable gene sets for targeted allele discovery. |
| Computational & bioinformatic complexity | Linear pangenome representations lack positional context, and graph-based tools (vg, GraphTyper2) demand substantial compute and memory, limiting scalability for large, repeat-rich plant genomes. | Develop scalable software tools such as VRPG that combine linear reference coordinate projection, annotation integration, and advanced data structures for graph-based pangenome analysis. |
| Tool adaptation & resource constraints | Many pangenome tools were developed for human datasets, and breeding programs often lack high-performance computing and specialized bioinformatics expertise. | Establish plant-specific benchmarking frameworks and optimize human-derived tools for crop genomes, following best practices from recent methodological reviews. |
| Genotyping platform limitations | Conventional SNP arrays miss non-reference sequences and structural variants (SVs), and novel pan-genome arrays require integration with existing breeding decision-support systems. | Integrate outputs of the Rice Pangenome Genotyping Array (RPGA) with genome navigation tools like RiceNavi to streamline QTL pyramiding and breeding-route optimization within familiar breeder interfaces. |
| AI & machine learning gaps | AI/ML shows promise for variant detection and trait prediction but faces usability, data, and trust issues. | Develop accessible, explainable AI/ML tools tailored for breeding; standardize and share high-quality breeding datasets; invest in collaborative training and infrastructure; design user-friendly decision-support platforms; prioritize model transparency and integration with existing breeding workflows. |
| Translational & organizational hurdles | Introgression of novel alleles via traditional backcrossing is time-consuming and prone to linkage drag, while CRISPR/Cas9 editing faces regulatory and breeder-acceptance barriers. | Encourage partnerships among breeders, bioinformaticians, and policymakers to align pangenome data with regulations and breeding workflows through training and clear communication. |
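
The gene-centric summarization named in the first row of the table can be illustrated with a minimal Python sketch. It assumes a hypothetical gene presence/absence variation (PAV) matrix stored as a nested dictionary; the function name `classify_genes`, the `core_fraction` parameter, and the toy gene and accession labels are illustrative assumptions and do not come from any of the cited pipelines.

```python
def classify_genes(pav_matrix, core_fraction=1.0):
    """Label each gene as 'core' or 'dispensable' from a PAV matrix.

    pav_matrix: dict mapping gene ID -> dict mapping accession ID -> bool
                (True if the gene is present in that accession).
    core_fraction: minimum fraction of accessions that must carry a gene
                   for it to count as core (1.0 means all accessions).
                   This threshold is an assumption of the sketch.
    """
    categories = {}
    for gene, calls in pav_matrix.items():
        if not calls:
            raise ValueError(f"no presence/absence calls for {gene}")
        # Fraction of accessions in which the gene is present.
        frequency = sum(calls.values()) / len(calls)
        categories[gene] = "core" if frequency >= core_fraction else "dispensable"
    return categories


# Hypothetical toy data: three genes scored across three accessions.
toy_pav = {
    "geneA": {"acc1": True, "acc2": True, "acc3": True},
    "geneB": {"acc1": True, "acc2": False, "acc3": True},
    "geneC": {"acc1": False, "acc2": False, "acc3": True},
}

result = classify_genes(toy_pav)
for gene, category in result.items():
    print(gene, category)
```

With the strict default, only genes present in every accession are labeled core. Lowering `core_fraction` gives a softer definition that tolerates some missing calls, which may suit fragmented assemblies, although the appropriate threshold depends on the dataset.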

Author Contributions
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Schreiber, M., Jayakodi, M., Stein, N. & Mascher, M. Plant pangenomes for crop improvement, biodiversity and evolution. Nature Reviews Genetics 2024, 25, 563–577.
2. Zhou, Y.; et al. Pan-genome inversion index reveals evolutionary insights into the subpopulation structure of Asian rice. Nature Communications 2023, 14, 1567.
3. Hu, H., Zhao, J., Thomas, W.J.W., Batley, J. & Edwards, D. The role of pangenomics in orphan crop improvement. Nature Communications 2025, 16, 118.
4. Chen, E., Huang, X., Tian, Z., Wing, R.A. & Han, B. The genomics of Oryza species provides insights into rice domestication and heterosis. Annual Review of Plant Biology 2019, 70, 639–665.
5. Schatz, M.C.; et al. Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biology 2014, 15, 1–16.
6. Zhao, Q.; et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nature Genetics 2018, 50, 278–284.
7. Qin, P.; et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 2021, 184, 3542–3558.
8. Wang, W.; et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 2018, 557, 43–49.
9. Guo, D.; et al. A pangenome reference of wild and cultivated rice. Nature 2025, 1–10.
10. Yu, Z.; et al. Rice Gene Index: a comprehensive pan-genome database for comparative and functional genomics of Asian rice. Molecular Plant 2023, 16, 798–801.
11. Fornasiero, A.; et al. Oryza genome evolution through a tetraploid lens. Nature Genetics 2025, 57, 1287–1297.
12. Long, W.; et al. Genome evolution and diversity of wild and cultivated rice species. Nature Communications 2024, 15, 9994.
13. Hu, H.; et al. Plant pangenomics, current practice and future direction. Agriculture Communications 2024, 100039.
14. Zhang, F.; et al. Long-read sequencing of 111 rice genomes reveals significantly larger pan-genomes. Genome Research 2022, 32, 853–863.
15. Lin, Y.; et al. Identification of natural allelic variation in TTL1 controlling thermotolerance and grain size by a rice super pan-genome. Journal of Integrative Plant Biology 2023, 65, 2541–2551.
16. Chen, C.; et al. Natural Variation of PH8 Allele Improves Architecture and Cold Tolerance in Rice. Rice 2025, 18, 1–10.
17. Wei, X.; et al. Genomic investigation of 18,421 lines reveals the genetic architecture of rice. Science 2024, 385, eadm8762.
18. Zhou, Y.; et al. Graph pangenome captures missing heritability and empowers tomato breeding. Nature 2022, 606, 527–534.
19. Jayakodi, M., Schreiber, M., Stein, N. & Mascher, M. Building pan-genome infrastructures for crop plants and their use in association genetics. DNA Research 2021, 28, dsaa030.
20. Yang, L.; et al. GWAS meta-analysis using a graph-based pan-genome enhanced gene mining efficiency for agronomic traits in rice. Nature Communications 2025, 16, 3171.
21. Varshney, R.K.; et al. A chickpea genetic variation map based on the sequencing of 3,366 genomes. Nature 2021, 599, 622–627.
22. Huang, C., Chen, Z. & Liang, C. Oryza pan-genomics: A new foundation for future rice research and improvement. The Crop Journal 2021, 9, 622–632.
23. Wang, J.; et al. A pangenome analysis pipeline provides insights into functional gene identification in rice. Genome Biology 2023, 24, 19.
24. Wang, J.; et al. Pangenome-wide association study and transcriptome analysis reveal a novel QTL and candidate genes controlling both panicle and leaf blast resistance in rice. Rice 2024, 17, 27.
25. Woldegiorgis, S.T.; et al. Identification of heat-tolerant genes in non-reference sequences in rice by integrating pan-genome, transcriptomics, and QTLs. Genes 2022, 13, 1353.
26. Daware, A.; et al. Rice Pangenome Genotyping Array: an efficient genotyping solution for pangenome-based accelerated genetic improvement in rice. The Plant Journal 2023, 113, 26–46.
27. Naithani, S., Deng, C.H., Sahu, S.K. & Jaiswal, P. Exploring pan-genomes: an overview of resources and tools for unraveling structure, function, and evolution of crop genes and genomes. Biomolecules 2023, 13, 1403.
28. Guo, W.; et al. A barley pan-transcriptome reveals layers of genotype-dependent transcriptional complexity. Nature Genetics 2025, 1–10.
29. He, H.; et al. The pan-tandem repeat map highlights multiallelic variants underlying gene expression and agronomic traits in rice. Nature Communications 2024, 15, 7291.
30. Chodavarapu, R.K.; et al. Transcriptome and methylome interactions in rice hybrids. Proceedings of the National Academy of Sciences 2012, 109, 12040–12045.
31. Han, S.K.; et al. Mapping genomic regulation of kidney disease and traits through high-resolution and interpretable eQTLs. Nature Communications 2023, 14, 2229.
32. Hu, H., Li, R., Zhao, J., Batley, J. & Edwards, D. Technological development and advances for constructing and analyzing plant pangenomes. Genome Biology and Evolution 2024, 16, evae081.
33. Garrison, E.; et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nature Biotechnology 2018, 36, 875–879.
34. Eggertsson, H.P.; et al. GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs. Nature Communications 2019, 10, 5402.
35. Du, Z.-Z., He, J.-B. & Jiao, W.-B. A comprehensive benchmark of graph-based genetic variant genotyping algorithms on plant genomes for creating an accurate ensemble pipeline. Genome Biology 2024, 25, 91.
36. Miao, Z. & Yue, J.-X. Interactive visualization and interpretation of pangenome graphs by linear reference–based coordinate projection and annotation integration. Genome Research 2025, 35, 296–310.
37. Li, W.; et al. Plant pan-genomics: recent advances, new challenges, and roads ahead. Journal of Genetics and Genomics 2022, 49, 833–846.
38. Sun, C.; et al. RPAN: rice pan-genome browser for ~3000 rice genomes. Nucleic Acids Research 2017, 45, 597–605.
39. Xu, Y.; et al. Smart breeding driven by big data, artificial intelligence, and integrated genomic-enviromic prediction. Molecular Plant 2022, 15, 1664–1695.
40. Mishra, S., Srivastava, A.K., Khan, A.W., Tran, L.-S.P. & Nguyen, H.T. The era of panomics-driven gene discovery in plants. Trends in Plant Science 2024.
41. Wei, X.; et al. A quantitative genomics map of rice provides genetic insights and guides breeding. Nature Genetics 2021, 53, 243–253.
42. Wang, T.; et al. A rice variation map derived from 10 548 rice accessions reveals the importance of rare variants. Nucleic Acids Research 2023, 51, 10924–10933.
43. Hu, H., Danilevicz, M.F., Li, C. & Edwards, D. Pangenomics and Machine Learning in Improvement of Crop Plants. In Plant Molecular Breeding in Genomics Era: Concepts and Tools; Springer, 2024; pp. 321–347.
44. Bayer, P.E.; et al. The application of pangenomics and machine learning in genomic selection in plants. The Plant Genome 2021, 14, e20112.
45. Luo, C., Liu, Y.H. & Zhou, X.M. VolcanoSV enables accurate and robust structural variant calling in diploid genomes from single-molecule long read sequencing. Nature Communications 2024, 15, 6956.
46. Lin, J.; et al. SVision: a deep learning approach to resolve complex structural variants. Nature Methods 2022, 19, 1230–1233.
47. Zhang, Y.; et al. Revolutionizing Crop Breeding: Next-Generation Artificial Intelligence and Big Data-Driven Intelligent Design. Engineering 2024.
48. Shakoor, N., Northrup, D., Murray, S. & Mockler, T.C. Big data driven agriculture: big data analytics in plant breeding, genomics, and the use of remote sensing technologies to advance crop productivity. The Plant Phenome Journal 2019, 2, 1–8.
49. Varshney, R.K.; et al. Designing future crops: genomics-assisted breeding comes of age. Trends in Plant Science 2021, 26, 631–649.
50. Lisboa, P.J., Saralajew, S., Vellido, A., Fernández-Domenech, R. & Villmann, T. The coming of age of interpretable and explainable machine learning models. Neurocomputing 2023, 535, 25–39.
51. Talabi, A.O.; et al. Orphan crops: a best fit for dietary enrichment and diversification in highly deteriorated marginal environments. Frontiers in Plant Science 2022, 13, 839704.
52. Murmu, S.; et al. A review of artificial intelligence-assisted omics techniques in plant defense: current trends and future directions. Frontiers in Plant Science 2024, 15, 1292054.
53. van Dijk, A.D.J., Kootstra, G., Kruijer, W. & de Ridder, D. Machine learning in plant science and plant breeding. iScience 2021, 24.
54. Ghamkhar, K., Hay, F.R., Engbers, M., Dempewolf, H. & Schurr, U. Realizing the potential of plant genetic resources: the use of phenomics for genebanks. Plants, People, Planet 2025, 7, 23–32.
55. Wang, C., Hu, S., Gardner, C. & Lübberstedt, T. Emerging avenues for utilization of exotic germplasm. Trends in Plant Science 2017, 22, 624–637.
56. Dong, O.X.; et al. Marker-free carotenoid-enriched rice generated through targeted gene insertion using CRISPR-Cas9. Nature Communications 2020, 11, 1178.
57. Shang, L.; et al. A super pan-genomic landscape of rice. Cell Research 2022, 32, 878–896.
58. Hu, H.; et al. Unravelling inversions: Technological advances, challenges, and potential impact on crop breeding. Plant Biotechnology Journal 2024, 22, 544–554.
59. Varshney, R.K.; et al. Analytical and decision support tools for genomics-assisted breeding. Trends in Plant Science 2016, 21, 354–363.
60. Bayer, P.E., Golicz, A.A., Scheben, A., Batley, J. & Edwards, D. Plant pan-genomes are the new reference. Nature Plants 2020, 6, 914–920.
61. Tuggle, C.K.; et al. Current challenges and future of agricultural genomes to phenomes in the USA. Genome Biology 2024, 25, 8.
62. Aziz, M.A. & Masmoudi, K. Molecular breakthroughs in modern plant breeding techniques. Horticultural Plant Journal 2025, 11, 15–41.
63. Wu, D.; et al. A syntelog-based pan-genome provides insights into rice domestication and de-domestication. Genome Biology 2023, 24, 179.
64. Monat, C.; et al. De novo assemblies of three Oryza glaberrima accessions provide first insights about pan-genome of African rices. Genome Biology and Evolution 2017, 9, 1–6.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).