Submitted: 21 August 2023
Posted: 23 August 2023
Abstract
Keywords:
1. Introduction
2. Pangenome Construction, Visualization, and Data Analysis Tools
3. A Survey of Crop Pan-genomes and Genomic Resources
4. Plant pan-genomics-driven insights for understanding the basis of agronomic traits
5. Outlook, opportunities, and innovations in plant pan-genome research
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Computational Pan-Genomics, Consortium. "Computational Pan-Genomics: Status, Promises and Challenges." Brief Bioinform 19, no. 1 (2018): 118-35.
- Della Coletta, R.; Qiu, Y.; Ou, S.; Hufford, M.B.; Hirsch, C.N. How the pan-genome is changing crop genomics and improvement. Genome Biol. 2021, 22, 1–19. [Google Scholar] [CrossRef]
- Ho, S.S.; Urban, A.E.; Mills, R.E. Structural variation in the sequencing era. Nat. Rev. Genet. 2019, 21, 171–189. [Google Scholar] [CrossRef]
- Kyriakidou, M.; Tai, H.H.; Anglin, N.L.; Ellis, D.; Strömvik, M.V. Current Strategies of Polyploid Plant Genome Sequence Assembly. Front. Plant Sci. 2018, 9, 1660. [Google Scholar] [CrossRef] [PubMed]
- Sedlazeck, F.J.; Rescheneder, P.; Smolka, M.; Fang, H.; Nattestad, M.; von Haeseler, A.; Schatz, M.C. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 2018, 15, 461–468. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y., J. Yu, M. Jiang, W. Lei, X. Zhang, and H. Tang. "Sequencing and Assembly of Polyploid Genomes." Methods in Molecular Biology 2545 (2023): 429-58.
- Sahu, S.K.; Liu, H. Long-read sequencing (method of the year 2022): The way forward for plant omics research. Mol. Plant 2023, 16, 791–793. [Google Scholar] [CrossRef] [PubMed]
- Zhou, Y.; Chebotarov, D.; Kudrna, D.; Llaca, V.; Lee, S.; Rajasekar, S.; Mohammed, N.; Al-Bader, N.; Sobel-Sorenson, C.; Parakkal, P.; et al. A platinum standard pan-genome resource that represents the population structure of Asian rice. Sci. Data 2020, 7, 1–11. [Google Scholar] [CrossRef]
- Wang, W.; Mauleon, R.; Hu, Z.; Chebotarov, D.; Tai, S.; Wu, Z.; Li, M.; Zheng, T.; Fuentes, R.R.; Zhang, F.; et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 2018, 557, 43–49. [Google Scholar] [CrossRef]
- Schatz, M. C. G. Maron, J. C. Stein, A. Hernandez Wences, J. Gurtowski, E. Biggers, H. Lee, M. Kramer, E. Antoniou, E. Ghiban, M. H. Wright, J. M. Chia, D. Ware, S. R. McCouch, and W. R. McCombie. "Whole Genome De Novo Assemblies of Three Divergent Strains of Rice, Oryza Sativa, Document Novel Gene Space of Aus and Indica." Genome Biology 15, no. 11 (2014): 506.
- Jayakodi, M.; Padmarasu, S.; Haberer, G.; Bonthala, V.S.; Gundlach, H.; Monat, C.; Lux, T.; Kamal, N.; Lang, D.; Himmelbach, A.; et al. The barley pan-genome reveals the hidden legacy of mutation breeding. Nature 2020, 588, 284–289. [Google Scholar] [CrossRef]
- Walkowiak, S.; Gao, L.; Monat, C.; Haberer, G.; Kassa, M.T.; Brinton, J.; Ramirez-Gonzalez, R.H.; Kolodziej, M.C.; Delorean, E.; Thambugala, D.; et al. Multiple wheat genomes reveal global variation in modern breeding. Nature 2020, 588, 277–283. [Google Scholar] [CrossRef]
- Hirsch, C.N.; Foerster, J.M.; Johnson, J.M.; Sekhon, R.S.; Muttoni, G.; Vaillancourt, B.; Peñagaricano, F.; Lindquist, E.; Pedraza, M.A.; Barry, K.; et al. Insights into the Maize Pan-Genome and Pan-Transcriptome. Plant Cell 2014, 26, 121–135. [Google Scholar] [CrossRef]
- Liu, Y.; Du, H.; Li, P.; Shen, Y.; Peng, H.; Liu, S.; Zhou, G.-A.; Zhang, H.; Liu, Z.; Shi, M.; et al. Pan-Genome of Wild and Cultivated Soybeans. Cell 2020, 182, 162–176. [Google Scholar] [CrossRef] [PubMed]
- Li, Y.-H.; Zhou, G.; Ma, J.; Jiang, W.; Jin, L.-G.; Zhang, Z.; Guo, Y.; Zhang, J.; Sui, Y.; Zheng, L.; et al. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits. Nat. Biotechnol. 2014, 32, 1045–1052. [Google Scholar] [CrossRef] [PubMed]
- Song, J.-M.; Guan, Z.; Hu, J.; Guo, C.; Yang, Z.; Wang, S.; Liu, D.; Wang, B.; Lu, S.; Zhou, R.; et al. Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nat. Plants 2020, 6, 34–45. [Google Scholar] [CrossRef]
- Zhuang, W. Chen, M. Yang, J. Wang, M. K. Pandey, C. Zhang, W. C. Chang, L. Zhang, X. Zhang, R. Tang, V. Garg, X. Wang, H. Tang, C. N. Chow, J. Wang, Y. Deng, D. Wang, A. W. Khan, Q. Yang, T. Cai, P. Bajaj, K. Wu, B. Guo, X. Zhang, J. Li, F. Liang, J. Hu, B. Liao, S. Liu, A. Chitikineni, H. Yan, Y. Zheng, S. Shan, Q. Liu, D. Xie, Z. Wang, S. A. Khan, N. Ali, C. Zhao, X. Li, Z. Luo, S. Zhang, R. Zhuang, Z. Peng, S. Wang, G. Mamadou, Y. Zhuang, Z. Zhao, W. Yu, F. Xiong, W. Quan, M. Yuan, Y. Li, H. Zou, H. Xia, L. Zha, J. Fan, J. Yu, W. Xie, J. Yuan, K. Chen, S. Zhao, W. Chu, Y. Chen, P. Sun, F. Meng, T. Zhuo, Y. Zhao, C. Li, G. He, Y. Zhao, C. Wang, P. B. Kavikishor, R. L. Pan, A. H. Paterson, X. Wang, R. Ming, and R. K. Varshney. "The Genome of Cultivated Peanut Provides Insight into Legume Karyotypes, Polyploid Evolution and Crop Domestication." Nature Genetics 51, no. 5 (2019): 865-76.
- International Wheat Genome Sequencing, Consortium. "Shifting the Limits in Wheat Research and Breeding Using a Fully Annotated Reference Genome." Science 361, no. 6403 (2018).
- Edger, P.P.; Poorten, T.J.; VanBuren, R.; Hardigan, M.A.; Colle, M.; McKain, M.R.; Smith, R.D.; Teresi, S.J.; Nelson, A.D.L.; Wai, C.M.; et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 2019, 51, 541–547. [Google Scholar] [CrossRef] [PubMed]
- Kyriakidou, M.; Anglin, N.L.; Ellis, D.; Tai, H.H.; Strömvik, M.V. Genome assembly of six polyploid potato genomes. Sci. Data 2020, 7, 1–6. [Google Scholar] [CrossRef] [PubMed]
- Shang, L.; Li, X.; He, H.; Yuan, Q.; Song, Y.; Wei, Z.; Lin, H.; Hu, M.; Zhao, F.; Zhang, C.; et al. A super pan-genomic landscape of rice. Cell Res. 2022, 32, 878–896. [Google Scholar] [CrossRef]
- He, Q.; Tang, S.; Zhi, H.; Chen, J.; Zhang, J.; Liang, H.; Alam, O.; Li, H.; Zhang, H.; Xing, L.; et al. A graph-based genome and pan-genome variation of the model plant Setaria. Nat. Genet. 2023, 55, 1232–1242. [Google Scholar] [CrossRef]
- Yap, I. V., D. Schneider, J. Kleinberg, D. Matthews, S. Cartinhour, and S. R. McCouch. "A Graph-Theoretic Approach to Comparing and Integrating Genetic, Physical and Sequence-Based Maps." Genetics 165, no. 4 (2003): 2235-47.
- Tettelin, H.; Masignani, V.; Cieslewicz, M.J.; Donati, C.; Medini, D.; Ward, N.L.; Angiuoli, S.V.; Crabtree, J.; Jones, A.L.; Durkin, A.S.; et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial "pan-genome". Proc. Natl. Acad. Sci. USA 2005, 102, 13950–13955. [Google Scholar] [CrossRef]
- Springer, N.M.; Ying, K.; Fu, Y.; Ji, T.; Yeh, C.-T.; Jia, Y.; Wu, W.; Richmond, T.; Kitzman, J.; Rosenbaum, H.; et al. Maize Inbreds Exhibit High Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in Genome Content. PLOS Genet. 2009, 5, e1000734. [Google Scholar] [CrossRef]
- E Anderson, J.; Kantar, M.B.; Kono, T.Y.; Fu, F.; O Stec, A.; Song, Q.; Cregan, P.B.; E Specht, J.; Diers, B.W.; Cannon, S.B.; et al. A Roadmap for Functional Structural Variants in the Soybean Genome. G3 Genes|Genomes|Genetics 2014, 4, 1307–1318. [Google Scholar] [CrossRef]
- Golicz, A.A.; Bayer, P.E.; Barker, G.C.; Edger, P.P.; Kim, H.; Martinez, P.A.; Chan, C.K.K.; Severn-Ellis, A.; McCombie, W.R.; Parkin, I.A.P.; et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat. Commun. 2016, 7, 13390. [Google Scholar] [CrossRef]
- Tao, Y.; Luo, H.; Xu, J.; Cruickshank, A.; Zhao, X.; Teng, F.; Hathorn, A.; Wu, X.; Liu, Y.; Shatte, T.; et al. Extensive variation within the pan-genome of cultivated and wild sorghum. Nat. Plants 2021, 7, 766–773. [Google Scholar] [CrossRef]
- Xu, X.; Liu, X.; Ge, S.; Jensen, J.D.; Hu, F.; Dong, Y.; Gutenkunst, R.N.; Fang, L.; Huang, L.; Li, J.; et al. Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nat. Biotechnol. 2011, 30, 105–111. [Google Scholar] [CrossRef] [PubMed]
- Lam, H.-M.; Xu, X.; Liu, X.; Chen, W.; Yang, G.; Wong, F.-L.; Li, M.-W.; He, W.; Qin, N.; Wang, B.; et al. Resequencing of 31 wild and cultivated soybean genomes identifies patterns of genetic diversity and selection. Nat. Genet. 2010, 42, 1053–1059. [Google Scholar] [CrossRef] [PubMed]
- Gui, S.; Wei, W.; Jiang, C.; Luo, J.; Chen, L.; Wu, S.; Li, W.; Wang, Y.; Li, S.; Yang, N.; et al. A pan-Zea genome map for enhancing maize improvement. Genome Biol. 2022, 23, 1–22. [Google Scholar] [CrossRef] [PubMed]
- Allaby, R.G.; Ware, R.L.; Kistler, L. A re-evaluation of the domestication bottleneck from archaeogenomic evidence. Evol. Appl. 2018, 12, 29–37. [Google Scholar] [CrossRef]
- Tirnaz, S. Zandberg, W. J. W. Thomas, J. Marsh, D. Edwards, and J. Batley. "Application of Crop Wild Relatives in Modern Breeding: An Overview of Resources, Experimental and Computational Methodologies." Frontiers of Plant Science 13 (2022): 1008904.
- Papa, R.; Gepts, P. Asymmetry of gene flow and differential geographical structure of molecular diversity in wild and domesticated common bean (Phaseolus vulgaris L.) from Mesoamerica. Theor. Appl. Genet. 2003, 106, 239–250. [Google Scholar] [CrossRef]
- McNally, K.L.; Childs, K.L.; Bohnert, R.; Davidson, R.M.; Zhao, K.; Ulat, V.J.; Zeller, G.; Clark, R.M.; Hoen, D.R.; Bureau, T.E.; et al. Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proc. Natl. Acad. Sci. 2009, 106, 12273–12278. [Google Scholar] [CrossRef]
- Brozynska, M.; Furtado, A.; Henry, R.J. Genomics of crop wild relatives: expanding the gene pool for crop improvement. Plant Biotechnol. J. 2015, 14, 1070–1085. [Google Scholar] [CrossRef]
- Bohra, A.; Kilian, B.; Sivasankar, S.; Caccamo, M.; Mba, C.; McCouch, S.R.; Varshney, R.K. Reap the crop wild relatives for breeding future crops. Trends Biotechnol. 2022, 40, 412–431. [Google Scholar] [CrossRef]
- McCouch, S. R. , and L. H. Rieseberg. "Harnessing Crop Diversity." Proceedings of the National Academy of Sciences of the United States of America 120, no. 14 (2023): e2221410120.
- McCouch, S. Toward a plant genomics initiative: Thoughts on the value of cross-species and cross-genera comparisons in the grasses. Proc. Natl. Acad. Sci. 1998, 95, 1983–1985. [Google Scholar] [CrossRef] [PubMed]
- Würschum, T.; Rapp, M.; Miedaner, T.; Longin, C.F.H.; Leiser, W.L. Copy number variation of Ppd-B1 is the major determinant of heading time in durum wheat. BMC Genet. 2019, 20, 64–8. [Google Scholar] [CrossRef] [PubMed]
- Knox, A.K.; Dhillon, T.; Cheng, H.; Tondelli, A.; Pecchioni, N.; Stockinger, E.J. CBF gene copy number variation at Frost Resistance-2 is associated with levels of freezing tolerance in temperate-climate cereals. Theor. Appl. Genet. 2010, 121, 21–35. [Google Scholar] [CrossRef] [PubMed]
- Maron, L. G. T. Guimaraes, M. Kirst, P. S. Albert, J. A. Birchler, P. J. Bradbury, E. S. Buckler, A. E. Coluccio, T. V. Danilova, D. Kudrna, J. V. Magalhaes, M. A. Pineros, M. C. Schatz, R. A. Wing, and L. V. Kochian. "Aluminum Tolerance in Maize Is Associated with Higher Mate1 Gene Copy Number." Proceedings of the National Academy of Sciences of the United States of America 110, no. 13 (2013): 5241-6.
- Cook, D.E.; Lee, T.G.; Guo, X.; Melito, S.; Wang, K.; Bayless, A.M.; Wang, J.; Hughes, T.J.; Willis, D.K.; Clemente, T.E.; et al. Copy Number Variation of Multiple Genes at Rhg1 Mediates Nematode Resistance in Soybean. Science 2012, 338, 1206–1209. [Google Scholar] [CrossRef] [PubMed]
- Liu, Q.; Xu, J.; Zhu, Y.; Mo, Y.; Yao, X.-F.; Wang, R.; Ku, W.; Huang, Z.; Xia, S.; Tong, J.; et al. The Copy Number Variation of OsMTD1 Regulates Rice Plant Architecture. Front. Plant Sci. 2021, 11. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Xiong, G.; Hu, J.; Jiang, L.; Yu, H.; Xu, J.; Fang, Y.; Zeng, L.; Xu, E.; Xu, J.; et al. Copy number variation at the GL7 locus contributes to grain size diversity in rice. Nat. Genet. 2015, 47, 944–948. [Google Scholar] [CrossRef]
- Bosman, R.N.; Vervalle, J.A.-M.; November, D.L.; Burger, P.; Lashbrooke, J.G. Grapevine genome analysis demonstrates the role of gene copy number variation in the formation of monoterpenes. Front. Plant Sci. 2023, 14, 1112214. [Google Scholar] [CrossRef]
- Falginella, L., S. D. Castellarin, R. Testolin, G. A. Gambetta, M. Morgante, and G. Di Gaspero. "Expansion and Subfunctionalisation of Flavonoid 3',5'-Hydroxylases in the Grapevine Lineage." BMC Genomics 11 (2010): 562.
- Nilsen, K. T., S. Walkowiak, D. Xiang, P. Gao, T. D. Quilichini, I. R. Willick, B. Byrns, A. N'Diaye, J. Ens, K. Wiebe, Y. Ruan, R. D. Cuthbert, M. Craze, E. J. Wallington, J. Simmonds, C. Uauy, R. Datla, and C. J. Pozniak. "Copy Number Variation of Tddof Controls Solid-Stemmed Architecture in Wheat." Proceedings of the National Academy of Sciences of the United States of America 117, no. 46 (2020): 28708-18.
- Gao, L.; Gonda, I.; Sun, H.; Ma, Q.; Bao, K.; Tieman, D.M.; Burzynski-Chang, E.A.; Fish, T.L.; Stromberg, K.A.; Sacks, G.L.; et al. The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat. Genet. 2019, 51, 1044–1051. [Google Scholar] [CrossRef]
- Liu, J.; Dawe, R.K. Large haplotypes highlight a complex age structure within the maize pan-genome. Genome Res. 2023, 33, 359–370. [Google Scholar] [CrossRef]
- Tao, Y.; Zhao, X.; Mace, E.; Henry, R.; Jordan, D. Exploring and Exploiting Pan-genomics for Crop Improvement. Mol. Plant 2019, 12, 156–169. [Google Scholar] [CrossRef]
- Bayer, P.E.; Golicz, A.A.; Scheben, A.; Batley, J.; Edwards, D. Plant pan-genomes are the new reference. Nat. Plants 2020, 6, 914–920. [Google Scholar] [CrossRef]
- Jayakodi, M.; Schreiber, M.; Stein, N.; Mascher, M. Building pan-genome infrastructures for crop plants and their use in association genetics. DNA Res. 2021, 28. [Google Scholar] [CrossRef] [PubMed]
- Li, W., J. Liu, H. Zhang, Z. Liu, Y. Wang, L. Xing, Q. He, and H. Du. "Plant Pan-Genomics: Recent Advances, New Challenges, and Roads Ahead." Journal of Genetics and Genomics 49, no. 9 (2022): 833-46.
- Yan, H.; Sun, M.; Zhang, Z.; Jin, Y.; Zhang, A.; Lin, C.; Wu, B.; He, M.; Xu, B.; Wang, J.; et al. Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet. Nat. Genet. 2023, 55, 507–518. [Google Scholar] [CrossRef] [PubMed]
- Zhou, H.; Yan, F.; Hao, F.; Ye, H.; Yue, M.; Woeste, K.; Zhao, P.; Zhang, S. Pan-genome and transcriptome analyses provide insights into genomic variation and differential gene expression profiles related to disease resistance and fatty acid biosynthesis in eastern black walnut (Juglans nigra). Hortic. Res. 2023, 10, uhad015. [Google Scholar] [CrossRef] [PubMed]
- Golicz, A. A., J. Batley, and D. Edwards. "Towards Plant Pangenomics." Plant Biotechnology Journal 14, no. 4 (2016): 1099-105.
- Garrison, E.; Sirén, J.; Novak, A.M.; Hickey, G.; Eizenga, J.M.; Dawson, E.T.; Jones, W.; Garg, S.; Markello, C.; Lin, M.F.; et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat. Biotechnol. 2018, 36, 875–879. [Google Scholar] [CrossRef]
- Rakocevic, G.; Semenyuk, V.; Lee, W.-P.; Spencer, J.; Browning, J.; Johnson, I.J.; Arsenijevic, V.; Nadj, J.; Ghose, K.; Suciu, M.C.; et al. Fast and accurate genomic analyses using genome graphs. Nat. Genet. 2019, 51, 354–362. [Google Scholar] [CrossRef]
- Cheng, H.; Concepcion, G.T.; Feng, X.; Zhang, H.; Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 2021, 18, 170–175. [Google Scholar] [CrossRef]
- Padgitt-Cobb, L.K.; Kingan, S.B.; Wells, J.; Elser, J.; Kronmiller, B.; Moore, D.; Concepcion, G.; Peluso, P.; Rank, D.; Jaiswal, P.; et al. A draft phased assembly of the diploid Cascade hop (Humulus lupulus) genome. Plant Genome 2021, 14, e20072. [Google Scholar] [CrossRef]
- Eizenga, J. M. M. Novak, J. A. Sibbesen, S. Heumos, A. Ghaffaari, G. Hickey, X. Chang, J. D. Seaman, R. Rounthwaite, J. Ebler, M. Rautiainen, S. Garg, B. Paten, T. Marschall, J. Siren, and E. Garrison. "Pangenome Graphs." Annu Rev Genomics Hum Genet 21 (2020): 139-62.
- Hickey, G.; Heller, D.; Monlong, J.; Sibbesen, J.A.; Sirén, J.; Eizenga, J.; Dawson, E.T.; Garrison, E.; Novak, A.M.; Paten, B. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biol. 2020, 21, 1–17. [Google Scholar] [CrossRef]
- Vernikos, G. S. "A Review of Pangenome Tools and Recent Studies." In The Pangenome: Diversity, Dynamics and Evolution of Genomes, edited by H. Tettelin and D. Medini, 89-112. Cham (CH), 2020.
- Glick, L.; Mayrose, I. The Effect of Methodological Considerations on the Construction of Gene-Based Plant Pan-genomes. Genome Biol. Evol. 2023, 15. [Google Scholar] [CrossRef]
- Koren, S.; Walenz, B.P.; Berlin, K.; Miller, J.R.; Bergman, N.H.; Phillippy, A.M. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017, 27, 722–736. [Google Scholar] [CrossRef]
- Kolmogorov, M.; Yuan, J.; Lin, Y.; Pevzner, P.A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019, 37, 540–546. [Google Scholar] [CrossRef] [PubMed]
- Swain, M.T.; Tsai, I.J.; A Assefa, S.; Newbold, C.; Berriman, M.; Otto, T.D. A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs. Nat. Protoc. 2012, 7, 1260–1284. [Google Scholar] [CrossRef] [PubMed]
- Li, D.; Liu, C.-M.; Luo, R.; Sadakane, K.; Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 2015, 31, 1674–1676. [Google Scholar] [CrossRef] [PubMed]
- Tolstoganov, I.; Bankevich, A.; Chen, Z.; A Pevzner, P. cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs. Bioinformatics 2019, 35, i61–i70. [Google Scholar] [CrossRef]
- Meleshko, D.; Mohimani, H.; Tracanna, V.; Hajirasouliha, I.; Medema, M.H.; Korobeynikov, A.; Pevzner, P.A. BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs. Genome Res. 2019, 29, 1352–1362. [Google Scholar] [CrossRef]
- Li, H.; Feng, X.; Chu, C. The design and construction of reference pangenome graphs with minigraph. Genome Biol. 2020, 21, 1–19. [Google Scholar] [CrossRef]
- Garrison, E., A. Guarracino, S. Heumos, F. Villani, Z. Bao, L. Tattini, J. Hagmann, S. Vorbrugg, S. Marco-Sola, C. Kubica, D. G. Ashbrook, K. Thorell, R. L. Rusholme-Pilcher, G. Liti, E. Rudbeck, S. Nahnsen, Z. Yang, M. N. Moses, F. L. Nobrega, Y. Wu, H. Chen, J. de Ligt, P. H. Sudmant, N. Soranzo, V. Colonna, R. W. Williams, and P. Prins. "Building Pangenome Graphs." bioRxiv (2023).
- Hickey, G.; Monlong, J.; Ebler, J.; Novak, A.M.; Eizenga, J.M.; Gao, Y.; Abel, H.J.; Antonacci-Fulton, L.L.; Asri, M.; Baid, G.; et al. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat. Biotechnol. 2023, 1–11. [Google Scholar] [CrossRef]
- Armstrong, J.; Hickey, G.; Diekhans, M.; Fiddes, I.T.; Novak, A.M.; Deran, A.; Fang, Q.; Xie, D.; Feng, S.; Stiller, J.; et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 2020, 587, 246–251. [Google Scholar] [CrossRef]
- Jonkheer, E. M. M. van Workum, S. Sheikhizadeh Anari, B. Brankovics, J. R. de Haan, L. Berke, T. A. J. van der Lee, D. de Ridder, and S. Smit. "Pantools V3: Functional Annotation, Classification and Phylogenomics." Bioinformatics 38, no. 18 (2022): 4403-05.
- Guarracino, A.; Heumos, S.; Nahnsen, S.; Prins, P.; Garrison, E. ODGI: understanding pangenome graphs. Bioinformatics 2022, 38, 3319–3326. [Google Scholar] [CrossRef]
- Ewels, P.A.; Peltzer, A.; Fillinger, S.; Patel, H.; Alneberg, J.; Wilm, A.; Garcia, M.U.; Di Tommaso, P.; Nahnsen, S. The nf-core framework for community-curated bioinformatics pipelines. Nat. Biotechnol. 2020, 38, 276–278. [Google Scholar] [CrossRef] [PubMed]
- Vaughn, J.N.; Branham, S.E.; Abernathy, B.; Hulse-Kemp, A.M.; Rivers, A.R.; Levi, A.; Wechter, W.P. Graph-based pangenomics maximizes genotyping density and reveals structural impacts on fungal resistance in melon. Nat. Commun. 2022, 13, 1–14. [Google Scholar] [CrossRef] [PubMed]
- Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 2018, 34, 3094–3100. [Google Scholar] [CrossRef] [PubMed]
- Marçais, G.; Delcher, A.L.; Phillippy, A.M.; Coston, R.; Salzberg, S.L.; Zimin, A. MUMmer4: A fast and versatile genome alignment system. PLOS Comput. Biol. 2018, 14, e1005944. [Google Scholar] [CrossRef]
- Rautiainen, M.; Marschall, T. GraphAligner: rapid and versatile sequence-to-graph alignment. Genome Biol. 2020, 21, 1–28. [Google Scholar] [CrossRef]
- Kavya, V.N.S.; Tayal, K.; Srinivasan, R.; Sivadasan, N. Sequence Alignment on Directed Graphs. J. Comput. Biol. 2019, 26, 53–67. [Google Scholar] [CrossRef]
- Büchler, T.; Olbrich, J.; Ohlebusch, E. Efficient short read mapping to a pangenome that is represented by a graph of ED strings. Bioinformatics 2023, 39. [Google Scholar] [CrossRef]
- Poplin, R.; Chang, P.-C.; Alexander, D.; Schwartz, S.; Colthurst, T.; Ku, A.; Newburger, D.; Dijamco, J.; Nguyen, N.; Afshar, P.T.; et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 2018, 36, 983–987. [Google Scholar] [CrossRef]
- Yun, T.; Li, H.; Chang, P.-C.; Lin, M.F.; Carroll, A.; McLean, C.Y. Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics 2020, 36, 5582–5589. [Google Scholar] [CrossRef]
- Chiang, C.; Layer, R.M.; Faust, G.G.; Lindberg, M.R.; Rose, D.B.; Garrison, E.P.; Marth, G.T.; Quinlan, A.R.; Hall, I.M. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat. Methods 2015, 12, 966–968. [Google Scholar] [CrossRef]
- Eggertsson, H.P.; Jonsson, H.; Kristmundsdottir, S.; Hjartarson, E.; Kehr, B.; Masson, G.; Zink, F.; E Hjorleifsson, K.; Jonasdottir, A.; Jonasdottir, A.; et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat. Genet. 2017, 49, 1654–1660. [Google Scholar] [CrossRef]
- Ebler, J.; Ebert, P.; Clarke, W.E.; Rausch, T.; Audano, P.A.; Houwaart, T.; Mao, Y.; Korbel, J.O.; Eichler, E.E.; Zody, M.C.; et al. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat. Genet. 2022, 54, 518–525. [Google Scholar] [CrossRef]
- Naithani, S.; Geniza, M.; Jaiswal, P. Variant Effect Prediction Analysis Using Resources Available at Gramene Database. 1533. [CrossRef]
- Emms, D.M.; Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019, 20, 1–14. [Google Scholar] [CrossRef] [PubMed]
- Li, L. J. Stoeckert, Jr., and D. S. Roos. "Orthomcl: Identification of Ortholog Groups for Eukaryotic Genomes." Genome Research 13, no. 9 (2003): 2178-89.
- Miller, J.B.; Pickett, B.D.; Ridge, P.G. JustOrthologs: a fast, accurate and user-friendly ortholog identification algorithm. Bioinformatics 2018, 35, 546–552. [Google Scholar] [CrossRef] [PubMed]
- Zhou, S.; Chen, Y.; Guo, C.; Qi, J. PhyloMCL: Accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events. Methods Ecol. Evol. 2020, 11, 943–954. [Google Scholar] [CrossRef]
- Altenhoff, A. M., C. M. Train, K. J. Gilbert, I. Mediratta, T. Mendes de Farias, D. Moi, Y. Nevers, H. S. Radoykova, V. Rossier, A. Warwick Vesztrocy, N. M. Glover, and C. Dessimoz. "Oma Orthology in 2021: Website Overhaul, Conserved Isoforms, Ancestral Gene Order and More." Nucleic Acids Res 49, no. D1 (2021): D373-D79.
- Persson, E.; Sonnhammer, E.L.L. InParanoid-DIAMOND: faster orthology analysis with the InParanoid algorithm. Bioinformatics 2022, 38, 2918–2919. [Google Scholar] [CrossRef]
- Naithani, S.; Gupta, P.; Preece, J.; D’eustachio, P.; Elser, J.L.; Garg, P.; A Dikeman, D.; Kiff, J.; Cook, J.; Olson, A.; et al. Plant Reactome: a knowledgebase and resource for comparative pathway analysis. Nucleic Acids Res. 2019, 48, D1093–D1103. [Google Scholar] [CrossRef]
- Durant. ; Sabot, F.; Conte, M.; Rouard, M. Panache: a web browser-based viewer for linearized pangenomes. Bioinformatics 2021, 37, 4556–4558. [Google Scholar] [CrossRef]
- Droc, G.; Martin, G.; Guignon, V.; Summo, M.; Sempéré, G.; Durant, E.; Soriano, A.; Baurens, F.-C.; Cenci, A.; Breton, C.; et al. The banana genome hub: a community database for genomics in the Musaceae. Hortic. Res. 2022, 9, uhac221. [Google Scholar] [CrossRef]
- Yokoyama, T. T., Y. Sakamoto, M. Seki, Y. Suzuki, and M. Kasahara. "Momi-G: Modular Multi-Scale Integrated Genome Graph Browser." BMC Bioinformatics 20, no. 1 (2019): 548.
- Wick, R.R.; Schultz, M.B.; Zobel, J.; Holt, K.E. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics 2015, 31, 3350–3352. [Google Scholar] [CrossRef]
- Beyer, W.; Novak, A.M.; Hickey, G.; Chan, J.; Tan, V.; Paten, B.; Zerbino, D.R. Sequence tube maps: making graph genomes intuitive to commuters. Bioinformatics 2019, 35, 5318–5320. [Google Scholar] [CrossRef]
- Gonnella, G.; Niehus, N.; Kurtz, S. GfaViz: flexible and interactive visualization of GFA sequence graphs. Bioinformatics 2018, 35, 2853–2855. [Google Scholar] [CrossRef]
- Mikheenko, A.; Kolmogorov, M. Assembly Graph Browser: interactive visualization of assembly graphs. Bioinformatics 2019, 35, 3476–3478. [Google Scholar] [CrossRef]
- Kunyavskaya, O.; Prjibelski, A.D. SGTK: a toolkit for visualization and assessment of scaffold graphs. Bioinformatics 2018, 35, 2303–2305. [Google Scholar] [CrossRef] [PubMed]
- Durbin, R. Efficient haplotype matching and storage using the positional Burrows–Wheeler transform (PBWT). Bioinformatics 2014, 30, 1266–1272. [Google Scholar] [CrossRef] [PubMed]
- Novak, A.M.; Garrison, E.; Paten, B. A graph extension of the positional Burrows–Wheeler transform and its applications. Algorithms Mol. Biol. 2017, 12, 1–12. [Google Scholar] [CrossRef]
- Grytten, I., K. D. Rand, A. J. Nederbragt, G. O. Storvik, I. K. Glad, and G. K. Sandve. "Graph Peak Caller: Calling Chip-Seq Peaks on Graph-Based Reference Genomes." PLoS Comput Biol 15, no. 2 (2019): e1006731.
- Wang, J.; Yang, W.; Zhang, S.; Hu, H.; Yuan, Y.; Dong, J.; Chen, L.; Ma, Y.; Yang, T.; Zhou, L.; et al. A pangenome analysis pipeline provides insights into functional gene identification in rice. Genome Biol. 2023, 24, 1–22. [Google Scholar] [CrossRef] [PubMed]
- Qamar, M.T.U.; Zhu, X.; Xing, F.; Chen, L.-L. ppsPCP: a plant presence/absence variants scanner and pan-genome construction pipeline. Bioinformatics 2019, 35, 4156–4158. [Google Scholar] [CrossRef]
- Harper, L.; Campbell, J.; Cannon, E.K.S.; Jung, S.; Poelchau, M.; Walls, R.; Andorf, C.; Arnaud, E.; Berardini, T.Z.; Birkett, C.; et al. AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture. Database 2018, 2018. [Google Scholar] [CrossRef]
- Adam-Blondon, A.-F.; Alaux, M.; Pommier, C.; Cantu, D.; Cheng, Z.-M.; Cramer, G.; Davies, C.; Delrot, S.; Deluc, L.; Di Gaspero, G.; et al. Towards an open grapevine information system. Hortic. Res. 2016, 3, 16056. [Google Scholar] [CrossRef]
- Bolser, D., D. M. Staines, E. Pritchard, and P. Kersey. "Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data." Methods in Molecular Biology 1374 (2016): 115-40.
- Gupta, P.; Naithani, S.; Preece, J.; Kim, S.; Cheng, T.; D’eustachio, P.; Elser, J.; Bolton, E.E.; Jaiswal, P. Plant Reactome and PubChem: The Plant Pathway and (Bio)Chemical Entity Knowledgebases. 2443. [CrossRef]
- Tello-Ruiz, M.K.; Naithani, S.; Gupta, P.; Olson, A.; Wei, S.; Preece, J.; Jiao, Y.; Wang, B.; Chougule, K.; Garg, P.; et al. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res. 2020, 49, D1452–D1463. [Google Scholar] [CrossRef] [PubMed]
- Pasha, A.; Subramaniam, S.; Cleary, A.; Chen, X.; Berardini, T.; Farmer, A.; Town, C.; Provart, N. Araport Lives: An Updated Framework for Arabidopsis Bioinformatics. Plant Cell 2020, 32, 2683–2686. [Google Scholar] [CrossRef] [PubMed]
- Shamimuzzaman, M., J. M. Gardiner, A. T. Walsh, D. A. Triant, J. J. Le Tourneau, A. Tayal, D. R. Unni, H. N. Nguyen, J. L. Portwood, 2nd, E. K. S. Cannon, C. M. Andorf, and C. G. Elsik. "Maizemine: A Data Mining Warehouse for the Maize Genetics and Genomics Database." Frontiers of Plant Science 11 (2020): 592730.
- Gladman, N.; Olson, A.; Wei, S.; Chougule, K.; Lu, Z.; Tello-Ruiz, M.; Meijs, I.; Van Buren, P.; Jiao, Y.; Wang, B.; et al. SorghumBase: a web-based portal for sorghum genetic information and community advancement. Planta 2022, 255, 1–8. [Google Scholar] [CrossRef] [PubMed]
- Arkin, A.P.; Cottingham, R.W.; Henry, C.S.; Harris, N.L.; Stevens, R.L.; Maslov, S.; Dehal, P.; Ware, D.; Perez, F.; Canon, S.; et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nat. Biotechnol. 2018, 36, 566–569. [Google Scholar] [CrossRef] [PubMed]
- Yates, A.D.; Allen, J.; Amode, R.M.; Azov, A.G.; Barba, M.; Becerra, A.; Bhai, J.; I Campbell, L.; Martinez, M.C.; Chakiachvili, M.; et al. Ensembl Genomes 2022: an expanding genome resource for non-vertebrates. Nucleic Acids Res. 2021, 50, D996–D1003. [Google Scholar] [CrossRef] [PubMed]
- Naithani, S.; Preece, J.; D'Eustachio, P.; Gupta, P.; Amarasinghe, V.; Dharmawardhana, P.D.; Wu, G.; Fabregat, A.; Elser, J.L.; Weiser, J.; et al. Plant Reactome: a resource for plant pathways and comparative analysis. Nucleic Acids Res. 2016, 45, D1029–D1039. [Google Scholar] [CrossRef]
- Tello-Ruiz, M.K.; Naithani, S.; Stein, J.C.; Gupta, P.; Campbell, M.; Olson, A.; Wei, S.; Preece, J.; Geniza, M.J.; Jiao, Y.; et al. Gramene 2018: unifying comparative genomics and pathway resources for plant research. Nucleic Acids Res. 2017, 46, D1181–D1189. [Google Scholar] [CrossRef]
- Naithani, S.; Raja, R.; Waddell, E.N.; Elser, J.; Gouthu, S.; Deluc, L.G.; Jaiswal, P. VitisCyc: a metabolic pathway knowledgebase for grapevine (Vitis vinifera). Front. Plant Sci. 2014, 5. [Google Scholar] [CrossRef]
- Naithani, S.; Partipilo, C.M.; Raja, R.; Elser, J.L.; Jaiswal, P. FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca. Front. Plant Sci. 2016, 7, 242. [Google Scholar] [CrossRef]
- Woodhouse, M. R. K. Cannon, J. L. Portwood, 2nd, L. C. Harper, J. M. Gardiner, M. L. Schaeffer, and C. M. Andorf. "A Pan-Genomic Approach to Genome Databases Using Maize as a Model System." BMC Plant Biology 21, no. 1 (2021): 385.
- Kanehisa, M.; Furumichi, M.; Sato, Y.; Kawashima, M.; Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2022, 51, D587–D592. [Google Scholar] [CrossRef]
- Paley, S.; Karp, P.D. The BioCyc Metabolic Network Explorer. BMC Bioinform. 2021, 22, 1–6. [Google Scholar] [CrossRef] [PubMed]
- Naithani, S.; Jaiswal, P. Pathway Analysis and Omics Data Visualization Using Pathway Genome Databases: FragariaCyc, a Case Study. 1533. [CrossRef]
- Hawkins, C.; Ginzburg, D.; Zhao, K.; Dwyer, W.; Xue, B.; Xu, A.; Rice, S.; Cole, B.; Paley, S.; Karp, P.; et al. Plant Metabolic Network 15: A resource of genome-wide metabolism databases for 126 plants and algae. J. Integr. Plant Biol. 2021, 63, 1888–1905. [Google Scholar] [CrossRef] [PubMed]
- Foerster, H., A. Bombarely, J. N. D. Battey, N. Sierro, N. V. Ivanov, and L. A. Mueller. "Solcyc: A Database Hub at the Sol Genomics Network (Sgn) for the Manual Curation of Metabolic Networks in Solanum and Nicotiana Specific Databases." Database (Oxford) 2018 (2018).
- Goodstein, D.M.; Shu, S.; Howson, R.; Neupane, R.; Hayes, R.D.; Fazo, J.; Mitros, T.; Dirks, W.; Hellsten, U.; Putnam, N.; et al. Phytozome: A comparative platform for green plant genomics. Nucleic Acids Res. 2012, 40, D1178–D1186. [Google Scholar] [CrossRef] [PubMed]
- Deng, C.H., S. Naithani, S. Kumari, I. Cobo-Simon, E.H. Quezada-Rodriguez, M. Skrabisova, N. Gladman, M.J. Correll, A.B. Sikiru, O.O. Afuwape, A. Marrano, I. Rebollo, W. Zhang, and S. Jung. "Agricultural Sciences in the Big Data Era: Genotype and Phenotype Data Standardization, Utilization and Integration." Preprints (2023).
- Sun, C., Z. Hu, T. Zheng, K. Lu, Y. Zhao, W. Wang, J. Shi, C. Wang, J. Lu, D. Zhang, Z. Li, and C. Wei. "RPAN: Rice Pan-Genome Browser for Approximately 3000 Rice Genomes." Nucleic Acids Res 45, no. 2 (2017): 597-605.
- Zhao, Q.; Feng, Q.; Lu, H.; Li, Y.; Wang, A.; Tian, Q.; Zhan, Q.; Lu, Y.; Zhang, L.; Huang, T.; et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nat. Genet. 2018, 50, 278–284. [Google Scholar] [CrossRef] [PubMed]
- Gui, S.; Yang, L.; Li, J.; Luo, J.; Xu, X.; Yuan, J.; Chen, L.; Li, W.; Yang, X.; Wu, S.; et al. ZEAMAP, a Comprehensive Database Adapted to the Maize Multi-Omics Era. iScience 2020, 23, 101241. [Google Scholar] [CrossRef]
- Guignon, V., A. Toure, G. Droc, J.-F. Dufayard, M. Conte, and M. Rouard. "GreenPhylDB v5: A Comparative Pangenomic Database for Plant Genomes." Nucleic Acids Res 49, no. D1 (2021): D1464-D71.
- Bayer, P.E.; Petereit, J.; Durant, É.; Monat, C.; Rouard, M.; Hu, H.; Chapman, B.; Li, C.; Cheng, S.; Batley, J.; et al. Wheat Panache: A pangenome graph database representing presence–absence variation across sixteen bread wheat genomes. Plant Genome 2022, 15, e20221. [Google Scholar] [CrossRef]
- Blake, V.C.; Woodhouse, M.R.; Lazo, G.R.; Odell, S.G.; Wight, C.P.; Tinker, N.A.; Wang, Y.; Gu, Y.Q.; Birkett, C.L.; Jannink, J.-L.; et al. GrainGenes: centralized small grain resources and digital platform for geneticists and breeders. Database 2019, 2019. [Google Scholar] [CrossRef]
- Montenegro, J.D.; Golicz, A.A.; Bayer, P.E.; Hurgobin, B.; Lee, H.; Chan, C.-K.K.; Visendi, P.; Lai, K.; Doležel, J.; Batley, J.; et al. The pangenome of hexaploid bread wheat. Plant J. 2017, 90, 1007–1013. [Google Scholar] [CrossRef]
- Li, N.; He, Q.; Wang, J.; Wang, B.; Zhao, J.; Huang, S.; Yang, T.; Tang, Y.; Yang, S.; Aisimutuola, P.; et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 2023, 55, 852–860. [Google Scholar] [CrossRef]
- Barchi, L.; Rabanus-Wallace, M.T.; Prohens, J.; Toppino, L.; Padmarasu, S.; Portis, E.; Rotino, G.L.; Stein, N.; Lanteri, S.; Giuliano, G. Improved genome assembly and pan-genome provide key insights into eggplant domestication and breeding. Plant J. 2021, 107, 579–596. [Google Scholar] [CrossRef]
- Ou, L.; Li, D.; Lv, J.; Chen, W.; Zhang, Z.; Li, X.; Yang, B.; Zhou, S.; Yang, S.; Li, W.; et al. Pan-genome of cultivated pepper (Capsicum) and its use in gene presence-absence variation analyses. New Phytol. 2018, 220, 360–363. [Google Scholar] [CrossRef] [PubMed]
- Zhang, B.; Huang, H.; Tibbs-Cortes, L.E.; Vanous, A.; Zhang, Z.; Sanguinet, K.; Garland-Campbell, K.A.; Yu, J.; Li, X. Streamline unsupervised machine learning to survey and graph indel-based haplotypes from pan-genomes. Mol. Plant 2023, 16, 975–978. [Google Scholar] [CrossRef] [PubMed]
- Torkamaneh, D.; Lemay, M.; Belzile, F. The pan-genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content. Plant Biotechnol. J. 2021, 19, 1852–1862. [Google Scholar] [CrossRef]
- Hübner, S.; Bercovich, N.; Todesco, M.; Mandel, J.R.; Odenheimer, J.; Ziegler, E.; Lee, J.S.; Baute, G.J.; Owens, G.L.; Grassa, C.J.; et al. Sunflower pan-genome analysis shows that hybridization altered gene content and disease resistance. Nat. Plants 2018, 5, 54–62. [Google Scholar] [CrossRef] [PubMed]
- Jin, S.; Han, Z.; Hu, Y.; Si, Z.; Dai, F.; He, L.; Cheng, Y.; Li, Y.; Zhao, T.; Fang, L.; et al. Structural variation (SV)-based pan-genome and GWAS reveal the impacts of SVs on the speciation and diversification of allotetraploid cottons. Mol. Plant 2023, 16, 678–693. [Google Scholar] [CrossRef]
- Liu, H.; Wang, X.; Liu, S.; Huang, Y.; Guo, Y.-X.; Xie, W.-Z.; Liu, H.; Qamar, M.T.U.; Xu, Q.; Chen, L.-L. Citrus Pan-Genome to Breeding Database (CPBD): A comprehensive genome database for citrus breeding. Mol. Plant 2022, 15, 1503–1505. [Google Scholar] [CrossRef]
- Li, Q.; Qi, J.; Qin, X.; Dou, W.; Lei, T.; Hu, A.; Jia, R.; Jiang, G.; Zou, X.; Long, Q.; et al. CitGVD: a comprehensive database of citrus genomic variations. Hortic. Res. 2020, 7, 1–8. [Google Scholar] [CrossRef]
- Sun, X.; Jiao, C.; Schwaninger, H.; Chao, C.T.; Ma, Y.; Duan, N.; Khan, A.; Ban, S.; Xu, K.; Cheng, L.; et al. Phased diploid genome assemblies and pan-genomes provide insights into the genetic history of apple domestication. Nat. Genet. 2020, 52, 1423–1432. [Google Scholar] [CrossRef]
- Song, J.M.; Liu, D.X.; Xie, W.; Yang, Z.; Guo, L.; Liu, K.; Yang, Q.; Chen, L. BnPIR: Brassica napus pan-genome information resource for 1689 accessions. Plant Biotechnol. J. 2021, 19, 412–414. [Google Scholar] [CrossRef]
- Qi, W.; Lim, Y.-W.; Patrignani, A.; Schläpfer, P.; Bratus-Neuenschwander, A.; Grüter, S.; Chanez, C.; Rodde, N.; Prat, E.; Vautrin, S.; et al. The haplotype-resolved chromosome pairs of a heterozygous diploid African cassava cultivar reveal novel pan-genome and allele-specific transcriptome features. GigaScience 2022, 11. [Google Scholar] [CrossRef]
- Ruperao, P.; Thirunavukkarasu, N.; Gandham, P.; Selvanayagam, S.; Govindaraj, M.; Nebie, B.; Manyasa, E.; Gupta, R.; Das, R.R.; Odeny, D.A.; et al. Sorghum Pan-Genome Explores the Functional Utility for Genomic-Assisted Breeding to Accelerate the Genetic Gain. Front. Plant Sci. 2021, 12. [Google Scholar] [CrossRef]
- Varshney, R.K.; Roorkiwal, M.; Sun, S.; Bajaj, P.; Chitikineni, A.; Thudi, M.; Singh, N.P.; Du, X.; Upadhyaya, H.D.; Khan, A.W.; et al. A chickpea genetic variation map based on the sequencing of 3,366 genomes. Nature 2021, 599, 622–627. [Google Scholar] [CrossRef]
- Zhao, J.; Bayer, P.E.; Ruperao, P.; Saxena, R.K.; Khan, A.W.; Golicz, A.A.; Nguyen, H.T.; Batley, J.; Edwards, D.; Varshney, R.K. Trait associations in the pangenome of pigeon pea (Cajanus cajan). Plant Biotechnol. J. 2020, 18, 1946–1954. [Google Scholar] [CrossRef] [PubMed]
- Yu, J.; Golicz, A.A.; Lu, K.; Dossa, K.; Zhang, Y.; Chen, J.; Wang, L.; You, J.; Fan, D.; Edwards, D.; et al. Insight into the evolution and functional characteristics of the pan-genome assembly from sesame landraces and modern cultivars. Plant Biotechnol. J. 2018, 17, 881–892. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Yuan, D.; Wang, P.; Wang, Q.; Sun, M.; Liu, Z.; Si, H.; Xu, Z.; Ma, Y.; Zhang, B.; et al. Cotton pan-genome retrieves the lost sequences and genes during domestication and selection. Genome Biol. 2021, 22, 1–26. [Google Scholar] [CrossRef]
- Sun, Y.; Wang, J.; Li, Y.; Jiang, B.; Wang, X.; Xu, W.-H.; Wang, Y.-Q.; Zhang, P.-T.; Zhang, Y.-J.; Kong, X.-D. Pan-Genome Analysis Reveals the Abundant Gene Presence/Absence Variations Among Different Varieties of Melon and Their Influence on Traits. Front. Plant Sci. 2022, 13, 835496. [Google Scholar] [CrossRef]
- Li, H.; Wang, S.; Chai, S.; Yang, Z.; Zhang, Q.; Xin, H.; Xu, Y.; Lin, S.; Chen, X.; Yao, Z.; et al. Graph-based pan-genome reveals structural and sequence variations related to agronomic traits and domestication in cucumber. Nat. Commun. 2022, 13, 1–14. [Google Scholar] [CrossRef]
- Qiao, Q., P. P. Edger, L. Xue, L. Qiong, J. Lu, Y. Zhang, Q. Cao, A. E. Yocca, A. E. Platts, S. J. Knapp, M. Van Montagu, Y. Van de Peer, J. Lei, and T. Zhang. "Evolutionary History and Pan-Genome Dynamics of Strawberry (Fragaria spp.)." Proceedings of the National Academy of Sciences of the United States of America 118, no. 45 (2021).
- Wang, H.; Tu, R.; Ruan, Z.; Chen, C.; Peng, Z.; Zhou, X.; Sun, L.; Hong, Y.; Chen, D.; Liu, Q.; et al. Photoperiod and gravistimulation-associated Tiller Angle Control 1 modulates dynamic changes in rice plant architecture. Theor. Appl. Genet. 2023, 136, 1–19. [Google Scholar] [CrossRef] [PubMed]
- Yu, B.; Lin, Z.; Li, H.; Li, X.; Li, J.; Wang, Y.; Zhang, X.; Zhu, Z.; Zhai, W.; Wang, X.; et al. TAC1, a major quantitative trait locus controlling tiller angle in rice. Plant J. 2007, 52, 891–898. [Google Scholar] [CrossRef]
- Boukail, S.; Macharia, M.; Miculan, M.; Masoni, A.; Calamai, A.; Palchetti, E.; Dell’Acqua, M. Genome wide association study of agronomic and seed traits in a world collection of proso millet (Panicum miliaceum L.). BMC Plant Biol. 2021, 21, 1–12. [Google Scholar] [CrossRef]
- Liu, C.; Wang, Y.; Peng, J.; Fan, B.; Xu, D.; Wu, J.; Cao, Z.; Gao, Y.; Wang, X.; Li, S.; et al. High-quality genome assembly and pan-genome studies facilitate genetic discovery in mung bean and its improvement. Plant Commun. 2022, 3, 100352. [Google Scholar] [CrossRef]
- D’Hont, A.; Denoeud, F.; Aury, J.M.; Baurens, F.-C.; Carreel, F.; Garsmeur, O.; Noel, B.; Bocs, S.; Droc, G.; Rouard, M.; et al. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012, 488, 213–217. [Google Scholar] [CrossRef] [PubMed]
- Fernie, A.R.; Aharoni, A. Pan-Genomic Illumination of Tomato Identifies Novel Gene–Trait Interactions. Trends Plant Sci. 2019, 24, 882–884. [Google Scholar] [CrossRef] [PubMed]
- Huff, M.; Hulse-Kemp, A.M.; Scheffler, B.E.; Youngblood, R.C.; Simpson, S.A.; Babiker, E.; Staton, M. Long-read, chromosome-scale assembly of Vitis rotundifolia cv. Carlos and its unique resistance to Xylella fastidiosa subsp. fastidiosa. BMC Genom. 2023, 24, 1–17. [Google Scholar] [CrossRef] [PubMed]
- Oren, E., A. Dafna, G. Tzuri, I. Halperin, T. Isaacson, M. Elkabetz, A. Meir, U. Saar, S. Ohali, T. La, C. Romay, Y. Tadmor, A. A. Schaffer, E. S. Buckler, R. Cohen, J. Burger, and A. Gur. "Pan-Genome and Multi-Parental Framework for High-Resolution Trait Dissection in Melon (Cucumis Melo)." Plant Journal 112, no. 6 (2022): 1525-42.
- Hasan, N.; Choudhary, S.; Naaz, N.; Sharma, N.; Laskar, R.A. Recent advancements in molecular marker-assisted selection and applications in plant breeding programmes. J. Genet. Eng. Biotechnol. 2021, 19, 1–26. [Google Scholar] [CrossRef]
- Garrido-Cardenas, J.A.; Mesa-Valle, C.; Manzano-Agugliaro, F. Trends in plant research using molecular markers. Planta 2017, 247, 543–557. [Google Scholar] [CrossRef]
- Moncada, P.; McCouch, S. Simple sequence repeat diversity in diploid and tetraploid Coffea species. Genome 2004, 47, 501–509. [Google Scholar] [CrossRef]
- McCouch, S.R.; Chen, X.; Panaud, O.; Temnykh, S.; Xu, Y.; Cho, Y.G.; Huang, N.; Ishii, T.; Blair, M. Microsatellite marker development, mapping and applications in rice genetics and breeding. Plant Mol. Biol. 1997, 35, 89–99. [Google Scholar] [CrossRef]
- Tanksley, S.D.; McCouch, S.R. Seed Banks and Molecular Maps: Unlocking Genetic Potential from the Wild. Science 1997, 277, 1063–1066. [Google Scholar] [CrossRef]
- Morales, K.Y.; Singh, N.; Perez, F.A.; Ignacio, J.C.; Thapa, R.; Arbelaez, J.D.; Tabien, R.E.; Famoso, A.; Wang, D.R.; Septiningsih, E.M.; et al. An improved 7K SNP array, the C7AIR, provides a wealth of validated SNP markers for rice breeding and genetics studies. PLOS ONE 2020, 15, e0232479. [Google Scholar] [CrossRef]
- Miller, J.R.; Zhou, P.; Mudge, J.; Gurtowski, J.; Lee, H.; Ramaraj, T.; Walenz, B.P.; Liu, J.; Stupar, R.M.; Denny, R.; et al. Hybrid assembly with long and short reads improves discovery of gene family expansions. BMC Genom. 2017, 18, 1–12. [Google Scholar] [CrossRef] [PubMed]
- Cheng, C.; Fei, Z.; Xiao, P. Methods to improve the accuracy of next-generation sequencing. Front. Bioeng. Biotechnol. 2023, 11, 982111. [Google Scholar] [CrossRef] [PubMed]
- Myburg, A.A.; Grattapaglia, D.; Tuskan, G.A.; Hellsten, U.; Hayes, R.D.; Grimwood, J.; Jenkins, J.; Lindquist, E.; Tice, H.; Bauer, D.; et al. The genome of Eucalyptus grandis. Nature 2014, 510, 356–362. [Google Scholar] [CrossRef] [PubMed]
- Shulaev, V.; Sargent, D.J.; Crowhurst, R.N.; Mockler, T.C.; Folkerts, O.; Delcher, A.L.; Jaiswal, P.; Mockaitis, K.; Liston, A.; Mane, S.P.; et al. The genome of woodland strawberry (Fragaria vesca). Nat. Genet. 2010, 43, 109–116. [Google Scholar] [CrossRef]
- Wu, S.; Sun, H.; Gao, L.; Branham, S.; McGregor, C.; Renner, S.S.; Xu, Y.; Kousik, C.; Wechter, W.P.; Levi, A.; et al. A Citrullus genus super-pangenome reveals extensive variations in wild and cultivated watermelons and sheds light on watermelon evolution and domestication. Plant Biotechnol. J. 2023, 21, 1926–1928. [Google Scholar] [CrossRef]
- Naithani, S.; Dikeman, D.; Garg, P.; Al-Bader, N.; Jaiswal, P. Beyond gene ontology (GO): using biocuration approach to improve the gene nomenclature and functional annotation of rice S-domain kinase subfamily. PeerJ 2021, 9, e11052. [Google Scholar] [CrossRef]
- Naithani, S.; Komath, S.S.; Nonomura, A.; Govindjee, G. Plant lectins and their many roles: Carbohydrate-binding and beyond. J. Plant Physiol. 2021, 266, 153531. [Google Scholar] [CrossRef]
- Monaco, M.K.; Sen, T.Z.; Dharmawardhana, P.D.; Ren, L.; Schaeffer, M.; Naithani, S.; Amarasinghe, V.; Thomason, J.; Harper, L.; Gardiner, J.; et al. Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 2013, 6. [Google Scholar] [CrossRef]
- Jaiswal, P., and B. Usadel. "Plant Pathway Databases." Methods in Molecular Biology 1374 (2016): 71-87.
- Naithani, S.; Nonogaki, H.; Jaiswal, P. Exploring Crossroads Between Seed Development and Stress Response. 2017. [Google Scholar] [CrossRef]
- The Gene Ontology Consortium; Aleksander, S.A.; Balhoff, J.; Carbon, S.; Cherry, J.M.; Drabkin, H.J.; Ebert, D.; Feuermann, M.; Gaudet, P.; Harris, N.L.; et al. The Gene Ontology knowledgebase in 2023. Genetics 2023, 224. [Google Scholar] [CrossRef]
- Cooper, L., and P. Jaiswal. "The Plant Ontology: A Tool for Plant Genomics." Methods in Molecular Biology 1374 (2016): 89-114.
- Walls, R.L.; Cooper, L.; Elser, J.; Gandolfo, M.A.; Mungall, C.J.; Smith, B.; Stevenson, D.W.; Jaiswal, P. The Plant Ontology Facilitates Comparisons of Plant Development Stages Across Species. Front. Plant Sci. 2019, 10, 631. [Google Scholar] [CrossRef]
- Naithani, S.; Mohanty, B.; Elser, J.; D’Eustachio, P.; Jaiswal, P. Biocuration of a Transcription Factors Network Involved in Submergence Tolerance during Seed Germination and Coleoptile Elongation in Rice (Oryza sativa). Plants 2023, 12, 2146. [Google Scholar] [CrossRef] [PubMed]
- Naithani, S., P. Dharmawardhana, and J. B. Nasrallah. "SCR." In Handbook of Biologically Active Peptides, edited by Abba J. Kastin, 58-66: Academic Press, 2013.
- Bolger, M.; Schwacke, R.; Usadel, B. MapMan Visualization of RNA-Seq Data Using Mercator4 Functional Annotations. 2021, 2354, 195–212. [CrossRef]
- Naithani, S.; Gupta, P.; Preece, J.; Garg, P.; Fraser, V.; Padgitt-Cobb, L.K.; Martin, M.; Vining, K.; Jaiswal, P. Involving community in genes and pathway curation. Database 2019, 2019. [Google Scholar] [CrossRef] [PubMed]
- Gupta, P.; Geniza, M.; Naithani, S.; Phillips, J.L.; Haq, E.; Jaiswal, P. Chia (Salvia hispanica) Gene Expression Atlas Elucidates Dynamic Spatio-Temporal Changes Associated With Plant Growth and Development. Front. Plant Sci. 2021, 12. [Google Scholar] [CrossRef] [PubMed]
- Hendre, P.S.; Muthemba, S.; Kariba, R.; Muchugi, A.; Fu, Y.; Chang, Y.; Song, B.; Liu, H.; Liu, M.; Liao, X.; et al. African Orphan Crops Consortium (AOCC): status of developing genomic resources for African orphan crops. Planta 2019, 250, 989–1003. [Google Scholar] [CrossRef]
- Chang, Y.; Liu, H.; Liu, M.; Liao, X.; Sahu, S.K.; Fu, Y.; Song, B.; Cheng, S.; Kariba, R.; Muthemba, S.; et al. The draft genomes of five agriculturally important African orphan crops. GigaScience 2019, 8, 152. [Google Scholar] [CrossRef]





| Tool name and URL | Remarks and Citation |
|---|---|
| Genome assembly | |
| Hifiasm https://github.com/chhylp123/hifiasm | Constructs haplotype-resolved assemblies from accurate HiFi reads [60]. |
| Canu https://github.com/marbl/canu | Assembles genomes of any size from single-molecule sequences and provides a graphical fragment assembly that can be integrated with complementary phasing and scaffolding methods [66]. |
| Flye https://github.com/fenderglass/Flye | Assembles single-molecule, long-read sequencing data into genomes using repeat graphs [67]. |
| PAGIT https://www.sanger.ac.uk/tool/pagit | PAGIT is a package of tools for generating high-quality draft genome sequences by ordering contigs, closing gaps, correcting sequence errors, and transferring annotation. PAGIT is compiled for Linux/UNIX systems and is available as a virtual machine [68]. |
| MEGAHIT https://github.com/voutcn/megahit | Ultra-fast NGS assembler for metagenomes [69]. |
| SPAdes https://cab.spbu.ru/software/spades/ | A set of genome assembly and analysis tools for long and short reads [70,71]. |
| Pangenome graph construction, normalization, identification of structural variants, and visualization | |
| vgtools ('vg construct', 'vg call', 'vg giraffe', 'vg map', or 'vg mpmap') https://github.com/vgteam/vg | Toolset for eukaryotic pangenome graph construction, read mapping, variant calling, and graph visualization [58]. |
| Minigraph https://github.com/lh3/minigraph | Tool for graph construction, mapping, and variant calling [72]. |
| PGGB https://github.com/pangenome/pggb | Pan-genome graph construction, normalization, and visualization, using ODGI as the backbone [73]. |
| MGRgraph https://github.com/LeilyR/Multi-genome-Reference | An algorithm to build a multi-genome reference. |
| Cactus https://github.com/ComparativeGenomicsToolkit/cactus | A reference-free multiple genome alignment program that can use its progressive mode to build pan-genomes across different species [74,75]. |
| PanTools https://pantools.readthedocs.io/en/latest/user_guide/install.html | A platform for pan-genome graph construction, read mapping, phylogeny analysis, pan-graph query, and pan-gene annotation [76]. |
| Smoothxg https://github.com/pangenome/smoothxg | Local reconstruction of variation graphs. |
| ODGI https://github.com/pangenome/odgi | Optimized Dynamic Genome/Graph Implementation (ODGI) is a tool suite that represents graphs, including structurally complex regions, with minimal memory overhead [77]. It is a pangenome toolbox with more than 30 tools to transform, analyze, simplify, validate, annotate, and visualize pangenome graphs. |
| nf-core/pangenome https://github.com/nf-core/pangenome | Nextflow pipeline for all-vs-all alignment, pangenome graph construction, normalization, redundancy removal, and visualization (through ODGI) [78]. |
| SeqWish https://github.com/ekg/seqwish | Builds a variation graph from pairwise alignments [73]. |
| PanPipes https://github.com/USDA-ARS-GBRU/PanPipes | End-to-end pan-genomic graph construction and genetic analysis pipeline [79]. |
| PanGene https://github.com/lh3/pangene | Used for ortholog and paralog analysis and for building pan-gene graphs. |
| Minimap2 https://github.com/lh3/minimap2 | This alignment program can be used for mapping DNA or long mRNA sequences to a reference [80]. |
| NGMLR https://github.com/philres/ngmlr | This program aligns PacBio long reads to genomes and supports the detection of complex structural variations [5]. |
| MUMmer4 https://mummer.sourceforge.net/ https://github.com/mummer4/mummer | This is a genome-to-genome aligner [81]. |
| GraphAligner https://github.com/maickrau/GraphAligner | A tool for aligning long reads to genome graphs [82]. |
| V-ALIGN https://github.com/tcsatc/V-ALIGN | V-ALIGN allows gapped sequence alignment directly on the input graph and supports affine and linear gaps [83]. |
| PaSGAL https://github.com/ParBLiSS/PaSGAL | Parallel Sequence to Graph Aligner (PaSGAL) facilitates local alignment of sequences to variation graphs, splicing graphs, etc. |
| GED-MAP https://github.com/thomas-buechler-ulm/gedmap | Prototype for efficient short-read mapping to pangenome graphs [84]. |
| DeepVariant https://github.com/google/deepvariant | A universal SNP and small-indel variant caller [85,86]. |
| SpeedSeq https://github.com/hall-lab/speedseq | A platform for alignment, variant calling, and functional annotation [87]. |
| graphTyper https://github.com/DecodeGenetics/graphtyper | This is a graph-based variant caller. It realigns short-read sequence data to a pangenome and is used for discovering and genotyping sequence variants [88]. |
| PanGenie https://github.com/eblerjana/pangenie | K-mer-based genotyper that uses pangenome graphs and short-read sequencing data to genotype a broad spectrum of genetic variation, including structural variants [89]. |
| VEP https://ensembl.gramene.org/tools.html | Variant Effect Predictor (VEP) helps in analyzing the consequences of sequence variations [90]. |
| OrthoFinder https://github.com/davidemms/OrthoFinder | This is an ortholog inference method [91]. |
| OrthoMCL https://orthomcl.org/orthomcl/app | A method for identification of ortholog groups for eukaryotic genomes [92]. |
| JustOrthologs https://github.com/ridgelab/JustOrthologs/ | JustOrthologs is a fast ortholog identification algorithm that uses conservation of gene structure [93]. |
| PhyloMCL https://sourceforge.net/projects/phylomcl/files/Materials/ | PhyloMCL provides accurate clustering of hierarchical orthogroups guided by phylogenetic relationship and inference of polyploidy events [94]. |
| OMA https://github.com/DessimozLab/OmaStandalone/tree/v2.4.0 https://omabrowser.org/oma/home/ | Orthologous Matrix (OMA) is a method and database for identifying orthologs from publicly available and custom genomes [95]. |
| InParanoid-Diamond https://bitbucket.org/sonnhammergroup/inparanoid/src | InParanoid algorithm for orthology analysis and gene family identification [96]. This is used for orthology projection of plant gene families in the Plant Reactome (https://plantreactome.gramene.org) [97]. |
| Panache https://github.com/SouthGreenPlatform/panache | This is a web browser-based viewer for linearized pangenomes [98]. It is being used to visualize pan-genome data in several plant portals, e.g., the banana genome hub [99]. |
| MoMI-G https://github.com/MoMI-G/MoMI-G/ | Genome graph browser for SV visualization. Users can filter and visualize annotations and inspect SVs with read alignments over the genome graph [100]. |
| panGraphViewer https://github.com/TF-Chan-Lab/panGraphViewer | panGraphViewer was developed in Python 3 for pangenome graph visualization and runs on all major operating systems. |
| Bandage https://rrwick.github.io/Bandage/ | This is an interactive tool for visualizing de novo genome assemblies [101]. |
| Bandage-NG https://github.com/asl/BandageNG | GUI program for interacting with assembly graphs, based on OGDF (the Open Graph Drawing Framework, also expanded as Open Graph algorithms and Data structures Framework). |
| sequenceTubeMap https://github.com/vgteam/sequenceTubeMap | Interactive visualization of genomes [102]. |
| GfaViz https://github.com/ggonnella/gfaviz | Interactive visualization of graphical fragment assembly (GFA) genome graphs [103]. |
| AGB https://github.com/almiheenko/AGB | Assembly Graph Browser (AGB) is an interactive tool for constructing and visualizing large assembly graphs and for repeat sequence analysis [104]. |
| IGGE https://github.com/immersivegraphgenomeexplorer/IGGE | An interactive graph genomes browser. |
| GFAViewer https://lh3.github.io/gfatools/ | Part of the GFA server for online visualization of GFA files. |
| SGTK https://github.com/olga24912/SGTK | Scaffold graph toolkit for construction and interactive visualization of scaffold graphs using sequencing data [105]. |
| Maffer https://github.com/pangenome/maffer | Converts sorted graphs to multiple alignment format (MAF). |
| Gfatools https://github.com/lh3/gfatools | A set of tools to parse, subgraph, and convert GFA or rGFA format to FASTA/BED format. |
| Pgge https://github.com/pangenome/pgge | This is a pan-genome graph evaluator. |
| WGT https://github.com/Kuanhao-Chao/Wheeler_Graph_Toolkit | Tools and algorithms for recognizing, visualizing, and generating Wheeler graphs. |
| GBWT https://github.com/jltsiren/gbwt | A tool used for haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT) approach [106,107]. |
| Spodgi https://github.com/pangenome/spodgi | Converts an ODGI genome graph file to a SPARQL database. |
| GraphPeakCaller https://github.com/uio-bmi/graph_peak_caller | A tool for calling transcription factor peaks on graph-based reference genomes using ChIP-seq data [108]. |
| PSVCP https://github.com/wjian8/psvcp_v1.01 | A pan-genome analysis pipeline (PSVCP) that constructs a pan-genome, calls structural variants, and runs population genotyping; it was used for the rice pan-genome [109]. |
| ppsPCP http://cbi.hzau.edu.cn/ppsPCP/ | It is designed specifically for constructing fully annotated plant pan-genomes. It scans for presence/absence variants [110]. |
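
Several of the tools listed above (for example Gfatools, GfaViz, and Bandage) read or write graphs in the GFA format. As a rough illustration of what such a file contains, the following Python sketch collects the segment (S) and link (L) records of a GFA version 1 text and reports a few summary counts. It is a minimal sketch, not part of any listed tool: the function names `parse_gfa` and `summarize`, the toy graph `TOY_GFA`, and the "fan-out" summary are assumptions made for this example, and the code ignores paths, walks, optional tags, and rGFA-specific fields.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (from_segment, from_orientation, to_segment, to_orientation)
Link = Tuple[str, str, str, str]

# Hypothetical toy graph: a single bubble (s1 -> s2 -> s4 and s1 -> s3 -> s4).
TOY_GFA = "\n".join([
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tG",
    "S\ts3\tT",
    "S\ts4\tCCA",
    "L\ts1\t+\ts2\t+\t0M",
    "L\ts1\t+\ts3\t+\t0M",
    "L\ts2\t+\ts4\t+\t0M",
    "L\ts3\t+\ts4\t+\t0M",
])


def parse_gfa(text: str) -> Tuple[Dict[str, str], List[Link]]:
    """Collect S (segment) and L (link) records from GFA1 text."""
    segments: Dict[str, str] = {}
    links: List[Link] = []
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            # S <name> <sequence> [optional tags]
            segments[fields[1]] = fields[2]
        elif fields[0] == "L" and len(fields) >= 5:
            # L <from> <from_orient> <to> <to_orient> [overlap]
            links.append((fields[1], fields[2], fields[3], fields[4]))
        # Header, path, walk, and comment lines are skipped in this sketch.
    return segments, links


def summarize(segments: Dict[str, str], links: List[Link]) -> dict:
    """Return simple counts; '*' marks a sequence that is not stored."""
    fan_out = Counter(src for src, _, _, _ in links)
    return {
        "n_segments": len(segments),
        "n_links": len(links),
        "total_bp": sum(len(seq) for seq in segments.values() if seq != "*"),
        # Naive: counts links by their source segment and ignores orientation.
        "fan_out_segments": sorted(s for s, n in fan_out.items() if n > 1),
    }


if __name__ == "__main__":
    print(summarize(*parse_gfa(TOY_GFA)))
```

For real pangenome graphs, which can contain millions of segments, the dedicated tools in the table are the appropriate choice; a sketch like this is only useful for inspecting small files or checking a format assumption.
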

| Pan-genome resource | Remarks |
|---|---|
| Gramene Link: https://www.gramene.org/pansites Species: maize, rice, grapevine, and sorghum | Gramene hosts 128 reference plant genomes [115] and pan-genome sites for four crops: maize, rice, grapevine, and sorghum. |
| SorghumBase Link: https://www.sorghumbase.org Species: sorghum | The SorghumBase portal hosts a sorghum pan-genome browser consisting of five sorghum reference genome assemblies and genetic variant information for natural diversity panels and ethyl methanesulfonate (EMS)-induced mutant populations [118]. |
| RPAN Link: https://cgm.sjtu.edu.cn/3kricedb In addition to RPAN, the data and analyzed outputs from 3K RGP are available at the following websites: http://snp-seek.irri.org/ http://www.rmbreeding.cn/index.php http://www.ricecloud.org https://aws.amazon.com/public-data-sets/3000-rice-genome Species: rice (O. sativa) and its wild relatives. | The Rice Pan-genome Browser (RPAN) hosts a rice pan-genome browser and genomic variation data from 3,010 diverse rice accessions generated by the 3000 Rice Genome Project (3K RGP) [8,9,133,134]. It contains the ~370 Mbp IRGSP genome and ~260 Mbp of novel sequences, comprising 50,995 genes (23,914 core genes). RPAN provides a phylogenetic tree browser to view the phylogeny of 3K rice accessions and a genome browser to view gene annotation and presence-absence variations (PAVs). Users can access pan-gene views and associated genetic variations. |
| RiceSuperPIRdb Link: http://www.ricesuperpir.com Species: 251 genomes representing domesticated rice accessions and wild relatives (202 O. sativa, 28 O. rufipogon, 11 O. glaberrima, and 10 O. barthii accessions). | The Rice Super Pan-genome Information Resource Database (RiceSuperPIRdb) provides an interactive web-based browser for the rice super pan-genome. It was built using reference-free high-quality whole genome alignment of 251 independent genome assemblies. Genome annotations (including annotations for transposable elements) and node-specific k-mer spectrum pangenome graphs are available for each assembly. In addition, genetic variation graphs support linking query data. This super pan-genome also facilitates the identification of lineage-specific haplotypes for trait-associated genes [21]. |
| PanOryza Link: https://panoryza.org Species: It hosts the data for Magic-16 rice accessions; see https://panoryza.cgrb.oregonstate.edu/node/5 | PanOryza provides consistent rice gene annotation across rice varieties and a rice pan-genome browser supported by the JBrowse genome browser. |
| MaizeGDB Link: https://nam-genomes.org Species: maize | MaizeGDB hosts 48 maize genomes, including 26 high-quality PacBio genome assemblies of the Nested Association Mapping (NAM) population founder lines and their associated datasets. It allows users to connect genomes, gene models, expression, methylome, sequence variation, structural variation, transposable elements, and diversity data across the maize pan-genome framework, supported by the JBrowse genome browser [125]. |
| ZEAMAP Link: www.zeamap.com Species: maize | ZEAMAP is a comprehensive database incorporating multiple reference genomes, annotations, comparative genomics, transcriptomes, open chromatin regions, chromatin interactions, high-quality genetic variants, phenotypes, metabolomics, genetic maps, genetic mapping loci, population structures, and populational DNA methylation signals within maize inbred lines [135]. |
| GreenPhylDB Link: https://www.greenphyl.org/cgi-bin/index.cgi Species: 46 plant species. 19 pangenomes, including rice, maize, banana, grape, and cacao. In addition, it hosts 27 reference genomes. | GreenPhylDB is part of the South Green Bioinformatics platform (https://www.southgreen.fr) [136]. It aids exploration of gene families and homologous relationships among plant genomes. |
| The Wheat Panache web portal Link: http://www.appliedbioinformatics.com.au/wheat_panache Species: wheat | This wheat pangenome graph visualization is supported by the Panache tool. It allows users to explore SVs across the chosen wheat accessions [137]. |
| GrainGenes Link: https://wheat.pw.usda.gov/GG3/pangenome Species: wheat, barley, rye, oat | GrainGenes hosts molecular and phenotypic information for wheat, barley, rye, oat, etc., including several genome assemblies, genome browsers, and a T. aestivum (bread wheat) pangenome [138]. |
| Wheat-Pangenome Link: http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/WheatPan/ Species: bread wheat (Triticum aestivum) | Wheat-Pangenome facilitates comparison of an improved reference for the Chinese Spring wheat genome with 18 wheat cultivars [139]. |
| SGN Links: https://solgenomics.net Subsites: http://solomics.agis.org.cn/tomato https://solgenomics.net/projects/tgg https://solgenomics.net/organism/Solanum_lycopersicum/genome/ https://solgenomics.net/organism/Solanum_melongena/genome Species: tomato, potato, pepper, petunia, and eggplant. | The Sol Genomics Network (SGN) database hosts genomic information about the nightshade family. Currently, it has pan-genome data for tomato and eggplant. The International Tomato Genome Sequencing Project produced the tomato pan-genome data, which provide a collection of tomato reference genome assemblies from 46 accessions (22 Solanum lycopersicum, 13 Solanum lycopersicum var. cerasiforme, and 11 Solanum pimpinellifolium) that can be viewed at http://solomics.agis.org.cn/tomato/tool/jbrowse_nav and downloaded from http://solomics.agis.org.cn/tomato/ftp/ [140]. For details about the eggplant pan-genome and pan-plastome data, see Barchi et al., 2021 [141]. |
| PepperPan Link: http://www.pepperpan.org:8012/ Species: Capsicum annuum (pepper) and its wild relatives | The pepper pan-genome was constructed by mapping the sequences of 383 cultivars to the Zunla-1 genome as the reference [142]. The novel contig sequences (accession number GWHAAAT00000000) are available at http://bigd.big.ac.cn/gwh. |
| BRIDGEcereal Link: https://bridgecereal.scinet.usda.gov Species: wheat, maize, barley, sorghum, and rice. | The Blastn Recovered Insertion and Deletion near Gene Explorer (BRIDGEcereal) is a web application for mining pangenomes of cereals. It facilitates the identification of potential indels (insertions or deletions) for genes of interest using publicly accessible pangenomes of five major cereal crops: wheat, maize, barley, sorghum, and rice [143]. |
| PanSoy Link: https://www.soybase.org/projects/SoyBase.C2021.01.php Species: Glycine soja (wild soybean) and Glycine max (soybean) |
From GmHapMap collection 204 phylogenetically and geographically distinct soybean accessions were chosen for construction of soybean pan-genome using de novo genome assembly method [144]. |
| Sunflower Genome Database Link: https://www.sunflowergenome.org Species: Helianthus annuus (sunflower) |
Sunflower pan-genome was generated using sequence from 287 cultivated lines, 17 Native American landraces, and 189 wild accessions representing 11 compatible wild species. Raw data used for pan-genome construction is available at NCBI and SNPs data is available at Sunflower Genome Database [145]. |
| COTTONOMICS Link: http://cotton.zju.edu.cn Species: Cotton |
COTTONOMICS provides genome-wide, gene-scale structural variations that were detected from 11 assembled allopolyploid cotton genomes and are linked to important agronomic traits [146]. |
| BGH Link: https://banana-genome-hub.southgreen.fr Species: Musa, Ensete, and genomics data of 15 Musaceae species |
The Banana Genome Hub (BGH), a web-based platform, supports users in exploring genes and gene families, gene expression patterns, and associated SNP markers. Users can also view chromosome structures, synteny, presence/absence variation, and genome ancestry mosaics [99]. |
| CPBD Links: http://citrus.hzau.edu.cn/ Species: sweet orange (Citrus sinensis), mandarin (Citrus reticulata), pummelo (Citrus grandis), grapefruit (Citrus paradisi), and lemon (Citrus limon). |
The Citrus Pan-genome to Breeding Database (CPBD) was built using 23 genomes of 17 citrus species and has genetic variation data from 167 citrus accessions mapped to two reference genomes [147]. |
| CitGVD Link: http://citgvd.cric.cn/home/index Species: citrus accessions |
The Citrus Genomic Variation Database (CitGVD) hosts citrus genomic data, genetic variation data, and built-in analysis tools. It contains 1,493,258,964 non-redundant SNPs and INDELs, and 84 phenotypes from 346 citrus individuals. Users can browse and search annotated genetic variations and view the results graphically in a genome browser or as tabular output. The portal supports comparative genomics between two or more citrus accessions [148]. |
| Apple pan-genome Link: http://bioinfo.bti.cornell.edu/apple_genome Species: Apple (Malus domestica) and its wild progenitors M. sieversii, and M. sylvestris |
The apple pan-genome was constructed using phased diploid genome assemblies of Malus domestica cv. Gala, M. sieversii, and M. sylvestris, together with 91 sequenced genomes of additional accessions [149]. |
| BnPIR Link: http://cbi.hzau.edu.cn/bnapus Species: Brassica oleracea, Brassica macrocarpa (cultivated and wild cabbage), Brassica napus |
The Brassica napus pan-genome information resource (BnPIR) hosts eight high-quality B. napus reference genomes generated using PacBio sequencing and re-sequencing data from 1688 rapeseed accessions. This pangenome resource provides a pan-gene module, a pan-genome browser, and synteny data. It also contains multi-omics data and common bioinformatics tools [150]. |
| Cassava Pan-genome Link: https://cassavabase.org/ Species: cassava (Manihot esculenta) |
Two high-quality, chromosome-scale haploid genome assemblies for the African cassava cultivar TME204 (resistant to cassava mosaic diseases caused by African cassava mosaic viruses) were generated using both short-read and long-read sequencing methods (Illumina PE reads, PacBio CLRs, and HiFi reads) [151]. |
| Other public pan-genome data available (not yet included in crop databases or supported by Genome Browser and associated tools) | |
| Pearl millet Link: http://117.78.45.2:91/download/ Species: Pearl millet |
The pearl millet pan-genome was constructed using whole-genome assemblies of 11 accessions generated from a combination of PacBio long-read sequences, Bionano optical mapping data, Hi-C data, and Illumina short-read sequence data [55]. |
| Sorghum pan-genome Link: The bulk data is available at http://dataverse.icrisat.org/dataset.xhtml?persistentId=doi:10.21421/D2/RIO2QM. Species: sorghum |
This pan-genome was assembled by iterative mapping of whole-genome sequence data from 176 sorghum accessions to the sorghum reference assembly v3.0.1 from Phytozome [152]. The sorghum pan-genome consists of 210,805 sequences: the sequences from the sorghum reference genome assembly v3.1.0 plus 209,935 contig sequences assembled from the 176 sorghum accessions. In total, 35,719 genes are represented in these data (34,211 of them from the reference). |
| Barley pan-genome Link: https://bitbucket.org/ipk_dg_public/barley_pangenome/src/master/ https://galaxy-web.ipk-gatersleben.de/libraries Species: Barley cultivars, landraces, and a wild relative. |
This first-generation barley pan-genome is based on chromosome-scale sequence assemblies for 20 barley varieties, including landraces, cultivars, and a wild barley representing global barley diversity, and on mapping of whole-genome shotgun sequencing data from an additional 300 barley accessions [11]. |
| Soybean pan-genome Links: Genetic diversity data from the 29 genomes are available at https://figshare.com/s/689ae685ad2c368f2568. Variant data (SNPs and small indels) from the 2,898 accessions are available at http://bigd.big.ac.cn/gvm/getProjectDetail?project=GVM000063. Species: Glycine soja (wild soybean) and 26 Glycine max (soybean) accessions |
This graph-based pan-genome assembly was generated using de novo genome assemblies for 26 representative soybeans selected from 2,898 deeply sequenced accessions [14]. The sequencing data, assembled chromosomes, unplaced scaffolds, and annotations are available at the Genome Sequence Archive and Genome Warehouse database in BIG Data Center (https://bigd.big.ac.cn/gsa/index.jsp) under Accession Number PRJCA002030. |
| Chickpea pan-genome Links: Pan-genome assembly and annotations: https://doi.org/10.6084/m9.figshare.16592819 The variant calls: https://cegresources.icrisat.org/cicerseq Manhattan and QQ-plots for Genome-Wide Association Study (GWAS) analysis: https://doi.org/10.6084/m9.figshare.15015309 |
The chickpea pan-genome was constructed from 3,366 sequenced germplasm lines, including 3,171 cultivated and 195 wild accessions, using an iterative mapping and assembly method [153]. |
| Pigeon pea Link: https://research-repository.uwa.edu.au/en/datasets/pigeon-pea-pangenome-contig-assembly-annotation-snps-pav Species: pigeon pea (Cajanus cajan) |
The pigeon pea pan-genome was constructed from 89 accessions (including 70 from South Asia, 8 from sub-Saharan Africa, 7 from South-East Asia, 2 from Mesoamerica, and 1 from Europe). It was generated using the reference genome assembly (C. cajan_V1.0) and an iterative mapping and assembly method [154]. |
| Sesame pan-genome Species: sesame (Sesamum indicum L.) |
The sesame pan-genome was constructed by mapping genome sequencing data from two landraces (S. indicum cv. Baizhima and Mishuozhima) and two cultivars (Yuzhi11 and Swetha) to the S. indicum var. Zhongzhi13 reference genome [155]. |
| Cotton Variome Links: Genetic variation is available at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA576032 and https://figshare.com/s/cb3c104782a1dcd90ab0 Species: Gossypium hirsutum and Gossypium barbadense |
This is a comprehensive resource providing genetic variation data from 1961 cotton accessions of G. hirsutum and G. barbadense [156]. |
| Melon pan-genome Link: https://figshare.com/articles/dataset/melon_pangenome/17195072 |
The pan-genome of Cucumis melo L. consists of genome sequence data from 297 accessions [157]. |
| Cucumber pan-genome Data availability: Genome assemblies of the 11 cucumber accessions have been deposited in NCBI GenBank under the accession number PRJNA657438. |
A pan-genome graph was constructed for 11 cucumber accessions. These 11 representative accessions from the 115-line core collection were sequenced using the PacBio platform, and the genome assemblies were generated using long-read and short-read data [158]. |
| Strawberry pan-genome The genome assembly and annotation files are available in the Genome Database for Rosaceae. Pan-genome browser and query support are not available. Link: https://www.rosaceae.org/species/fragaria/all Species: cultivated and wild strawberry accessions |
This strawberry pan-genome was generated using chromosome-scale reference genome assemblies of five diploid strawberry species (Fragaria mandschurica, Fragaria daltoniana, Fragaria pentaphylla, F. nilgerrensis, and F. viridis) and genome resequencing data from 128 accessions [159]. |
| Walnut pan-genome Link: https://db.cngb.org/search/project/CNP0001209. Species: walnut (Juglans nigra) |
A high-quality genome assembly of black walnut (Juglans nigra) genotype NWAFU168 was constructed using both short-read and long-read platforms (Illumina, PacBio, and Hi-C). A walnut pan-genome was then built by mapping sequence data from 74 walnut accessions to this reference genome [56]. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
