Submitted:
20 March 2024
Posted:
21 March 2024
Abstract
Keywords:
- Graph databases (GDBs), which provide a natural fit for network-based representation of biological information, are becoming increasingly popular as a way to manage and query heterogeneous data, and to provide new insights into data connections.
- Knowledge graphs (KGs) facilitate the discovery of unexpected relationships across integrated multi-modal data, which can lead to the generation of new hypotheses in systems biology.
- This review is based on 681 systematically identified GDB-related publications from the fields of biology and bioinformatics in PubMed and PMC repositories, further filtered down to 179 publications based on applicability in systems biology.
- We outline the prospects of applying GDBs in systems biology in combination with technologies such as Elasticsearch.
- We highlight the ongoing efforts towards the development of unified GDB platforms for integration and exchange of heterogeneous biomedical data between multiple projects.
1. Introduction
2. Background
Relational Databases
Graph Databases
3. Results
- ‘Pathway and network exploration’ - Applications of GDBs for the exploration of biomolecular pathways and networks, focusing on the Systems Biology Graphical Notation (SBGN) standard format [11] and protein-protein interactions (PPIs);
- ‘Analytical approaches and tools enabled by GDBs’ - Methods and tools based on graph algorithms facilitated by GDB integration. For software-based publications, we considered tool availability and sustainability, as well as online and public access.
- ‘Ontologies’ - Graph-based ontologies for biological data integration and transformation.
- ‘Systems biology use case: COVID-19 resources’ - KGs adapted or newly developed for COVID-19 research.
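To illustrate the kind of graph algorithm referred to in the second category, the following minimal Python sketch runs a breadth-first shortest-path search over a toy interaction network held in memory. It is an assumption-laden illustration rather than a method taken from any reviewed publication: the node names `P1` to `P6`, the edges, and the helpers `build_adjacency` and `shortest_path` are all hypothetical. In a GDB, a traversal of this kind would typically be expressed in the query language instead of application code.

```python
from collections import deque

# Toy undirected interaction network. Node names and edges are
# illustrative placeholders, not curated biological data.
EDGES = [
    ("P1", "P2"),
    ("P2", "P3"),
    ("P3", "P4"),
    ("P1", "P5"),
    ("P5", "P4"),
    ("P4", "P6"),
]


def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Return one shortest path between two nodes as a list, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorting makes the traversal order, and hence the result, deterministic.
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


ADJACENCY = build_adjacency(EDGES)
```

Calling `shortest_path(ADJACENCY, "P1", "P6")` returns a path of four nodes; a node that is absent from the network yields `None`.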
Pathway and Network Exploration
Process Description
| Database | Content | Accessible at | Publications |
|---|---|---|---|
| Reactome | Pathways in SBML- and SBGN-compatible format | github.com/reactome/graph-core | [6] |
| Plant Reactome | Pathways in SBML- and SBGN-compatible format | plantreactome.gramene.org/ | [25] |
| Recon2 | Metabolic pathways in SBML format | github.com/ibalaur/MetabolicFramework | [22] |
| PANTHER | Pathways built in CellDesigner in SBML- and SBGN-compatible format | Can be installed using StonPy (github.com/adrienrougny/stonpy) | [23,24] |
| Atlas of Cancer Signalling Network | Signalling network of cancer-related mechanisms built in CellDesigner in SBML- and SBGN-compatible format | Can be installed using StonPy (github.com/adrienrougny/stonpy) | [23,24] |
| COVID-19 Disease Map | Signalling pathways in SBML- and SBGN-compatible format focused on the COVID-19 mechanisms | c19dm-neo4j.lcsb.uni.lu/browser/ | [26] |
| KEGG Pathway Database | Signalling and metabolic pathways | biochem4j.org/ | [27] |
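As a rough illustration of how a Process Description pathway maps onto a property graph, the sketch below stores one reaction (the phosphorylation of glucose by hexokinase) as entity and process nodes linked by consumption, production and catalysis edges, and then asks which entities are produced from a given substrate. The `PropertyGraph` class, the node identifiers and the relationship names are assumptions made for this example; they do not reproduce the schema of Reactome, StonPy or any other resource listed in the table.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """Minimal in-memory property graph: labelled nodes and typed, directed edges."""
    nodes: dict = field(default_factory=dict)  # node_id -> {"label": ..., "name": ...}
    edges: list = field(default_factory=list)  # (source_id, relationship_type, target_id)

    def add_node(self, node_id, label, name):
        self.nodes[node_id] = {"label": label, "name": name}

    def add_edge(self, source_id, rel_type, target_id):
        self.edges.append((source_id, rel_type, target_id))


def build_example():
    """Encode glucose + ATP -> glucose 6-phosphate + ADP, catalysed by hexokinase."""
    graph = PropertyGraph()
    entities = [
        ("e1", "glucose"),
        ("e2", "ATP"),
        ("e3", "glucose 6-phosphate"),
        ("e4", "ADP"),
        ("e5", "hexokinase"),
    ]
    for node_id, name in entities:
        graph.add_node(node_id, "Entity", name)
    graph.add_node("p1", "Process", "glucose phosphorylation")
    # Hypothetical relationship names mirroring the arc types of a process description.
    graph.add_edge("e1", "CONSUMPTION", "p1")
    graph.add_edge("e2", "CONSUMPTION", "p1")
    graph.add_edge("p1", "PRODUCTION", "e3")
    graph.add_edge("p1", "PRODUCTION", "e4")
    graph.add_edge("e5", "CATALYSIS", "p1")
    return graph


def products_of(graph, entity_name):
    """Names of entities produced by any process that consumes the named entity."""
    entity_ids = {
        node_id
        for node_id, props in graph.nodes.items()
        if props["label"] == "Entity" and props["name"] == entity_name
    }
    processes = {
        target
        for source, rel_type, target in graph.edges
        if rel_type == "CONSUMPTION" and source in entity_ids
    }
    return sorted(
        graph.nodes[target]["name"]
        for source, rel_type, target in graph.edges
        if rel_type == "PRODUCTION" and source in processes
    )


GRAPH = build_example()
```

Because the catalyst is attached through its own relationship type, `products_of(GRAPH, "hexokinase")` returns an empty list, whereas the same call for `"glucose"` returns both products of the reaction.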
Protein-Protein Interactions
Analytical Approaches and Tools Enabled by GDBs
Ontologies
Ontologies and GDBs
Ontologies for Data Integration in GDBs
Ontology-Based GDB Queries
Systems Biology Use-Case: COVID-19 Resources
4. Discussion
Challenges and Lessons Learned
Perspectives
Pathway Resources Available in Process-Description-Type and Activity-Flow-Type Formats
Elasticsearch and GDBs
Efforts towards a Uniform Development of Knowledge Bases
5. Conclusions
6. Methods
Author contributions
Availability
Acknowledgement
Competing interests
References
- Lysenko, A.; Roznovăţ, I.A.; Saqi, M.; et al. Representing and querying disease networks using graph databases. BioData Min. 2016, 9, 23. [Google Scholar] [CrossRef]
- Kitano, H. Systems biology: a brief overview. Science 2002, 295, 1662–1664. [Google Scholar] [CrossRef] [PubMed]
- Graw, S.; Chappell, K.; Washam, C.L.; et al. Multi-omics data integration considerations and study design for biological systems and disease. Mol. Omics 2021, 17, 170–185. [Google Scholar] [CrossRef]
- Have, C.T.; Jensen, L.J. Are graph databases ready for bioinformatics? Bioinforma. Oxf. Engl. 2013, 29, 3107–3108. [Google Scholar] [CrossRef] [PubMed]
- Timón-Reina, S.; Rincón, M.; Martínez-Tomás, R. An overview of graph databases and their applications in the biomedical domain. Database J. Biol. Databases Curation 2021, 2021, baab026. [Google Scholar] [CrossRef] [PubMed]
- Fabregat, A.; Korninger, F.; Viteri, G.; et al. Reactome graph database: Efficient access to complex pathway data. PLoS Comput. Biol. 2018, 14, e1005968. [Google Scholar] [CrossRef]
- Yoon, B.-H.; Kim, S.-K.; Kim, S.-Y. Use of Graph Database for the Integration of Heterogeneous Biological Data. Genomics Inform. 2017, 15, 19–27. [Google Scholar] [CrossRef]
- Biological database modeling. 2008.
- Kriegel, A.; Trukhnov, B.M. SQL bible: explore the new SQL standard; write more effective queries or develop code; work with Oracle, IBM DB2, and SQL Server. 2008. [Google Scholar]
- Francis, N.; Green, A.; Guagliardo, P.; et al. Cypher: An Evolving Query Language for Property Graphs. Proc. 2018 Int. Conf. Manag. Data 2018, 1433–1445. [Google Scholar] [CrossRef]
- Le Novère, N.; Hucka, M.; Mi, H.; et al. The Systems Biology Graphical Notation. Nat. Biotechnol. 2009, 27, 735–741. [Google Scholar] [CrossRef]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
- Hucka, M.; Finney, A.; Sauro, H.M.; et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinforma. Oxf. Engl. 2003, 19, 524–531. [Google Scholar] [CrossRef]
- Demir, E.; Cary, M.P.; Paley, S.; et al. The BioPAX community standard for pathway data sharing. Nat. Biotechnol. 2010, 28, 935–942. [Google Scholar] [CrossRef]
- Rougny, A.; Touré, V.; Moodie, S.; et al. Systems Biology Graphical Notation: Process Description language Level 1 Version 2.0. J. Integr. Bioinforma. 2019, 16, 20190022. [Google Scholar] [CrossRef] [PubMed]
- Fabregat, A.; Sidiropoulos, K.; Viteri, G.; et al. Reactome pathway analysis: a high-performance in-memory approach. BMC Bioinformatics 2017, 18, 142. [Google Scholar] [CrossRef] [PubMed]
- Jassal, B.; Matthews, L.; Viteri, G.; et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2020, 48, D498–D503. [Google Scholar] [CrossRef] [PubMed]
- Gillespie, M.; Jassal, B.; Stephan, R.; et al. The reactome pathway knowledgebase 2022. Nucleic Acids Res. 2022, 50, D687–D692. [Google Scholar] [CrossRef] [PubMed]
- Mi, H.; Muruganujan, A.; Ebert, D.; et al. PANTHER version 14, more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Res. 2019, 47, D419–D426. [Google Scholar] [CrossRef] [PubMed]
- Thiele, I.; Swainston, N.; Fleming, R.M.T.; et al. A community-driven global reconstruction of human metabolism. Nat. Biotechnol. 2013, 31, 419–425. [Google Scholar] [CrossRef] [PubMed]
- Noronha, A.; Daníelsdóttir, A.D.; Gawron, P.; et al. ReconMap: an interactive visualization of human metabolism. Bioinforma. Oxf. Engl. 2017, 33, 605–607. [Google Scholar] [CrossRef] [PubMed]
- Balaur, I.; Mazein, A.; Saqi, M.; et al. Recon2Neo4j: applying graph database technologies for managing comprehensive genome-scale networks. Bioinforma. Oxf. Engl. 2017, 33, 1096–1098. [Google Scholar] [CrossRef]
- Rougny, A.; Balaur, I.; Luna, A.; et al. StonPy: a tool to parse and query collections of SBGN maps in a graph database. Bioinforma. Oxf. Engl. 2023, 39, btad100. [Google Scholar] [CrossRef] [PubMed]
- Rougny, A.; Touré, V.; Albanese, J.; et al. SBGN Bricks Ontology as a tool to describe recurring concepts in molecular networks. Brief. Bioinform. 2021, 22, bbab049. [Google Scholar] [CrossRef]
- Naithani, S.; Gupta, P.; Preece, J.; et al. Plant Reactome: a knowledgebase and resource for comparative pathway analysis. Nucleic Acids Res. 2020, 48, D1093–D1103. [Google Scholar] [CrossRef] [PubMed]
- Mazein, A.; Acencio, M.L.; Balaur, I.; et al. A guide for developing comprehensive systems biology maps of disease mechanisms: planning, construction and maintenance. Front. Bioinforma. 2023, 3, 1197310. [Google Scholar] [CrossRef] [PubMed]
- Swainston, N.; Batista-Navarro, R.; Carbonell, P.; et al. biochem4j: Integrated and extensible biochemical knowledge through graph databases. PloS One 2017, 12, e0179130. [Google Scholar] [CrossRef] [PubMed]
- Sonawane, A.R.; Weiss, S.T.; Glass, K.; et al. Network Medicine in the Age of Biomedical Big Data. Front. Genet. 2019, 10, 294. [Google Scholar] [CrossRef] [PubMed]
- Hermjakob, H.; Montecchi-Palazzi, L.; Lewington, C.; et al. IntAct: an open source molecular interaction database. Nucleic Acids Res. 2004, 32, D452–D455. [Google Scholar] [CrossRef]
- Szklarczyk, D.; Gable, A.L.; Lyon, D.; et al. STRING v11, protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019, 47, D607–D613. [Google Scholar] [CrossRef]
- Keshava Prasad, T.S.; Goel, R.; Kandasamy, K.; et al. Human Protein Reference Database--2009 update. Nucleic Acids Res. 2009, 37, D767–D772. [Google Scholar] [CrossRef]
- Oughtred, R.; Rust, J.; Chang, C.; et al. The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci. Publ. Protein Soc. 2021, 30, 187–200. [Google Scholar] [CrossRef]
- Herwig, R.; Hardt, C.; Lienhard, M.; et al. Analyzing and interpreting genome data at the network level with ConsensusPathDB. Nat. Protoc. 2016, 11, 1889–1907. [Google Scholar] [CrossRef]
- Licata, L.; Briganti, L.; Peluso, D.; et al. MINT, the molecular interaction database: 2012 update. Nucleic Acids Res. 2012, 40, D857–D861. [Google Scholar] [CrossRef]
- Kotlyar, M.; Rossos, A.E.M.; Jurisica, I. Prediction of Protein-Protein Interactions. Curr. Protoc. Bioinforma. 2017, 60, 8.2.1–8.2.14. [Google Scholar] [CrossRef]
- Huttlin, E.L.; Ting, L.; Bruckner, R.J.; et al. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell 2015, 162, 425–440. [Google Scholar] [CrossRef]
- Chen, C.-Y.; Ho, A.; Huang, H.-Y.; et al. Dissecting the human protein-protein interaction network via phylogenetic decomposition. Sci. Rep. 2014, 4, 7153. [Google Scholar] [CrossRef]
- Robin, V.; Bodein, A.; Scott-Boyer, M.-P.; et al. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. Front. Mol. Biosci. 2022, 9, 962799. [Google Scholar] [CrossRef]
- Xia, J.; Benner, M.J.; Hancock, R.E.W. NetworkAnalyst--integrative approaches for protein-protein interaction network analysis and visual exploration. Nucleic Acids Res. 2014, 42, W167–W174. [Google Scholar] [CrossRef]
- Himmelstein, D.S.; Zietz, M.; Rubinetti, V.; et al. Hetnet connectivity search provides rapid insights into how biomedical entities are related. GigaScience 2022, 12, giad047. [Google Scholar] [CrossRef]
- Morris, J.H.; Soman, K.; Akbas, R.E.; et al. The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information. Bioinforma. Oxf. Engl. 2023, 39, btad080. [Google Scholar] [CrossRef]
- Sadegh, S.; Skelton, J.; Anastasi, E.; et al. Network medicine for disease module identification and drug repurposing with the NeDRex platform. Nat. Commun. 2021, 12, 6848. [Google Scholar] [CrossRef]
- Del Toro, N.; Shrivastava, A.; Ragueneau, E.; et al. The IntAct database: efficient access to fine-grained molecular interaction data. Nucleic Acids Res. 2022, 50, D648–D653. [Google Scholar] [CrossRef]
- Nair, S.; Váradi, M.; Nadzirin, N.; et al. PDBe aggregated API: programmatic access to an integrative knowledge graph of molecular structure data. Bioinforma. Oxf. Engl. 2021, 37, 3950–3952. [Google Scholar] [CrossRef]
- Varadi, M.; Anyango, S.; Appasamy, S.D.; et al. PDBe and PDBe-KB: Providing high-quality, up-to-date and integrated resources of macromolecular structures to support basic and applied research and education. Protein Sci. Publ. Protein Soc. 2022, 31, e4439. [Google Scholar] [CrossRef]
- PDBe-KB consortium. PDBe-KB: collaboratively defining the biological context of structural data. Nucleic Acids Res. 2022, 50, D534–D542. [Google Scholar] [CrossRef]
- Esteban-Gil, A.; Fernández-Breis, J.T.; Boeker, M. Analysis and visualization of disease courses in a semantically-enabled cancer registry. J. Biomed. Semant. 2017, 8, 46. [Google Scholar] [CrossRef]
- Zahoránszky-Kőhalmi, G.; Sheils, T.; Oprea, T.I. SmartGraph: a network pharmacology investigation platform. J. Cheminformatics 2020, 12, 5. [Google Scholar] [CrossRef]
- Santos, A.; Colaço, A.R.; Nielsen, A.B.; et al. A knowledge graph to interpret clinical proteomics data. Nat. Biotechnol. 2022, 40, 692–702. [Google Scholar] [CrossRef]
- Mishra, V.; Re, D.B.; Le Verche, V.; et al. Systematic elucidation of neuron-astrocyte interaction in models of amyotrophic lateral sclerosis using multi-modal integrated bioinformatics workflow. Nat. Commun. 2020, 11, 5579. [Google Scholar] [CrossRef]
- de Bono, B.; Gillespie, T.; Surles-Zeigler, M.C.; et al. Representing Normal and Abnormal Physiology as Routes of Flow in ApiNATOMY. Front. Physiol. 2022, 13, 795303. [Google Scholar] [CrossRef]
- Mei, S.; Huang, X.; Xie, C.; et al. GREG-studying transcriptional regulation using integrative graph databases. Database J. Biol. Databases Curation 2020, 2020, baz162. [Google Scholar] [CrossRef]
- Kerzner, E.; Lex, A.; Sigulinsky, C.L.; et al. Graffinity: Visualizing Connectivity in Large Graphs. Comput. Graph. Forum J. Eur. Assoc. Comput. Graph. 2017, 36, 251–260. [Google Scholar] [CrossRef]
- Lakshmi, K.; Meyyappan, T. Compact in-memory representation of large graph databases for efficient mining of maximal frequent sub graphs. Concurr. Comput. Pract. Exp. 2021, 33, e5243. [Google Scholar] [CrossRef]
- Lambusch, F.; Waltemath, D.; Wolkenhauer, O.; et al. Identifying frequent patterns in biochemical reaction networks: a workflow. Database J. Biol. Databases Curation 2018, 2018, bay051. [Google Scholar] [CrossRef]
- Aguilera-Mendoza, L.; Marrero-Ponce, Y.; Beltran, J.A.; et al. Graph-based data integration from bioactive peptide databases of pharmaceutical interest: toward an organized collection enabling visual network analysis. Bioinforma. Oxf. Engl. 2019, 35, 4739–4747. [Google Scholar] [CrossRef]
- Messina, A.; Fiannaca, A.; La Paglia, L.; et al. BioGraph: a web application and a graph database for querying and analyzing bioinformatics resources. BMC Syst. Biol. 2018, 12, 98. [Google Scholar] [CrossRef]
- Courtot, M.; Juty, N.; Knüpfer, C.; et al. Controlled vocabularies and semantics in systems biology. Mol. Syst. Biol. 2011, 7, 543. [Google Scholar] [CrossRef]
- Sauro, H.M.; Bergmann, F.T. Standards and ontologies in computational systems biology. Essays Biochem. 2008, 45, 211–222. [Google Scholar] [CrossRef]
- Gillespie, T.H.; Tripathy, S.J.; Sy, M.F.; et al. The Neuron Phenotype Ontology: A FAIR Approach to Proposing and Classifying Neuronal Types. Neuroinformatics 2022, 20, 793–809. [Google Scholar] [CrossRef]
- The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 2019, 47, D330–D338. [Google Scholar] [CrossRef]
- Schriml, L.M.; Arze, C.; Nadendla, S.; et al. Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res. 2012, 40, D940–D946. [Google Scholar] [CrossRef]
- Unni, D.R.; Moxon, S.A.T.; Bada, M.; et al. Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin. Transl. Sci. 2022, 15, 1848–1855. [Google Scholar] [CrossRef]
- Martin, D.; Brun, C.; Remy, E.; et al. GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biol. 2004, 5, R101. [Google Scholar] [CrossRef]
- Bizer, C.; Heath, T.; Idehen, K.; et al. Linked data on the web (LDOW2008). Proc. 17th Int. Conf. World Wide Web 2008, 1265–1266. [Google Scholar] [CrossRef]
- Lekschas, F.; Gehlenborg, N. SATORI: a system for ontology-guided visual exploration of biomedical data repositories. Bioinforma. Oxf. Engl. 2018, 34, 1200–1207. [Google Scholar] [CrossRef]
- Livingston, K.M.; Bada, M.; Baumgartner, W.A.; et al. KaBOB: ontology-based semantic integration of biomedical databases. BMC Bioinformatics 2015, 16, 126. [Google Scholar] [CrossRef] [PubMed]
- Natale, D.A.; Arighi, C.N.; Blake, J.A.; et al. Protein Ontology (PRO): enhancing and scaling up the representation of protein entities. Nucleic Acids Res. 2017, 45, D339–D346. [Google Scholar] [CrossRef] [PubMed]
- Chen, C.; Huang, H.; Ross, K.E.; et al. Protein ontology on the semantic web for knowledge discovery. Sci. Data 2020, 7, 337. [Google Scholar] [CrossRef] [PubMed]
- Shefchek, K.A.; Harris, N.L.; Gargano, M.; et al. The Monarch Initiative in 2019, an integrative data and analytic platform connecting phenotypes to genotypes across species. Nucleic Acids Res. 2020, 48, D704–D715. [Google Scholar] [CrossRef] [PubMed]
- Köhler, S.; Carmody, L.; Vasilevsky, N.; et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 2019, 47, D1018–D1027. [Google Scholar] [CrossRef] [PubMed]
- Xu, Q.; Shi, Y.; Lu, Q.; et al. GORouter: an RDF model for providing semantic query and inference services for Gene Ontology and its associations. BMC Bioinformatics 2008, 9 Suppl 1, S6. [Google Scholar] [CrossRef]
- Belleau, F.; Nolin, M.-A.; Tourigny, N.; et al. Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J. Biomed. Inform. 2008, 41, 706–716. [Google Scholar] [CrossRef]
- Cheung, K.-H.; Frost, H.R.; Marshall, M.S.; et al. A journey to Semantic Web query federation in the life sciences. BMC Bioinformatics 2009, 10 Suppl 10, S10. [Google Scholar] [CrossRef]
- Asiaee, A.H.; Doshi, P.; Minning, T.; et al. From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data. Data Integr. Life Sci. 2013, 7970, 38–45. [Google Scholar] [CrossRef]
- Asiaee, A.H.; Minning, T.; Doshi, P.; et al. A framework for ontology-based question answering with application to parasite immunology. J. Biomed. Semant. 2015, 6, 31. [Google Scholar] [CrossRef]
- Galgonek, J.; Hurt, T.; Michlíková, V.; et al. Advanced SPARQL querying in small molecule databases. J. Cheminformatics 2016, 8, 31. [Google Scholar] [CrossRef]
- Zhang, R.; Hristovski, D.; Schutte, D.; et al. Drug repurposing for COVID-19 via knowledge graph completion. J. Biomed. Inform. 2021, 115, 103696. [Google Scholar] [CrossRef]
- Al-Saleem, J.; Granet, R.; Ramakrishnan, S.; et al. Knowledge Graph-Based Approaches to Drug Repurposing for COVID-19. J. Chem. Inf. Model. 2021, 61, 4058–4067. [Google Scholar] [CrossRef]
- Reese, J.T.; Unni, D.; Callahan, T.J.; et al. KG-COVID-19, A Framework to Produce Customized Knowledge Graphs for COVID-19 Response. Patterns N. Y. N 2021, 2, 100155. [Google Scholar] [CrossRef]
- Zahoránszky-Kőhalmi, G.; Siramshetty, V.B.; Kumar, P.; et al. A Workflow of Integrated Resources to Catalyze Network Pharmacology Driven COVID-19 Research. J. Chem. Inf. Model. 2022, 62, 718–729. [Google Scholar] [CrossRef]
- Chen, C.; Ross, K.E.; Gavali, S.; et al. COVID-19 Knowledge Graph from semantic integration of biomedical literature and databases. Bioinforma. Oxf. Engl. 2021, 37, 4597–4598. [Google Scholar] [CrossRef]
- Gütebier, L.; Bleimehl, T.; Henkel, R.; et al. CovidGraph: a graph to fight COVID-19. Bioinforma. Oxf. Engl. 2022, 38, 4843–4845. [Google Scholar] [CrossRef]
- Peng, J.; Xu, D.; Lee, R.; et al. Expediting knowledge acquisition by a web framework for Knowledge Graph Exploration and Visualization (KGEV): case studies on COVID-19 and Human Phenotype Ontology. BMC Med. Inform. Decis. Mak. 2022, 22, 147. [Google Scholar] [CrossRef] [PubMed]
- Domingo-Fernández, D.; Baksi, S.; Schultz, B.; et al. COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. Bioinforma. Oxf. Engl. 2021, 37, 1332–1334. [Google Scholar] [CrossRef]
- Shi, W.; Fan, G.; Shen, Z.; et al. gcCov: Linked open data for global coronavirus studies. mLife 2022, 1, 92–95. [Google Scholar] [CrossRef] [PubMed]
- Chatterjee, A.; Nardi, C.; Oberije, C.; et al. Knowledge Graphs for COVID-19, An Exploratory Review of the Current Landscape. J. Pers. Med. 2021, 11, 300. [Google Scholar] [CrossRef]
- Wang, L.L.; Lo, K.; Chandrasekhar, Y.; et al. CORD-19, The Covid-19 Open Research Dataset. arXiv 2020, arXiv:2004.10706v4. [Google Scholar]
- Freshour, S.L.; Kiwala, S.; Cotto, K.C.; et al. Integration of the Drug-Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 2021, 49, D1144–D1151. [Google Scholar] [CrossRef]
- Piñero, J.; Ramírez-Anguita, J.M.; Saüch-Pitarch, J.; et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020, 48, D845–D855. [Google Scholar] [CrossRef]
- UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 2023, 51, D523–D531. [Google Scholar] [CrossRef]
- Gene Ontology Consortium; Aleksander, S.A.; Balhoff, J.; et al. The Gene Ontology knowledgebase in 2023. Genetics 2023, 224, iyad031. [Google Scholar]
- Kotiranta, P.; Junkkari, M.; Nummenmaa, J. Performance of Graph and Relational Databases in Complex Queries. Appl. Sci. 2022, 12, 6490. [Google Scholar] [CrossRef]
- Sullivan, D.E.; Gabbard, J.L.; Shukla, M.; et al. Data integration for dynamic and sustainable systems biology resources: challenges and lessons learned. Chem. Biodivers. 2010, 7, 1124–1141. [Google Scholar] [CrossRef] [PubMed]
- Lapatas, V.; Stefanidakis, M.; Jimenez, R.C.; et al. Data integration in biological research: an overview. J. Biol. Res. Thessalon. Greece 2015, 22, 9. [Google Scholar] [CrossRef] [PubMed]
- Thessen, A.E.; Bogdan, P.; Patterson, D.J.; et al. From Reductionism to Reintegration: Solving society’s most pressing problems requires building bridges between data types across the life sciences. PLoS Biol. 2021, 19, e3001129. [Google Scholar] [CrossRef] [PubMed]
- Hasnain, A.; Mehmood, Q.; Sana EZainab, S.; et al. BioFed: federated query processing over life sciences linked open data. J. Biomed. Semant. 2017, 8, 13. [Google Scholar] [CrossRef] [PubMed]
- Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.J.; et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef] [PubMed]
- Lin, D.; Crabtree, J.; Dillo, I.; et al. The TRUST Principles for digital repositories. Sci. Data 2020, 7, 144. [Google Scholar] [CrossRef]
- Touré, V.; Le Novère, N.; Waltemath, D.; et al. Quick tips for creating effective and impactful biological pathways using the Systems Biology Graphical Notation. PLoS Comput. Biol. 2018, 14, e1005740. [Google Scholar] [CrossRef]
- Türei, D.; Korcsmáros, T.; Saez-Rodriguez, J. OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nat. Methods 2016, 13, 966–967. [Google Scholar] [CrossRef]
- Mi, H.; Schreiber, F.; Moodie, S.; et al. Systems Biology Graphical Notation: Activity Flow language Level 1 Version 1.2. J. Integr. Bioinforma. 2015, 12, 265. [Google Scholar] [CrossRef]
- Ceccarelli, F.; Turei, D.; Gabor, A.; et al. Bringing data from curated pathway resources to Cytoscape with OmniPath. Bioinforma. Oxf. Engl. 2020, 36, 2632–2633. [Google Scholar] [CrossRef]
- Rodchenkov, I.; Babur, O.; Luna, A.; et al. Pathway Commons 2019 Update: integration, analysis and exploration of pathway data. Nucleic Acids Res. 2020, 48, D489–D497. [Google Scholar] [CrossRef]
- Cerami, E.G.; Gross, B.E.; Demir, E.; et al. Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res. 2011, 39, D685–D690. [Google Scholar] [CrossRef]
- Segura Bedmar, I.; Martínez, P.; Carruana Martín, A. Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis. JMIR Med. Inform. 2017, 5, e48. [Google Scholar] [CrossRef]
- Quan, X.; Cai, W.; Xi, C.; et al. AIMedGraph: a comprehensive multi-relational knowledge graph for precision medicine. Database J. Biol. Databases Curation 2023, 2023, baad006. [Google Scholar] [CrossRef]
- Alliance of Genome Resources Consortium. Alliance of Genome Resources Portal: unified model organism research platform. Nucleic Acids Res. 2020, 48, D650–D658. [Google Scholar] [CrossRef] [PubMed]
- Himmelstein, D.S.; Lizee, A.; Hessler, C.; et al. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. eLife 2017, 6, e26726. [Google Scholar] [CrossRef] [PubMed]
- Biomedical Data Translator Consortium. Toward A Universal Biomedical Data Translator. Clin. Transl. Sci. 2019, 12, 86–90. [Google Scholar] [CrossRef] [PubMed]
- Hannestad, L.M.; Dančík, V.; Godden, M.; et al. Knowledge Beacons: Web services for data harvesting of distributed biomedical knowledge. PloS One 2021, 16, e0231916. [Google Scholar] [CrossRef] [PubMed]
- Wood, E.C.; Glen, A.K.; Kvarfordt, L.G.; et al. RTX-KG2, a system for building a semantically standardized knowledge graph for translational biomedicine. BMC Bioinformatics 2022, 23, 400. [Google Scholar] [CrossRef] [PubMed]
- Mendez, D.; Gaulton, A.; Bento, A.P.; et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019, 47, D930–D940. [Google Scholar] [CrossRef] [PubMed]
- Wishart, D.S.; Knox, C.; Guo, A.C.; et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006, 34, D668–D672. [Google Scholar] [CrossRef] [PubMed]
- Kanehisa, M.; Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30. [Google Scholar] [CrossRef] [PubMed]
- Lobentanzer, S.; Aloy, P.; Baumbach, J.; et al. Democratizing knowledge representation with BioCypher. Nat. Biotechnol. 2023, 41, 1056–1059. [Google Scholar] [CrossRef]
- Tenenbaum, J.D.; Whetzel, P.L.; Anderson, K.; et al. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research. J. Biomed. Inform. 2011, 44, 137–145. [Google Scholar] [CrossRef]
- Barrio-Hernandez, I.; Schwartzentruber, J.; Shrivastava, A.; et al. Network expansion of genetic associations defines a pleiotropy map of human cell biology. Nat. Genet. 2023, 55, 389–398. [Google Scholar] [CrossRef]
Biographical Notes

| Ontology | Content | GDB | OWL | Accessible at | Publications |
|---|---|---|---|---|---|
| Disease Ontology | Medical terms and human diseases | Neo4j | Yes | disease-ontology.org/ | [62] |
| Knowledge Base Of Biomedicine | Biomedical data | AllegroGraph or Virtuoso | Partially | Installed locally via github.com/drlivingston/kabob | [67] |
| Protein Ontology | Taxon-specific and taxon-neutral protein-related entities | Virtuoso | Yes | proconsortium.org/ | [68,69] |
| Human Phenotype Ontology | Phenotypic abnormalities in humans | Unknown, but part of the Monarch Initiative (monarchinitiative.org), which uses RDF and Neo4j | Yes | hpo.jax.org/app/ | [70,71] |
| Unified Phenotype Ontology | Organism-specific phenotypes | Unknown, but part of the Monarch Initiative (monarchinitiative.org), which uses RDF and Neo4j | Yes | ols.monarchinitiative.org/ontologies/upheno2 | [70] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).