Submitted: 06 August 2024
Posted: 08 August 2024
Abstract
Keywords:
1. Introduction
- Definition of a novel nested relational algebra natural equi-join operator for composing nested morphisms, also supporting left outer joins by parameter tuning (Section 5).
- Definition of an object-oriented database view for updating the databases on the fly without the need for heavily restructuring the loaded and indexed physical model (Section 4.3).
- As the latter view relies on the definition of a logical model extending GSM (Section 4.1), we show that the physical model is isomorphic to an indexed set of GSM databases expressed within the logical model (Lemma 1).
2. Preliminary Notation
2.1. Higher-Order Functions
- The zipping operator maps n tuples (or records) to a record of tuples (or records) r, which is defined over i iff all n tuples are defined over i:
- Given a function f and a generic collection C, the mapping operator returns a new collection by applying f to each component of C:
- Given a binary predicate p and a collection C, the filter function trims C by restricting it to its values satisfying p:
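- Given a binary function, an initial value (accumulator), and a tuple C, the (left) fold operator is a tail-recursive function returning either the accumulator for an empty tuple or, for a non-empty tuple, the result of folding its remaining components with the accumulator updated by applying the function to it and the first component: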
- Given a collection of strings C and a separator s, collapse (also referred to as join in programming languages such as JavaScript or Python) returns a single string where all the strings in C are separated by s. Given “^” the usual string concatenation operator, this can be expressed through the fold operator. When s is the empty string “”, a shorthand notation can be used.
- Given a function f and two values a and b, the update of f so that it returns b for a and returns the previous value for any other input is defined as follows:
- Given a function f and an input value, the HOF returning the value of f for that input if f is defined over it, and z otherwise, is defined as follows. Please observe that we can use this function in combination with Put to set multiple nested functions:
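The higher-order functions listed above can be illustrated with a minimal Python sketch. This is not the paper's formal notation: the names `zip_records`, `map_collection`, `filter_collection`, `fold_left`, `collapse`, `put` and `get_or` are hypothetical, and finite functions and records are assumed to be modelled as Python dictionaries so that "being defined over" an input becomes key membership.

```python
from typing import Any, Callable, Dict, Hashable, List, Sequence, Tuple


def zip_records(*records: Dict[Hashable, Any]) -> Dict[Hashable, Tuple[Any, ...]]:
    """Zip n records: the result is defined over i iff all inputs are defined over i."""
    if not records:
        return {}
    shared = set(records[0]).intersection(*records[1:])
    return {i: tuple(r[i] for r in records) for i in shared}


def map_collection(f: Callable[[Any], Any], C: Sequence[Any]) -> List[Any]:
    """Apply f to each component of C."""
    return [f(c) for c in C]


def filter_collection(p: Callable[[Any], bool], C: Sequence[Any]) -> List[Any]:
    """Restrict C to the values satisfying p."""
    return [c for c in C if p(c)]


def fold_left(f: Callable[[Any, Any], Any], acc: Any, C: Sequence[Any]) -> Any:
    """Left fold written in tail-recursive style (Python does not optimise tail calls)."""
    if not C:
        return acc
    return fold_left(f, f(acc, C[0]), C[1:])


def collapse(C: Sequence[str], s: str = "") -> str:
    """Join the strings in C with separator s, expressed through fold_left."""
    if not C:
        return ""
    return fold_left(lambda acc, x: acc + s + x, C[0], C[1:])


def put(f: Dict[Hashable, Any], a: Hashable, b: Any) -> Dict[Hashable, Any]:
    """Return an updated copy of f mapping a to b; every other input is unchanged."""
    return {**f, a: b}


def get_or(f: Dict[Hashable, Any], x: Hashable, z: Any) -> Any:
    """Return f(x) if f is defined over x, and z otherwise."""
    return f[x] if x in f else z


# Setting a nested function by combining the getter with put (assumed usage).
f = {"x": {"y": 1}}
g = put(f, "x", put(get_or(f, "x", {}), "z", 2))
```

In this sketch `put` returns a new dictionary rather than mutating its argument, which keeps the functional reading of the definitions; in the usage example `f` is left untouched while `g` carries the nested update.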
3. Related Works
3.1. Graph Data
3.1.1. Logical Model
Directed Acyclic Graphs (DAGs) and Topological Sort

Property Graphs
RDF
3.1.2. Query Languages
- Graph Traversal and Pattern Matching: these are mainly navigational languages that visit the graph through “tractable” algorithms running in polynomial time with respect to the graph size [17,22,23]. Consequently, such solutions do not necessarily require solving a subgraph isomorphism problem, except when expressly requested by specific semantics [21,24].
- (Simple) Graph Grammars: as discussed in the forthcoming paragraph, they can add and remove vertices and edges which do not necessarily depend on previously matched data, but they are unable to express full data transformation operations.
- Graph Algebras: these are mainly designed either to change the structure of property graphs through unary operators, or to combine them through n-ary (often binary) ones. These are not to be confused with the path algebras for expressing graph traversal and pattern matching constructs, as they allow graphs to be completely transformed alongside the data associated with them, as well as dealing with graph data collections [25,26,27,28].
- “Proper” Graph Query Languages: we say that a graph query language is “proper” when its expressive power includes all the aforementioned query languages, possibly expressing the graph algebraic operators while being able to express, to some extent, graph grammar rewriting rules, independently of its ability to express them in a fully declarative way. This is achieved to some extent by commonly available languages, such as SPARQL and Cypher [4].
Graph Grammars
Proper Graph Query Languages


3.2. Nested Relational Model
3.2.1. Logical Model
3.2.2. Query Languages
3.2.3. Columnar Physical Model
4. Generalised Semistructured Model v2.0

4.1. Logical Model
4.2. Physical Model
4.3. GSM View Δ(g)
4.3.1. Object Replacement and Resolution
4.3.2. View Materialisation
4.4. Morphism Notation
5. Nested Natural Equi-Join

5.1. Properties
6. Generalised Graph Grammar

6.1. Syntax and Informal Semantics


6.2. Determining the Order of Application of the Rules

6.3. Containment Matching

6.3.1. Pseudocode Notation for Li
6.3.2. Procedural Semantics for Matching and Caching Containments
6.3.3. Algorithmic Choices for Optimization
6.4. Morphism Instantiation and Indexing

6.5. Graph Rewriting Operations (op from Ri)

7. Language Properties
8. Time Complexity
9. Empirical Evaluation
9.1. Comparing Cypher and Neo4J with Our Proposed Implementation
9.2. Scalability for the Proposed Implementation
10. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest


Appendix A
Appendix A.1. Variable Resolution

Appendix A.2. Predicate Evaluation ()
Appendix A.3. Expression Evaluation (expr)

Appendix A.4. Full SOS Rewriting Specifications


Appendix B. Proofs
Appendix B.1. Transformation Isomorphism
Appendix B.2. Nested Equi-Join Properties
Appendix B.3. Language Properties






Appendix B.4. Time Complexity
References
- Schmitt, I. QQL: A DB&IR Query Language. VLDB J. 2008, 17, 39–56.
- Chamberlin, D.D.; Boyce, R.F. SEQUEL: A Structured English Query Language. Proceedings of 1974 ACM-SIGMOD Workshop on Data Description, Access and Control, Ann Arbor, Michigan, USA, May 1-3, 1974, 2 Volumes; Altshuler, G.; Rustin, R.; Plagman, B.D., Eds. ACM, 1974, pp. 249–264.
- Rodriguez, M.A. The Gremlin graph traversal machine and language (invited talk). Proceedings of the 15th Symposium on Database Programming Languages; Association for Computing Machinery: New York, NY, USA, 2015.
- Robinson, I.; Webber, J.; Eifrem, E. Graph Databases; O’Reilly Media, Inc., 2013.
- Angles, R.; Gutierrez, C. The Expressive Power of SPARQL; 2008; pp. 114–129.
- Bergami, G.; Petermann, A.; Montesi, D. THoSP: an algorithm for nesting property graphs. Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA); Association for Computing Machinery: New York, NY, USA, 2018.
- Bergami, G. On Efficiently Equi-Joining Graphs. Proceedings of the 25th International Database Engineering & Applications Symposium; Association for Computing Machinery: New York, NY, USA, 2021.
- Ehrig, H.; Habel, A.; Kreowski, H.J. Introduction to graph grammars with applications to semantic networks. Computers & Mathematics with Applications 1992, 23, 557–572.
- Bergami, G. A new Nested Graph Model for Data Integration. PhD thesis, alma, 2018.
- Das, S.; Srinivasan, J.; Perry, M.; Chong, E.I.; Banerjee, J. A Tale of Two Graphs: Property Graphs as RDF in Oracle. Proceedings of the 17th International Conference on Extending Database Technology, EDBT 2014, Athens, Greece, March 24-28, 2014, pp. 762–773.
- Turi, D.; Plotkin, G.D. Towards a Mathematical Operational Semantics. Proceedings, 12th Annual IEEE Symposium on Logic in Computer Science, Warsaw, Poland, June 29 - July 2, 1997. IEEE Computer Society, 1997, pp. 280–291.
- Bergami, G.; Appleby, S.; Morgan, G. Quickening Data-Aware Conformance Checking through Temporal Algebras. Information 2023, 14.
- Kahn, A.B. Topological sorting of large networks. Commun. ACM 1962, 5, 558–562.
- Sugiyama, K.; Tagawa, S.; Toda, M. Methods for Visual Understanding of Hierarchical System Structures. IEEE Trans. Syst. Man Cybern. 1981, 11, 109–125.
- Hölsch, J.; Grossniklaus, M. An Algebra and Equivalences to Transform Graph Patterns in Neo4j. Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference, EDBT/ICDT Workshops 2016, Bordeaux, France, March 15, 2016; Palpanas, T.; Stefanidis, K., Eds., 2016, Vol. 1558, CEUR Workshop Proceedings.
- Gutierrez, C.; Hurtado, C.A.; Mendelzon, A.O.; Pérez, J. Foundations of Semantic Web databases. Journal of Computer and System Sciences 2011, 77, 520–541.
- Fionda, V.; Pirrò, G.; Gutierrez, C. NautiLOD: A Formal Language for the Web of Data Graph. TWEB 2015, 9, 5:1–5:43.
- Hartig, O.; Pérez, J. In The Semantic Web - ISWC 2015: 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part I; Springer International Publishing: Cham, 2015.
- Carroll, J.J.; Dickinson, I.; Dollin, C.; Reynolds, D.; Seaborne, A.; Wilkinson, K. Jena: Implementing the Semantic Web Recommendations. Proceedings of the 13th International World Wide Web Conference on Alternate Track Papers & Posters; ACM: New York, NY, USA, 2004.
- Sirin, E.; Parsia, B.; Grau, B.C.; Kalyanpur, A.; Katz, Y. Pellet: A practical OWL-DL reasoner. Web Semantics: Science, Services and Agents on the World Wide Web 2007, 5, 51–53.
- Angles, R.; Arenas, M.; Barceló, P.; Hogan, A.; Reutter, J.L.; Vrgoc, D. Foundations of Modern Query Languages for Graph Databases. ACM Comput. Surv. 2017, 50, 68:1–68:40.
- Fan, W.; Li, J.; Ma, S.; Tang, N.; Wu, Y. Adding regular expressions to graph reachability and pattern queries. Frontiers of Computer Science 2012, 6, 313–338.
- Barceló, P.; Fontaine, G.; Lin, A.W. In Logic for Programming, Artificial Intelligence, and Reasoning: 19th International Conference, LPAR-19, Stellenbosch, South Africa, December 14-19, 2013. Proceedings; McMillan, K.; Middeldorp, A.; Voronkov, A., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2013; pp. 71–85. doi:10.1007/978-3-642-45221-5_5.
- Junghanns, M.; Kießling, M.; Averbuch, A.; Petermann, A.; Rahm, E. Cypher-based Graph Pattern Matching in Gradoop. Proceedings of the Fifth International Workshop on Graph Data-management Experiences & Systems, GRADES@SIGMOD/PODS 2017, Chicago, IL, USA, May 14 - 19, 2017, pp. 3:1–3:8.
- Ghrab, A.; Romero, O.; Skhiri, S.; Vaisman, A.A.; Zimányi, E. GRAD: On Graph Database Modeling. CoRR, 1602.
- Ghrab, A.; Romero, O.; Skhiri, S.; Vaisman, A.; Zimányi, E. In Advances in Databases and Information Systems: 19th East European Conference, ADBIS 2015, Poitiers, France, September 8-11, 2015, Proceedings; Springer International Publishing: Cham, 2015.
- Junghanns, M.; Petermann, A.; Teichmann, N.; Gomez, K.; Rahm, E. Analyzing Extended Property Graphs with Apache Flink. SIGMOD workshop on Network Data Analytics (NDA) 2016.
- Wolter, U.; Truong, T.T. Graph Algebras and Derived Graph Operations. Logics 2023, 1, 182–239.
- Rozenberg, G. (Ed.) Handbook of Graph Grammars and Computing by Graph Transformation: Volume I. Foundations; WSP, 1997.
- Pérez, J.; Arenas, M.; Gutierrez, C. Semantics and Complexity of SPARQL 2009. 34.
- Szárnyas, G. Incremental View Maintenance for Property Graph Queries. Proc. of SIGMOD. ACM, 2018, p. 1843–1845.
- Consens, M.P.; Mendelzon, A.O. GraphLog: A Visual Formalism for Real Life Recursion. Proc. of PODS. ACM, 1990, pp. 404–416.
- Hölsch, J.; Grossniklaus, M. An Algebra and Equivalences to Transform Graph Patterns in Neo4j. Fifth International Workshop on Querying Graph Structured Data 2016.
- Bergami, G.; Zegadło, W. Towards a Generalised Semistructured Data Model and Query Language. SIGWEB Newsl. 2023, 2023.
- Shmedding, F. Incremental SPARQL Evaluation for Query Answering on Linked Data. Second International Workshop on Consuming Linked Data, 2011, COLD2011.
- Huang, J.; Abadi, D.J.; Ren, K. Scalable SPARQL Querying of Large RDF Graphs. PVLDB 2011, 4, 1123–1134.
- Atre, M. Left Bit Right: For SPARQL Join Queries with OPTIONAL Patterns (Left-outer-joins). SIGMOD Conference. ACM, 2015, pp. 1793–1808.
- Codd, E.F. A relational model of data for large shared data banks. Commun. ACM 1970, 13, 377–387.
- Liu, H.C.; Ramamohanarao, K. Algebraic equivalences among nested relational expressions. Proceedings of the Third International Conference on Information and Knowledge Management; Association for Computing Machinery: New York, NY, USA, 1994.
- Levene, M.; Loizou, G. The nested universal relation data model. Journal of Computer and System Sciences 1994, 49, 683–717.
- Colby, L.S. A recursive algebra and query optimization for nested relations. Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data; Association for Computing Machinery: New York, NY, USA, 1989.
- Leser, U.; Naumann, F. Informationsintegration: Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen; dpunkt, 2006.
- Elmasri, R.A.; Navathe, S.B. Fundamentals of Database Systems, 7th ed.; Pearson, 2016.
- Atzeni, P.; Ceri, S.; Paraboschi, S.; Torlone, R. Database Systems - Concepts, Languages and Architectures; McGraw-Hill Book Company, 1999.
- Colby, L.S. A recursive algebra and query optimization for nested relations. SIGMOD Rec. 1989, 18, 273–283.
- den Bussche, J.V. Simulation of the nested relational algebra by the flat relational algebra, with an application to the complexity of evaluating powerset algebra expressions. Theor. Comput. Sci. 2001, 254, 363–377.
- Idreos, S.; Groffen, F.; Nes, N.; Manegold, S.; Mullender, K.S.; Kersten, M.L. MonetDB: Two Decades of Research in Column-oriented Database Architectures. IEEE Data Eng. Bull. 2012, 35, 40–45.
- Green, T.J.; Karvounarakis, G.; Tannen, V. Provenance semirings. Proceedings of the Twenty-Sixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems; Association for Computing Machinery: New York, NY, USA, 2007.
- Chapman, A.; Missier, P.; Simonelli, G.; Torlone, R. Capturing and querying fine-grained provenance of preprocessing pipelines in data science. VLDB 2020, 14.
- Junghanns, M.; Petermann, A.; Rahm, E. Distributed Grouping of Property Graphs with GRADOOP. In Datenbanksysteme für Business, Technologie und Web (BTW 2017); Gesellschaft für Informatik, Bonn, 2017; pp. 103–122.
- Bellatreche, L.; Kechar, M.; Bahloul, S.N. Bringing Common Subexpression Problem from the Dark to Light: Towards Large-Scale Workload Optimizations. IDEAS. ACM, 2021.
- Aho, A.V.; Lam, M.S.; Sethi, R.; Ullman, J.D. Compilers: Principles, Techniques, and Tools (2nd Edition); Addison-Wesley Longman Publishing Co., Inc.: USA, 2006.
- Ulrich, H.; Kern, J.; Tas, D.; Kock-Schoppenhauer, A.; Ückert, F.; Ingenerf, J.; Lablans, M. QL4MDR: a GraphQL query language for ISO 11179-based metadata repositories. BMC Medical Informatics Decis. Mak. 2019, 19, 45:1–45:7.
- Parr, T. The Definitive ANTLR 4 Reference, 2nd ed.; Pragmatic Bookshelf, 2013.
- Tarjan, R.E. Edge-Disjoint Spanning Trees and Depth-First Search. Acta Informatica 1976, 6, 171–185.
- de Marneffe, M.C.; Manning, C.D.; Nivre, J.; Zeman, D. Universal Dependencies. Computational Linguistics 2021, 47, 255–308.
- Martelli, A.; Montanari, U. Unification in linear time and space: A structured presentation. Technical Report Vol. IEI-B76-16, Consiglio Nazionale delle Ricerche, Pisa.
- Rozenberg, G. (Ed.) Handbook of Graph Grammars and Computing by Graph Transformation: Volume II.; WSP, 1999.
- Talmor, A.; Herzig, J.; Lourie, N.; Berant, J. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers); Burstein, J.; Doran, C.; Solorio, T., Eds. Association for Computational Linguistics, 2019, pp. 4149–4158.
- de Marneffe, M.; MacCartney, B.; Manning, C.D. Generating Typed Dependency Parses from Phrase Structure Parses. Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 22-28, 2006; Calzolari, N.; Choukri, K.; Gangemi, A.; Maegaard, B.; Mariani, J.; Odijk, J.; Tapias, D., Eds. European Language Resources Association (ELRA), 2006, pp. 449–454.
- Chen, D.; Manning, C.D. A Fast and Accurate Dependency Parser using Neural Networks. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL; Moschitti, A.; Pang, B.; Daelemans, W., Eds. ACL, 2014, pp. 740–750.











Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
