Submitted: 28 December 2023
Posted: 29 December 2023
Abstract

Keywords:
1. Introduction
1.1. Open Science and Open Research Data
1.2. Geospatial Data services
1.3. Time-varying data
2. Reproducibility research in the geospatial sector
- Proprietary software, which is often subject to licensing restrictions, can prevent others from reproducing the work.
- The many tools frequently combined in a single geospatial research project make replication difficult.
- Reliance on geospatial infrastructure built on online services can make the original dataset inaccessible, because the data served may change over time.
- Large datasets are often analysed close to the data source through online services, so reproducing the analysis would require open access to the server implementations.
- Free platforms provide scripting capabilities for data processing, but their environments are subject to change and therefore may not ensure reproducibility.
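One way to make the third challenge above concrete is to record a fingerprint of the data at the moment it is retrieved from an online service, so that a later retrieval can be checked against it. The following is only a minimal sketch under stated assumptions: the function names, the record fields, and the example URL are hypothetical illustrations, not part of any standard or of the tools discussed in this paper.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint_response(url: str, payload: bytes, retrieved_at: datetime) -> dict:
    """Build a small provenance record for data fetched from an online service.

    The record layout is an assumption made for illustration only.
    """
    return {
        "source_url": url,
        # Store the retrieval time in UTC so records are comparable across machines.
        "retrieved_at": retrieved_at.astimezone(timezone.utc).isoformat(),
        # A content hash identifies the exact bytes that were analysed.
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
    }


def has_changed(record: dict, new_payload: bytes) -> bool:
    """Return True if newly retrieved bytes differ from the recorded ones."""
    return hashlib.sha256(new_payload).hexdigest() != record["sha256"]


if __name__ == "__main__":
    # Hypothetical service URL and payloads standing in for real responses.
    url = "https://example.org/collections/stations/items"
    first = b'{"station": "A", "value": 12.3}'
    later = b'{"station": "A", "value": 12.4}'  # the value was corrected upstream

    record = fingerprint_response(url, first, datetime.now(timezone.utc))
    print(record["sha256"])
    print(has_changed(record, later))
```

A hash only detects that the served data has changed; it does not allow the original data to be recovered, which is the gap that the data versioning approaches discussed later in the paper aim to address.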
3. Updates and Changes of Time-Varying Geospatial Data
3.1. Updates and Changes in Time of Data from Sensor Observation Services
3.2. Updates and Changes in Time of Data from Feature Services
3.3. Updates and Changes in Time of Raster Services
4. Tools for data versioning
4.1. Data versioning of databases
4.2. Data versioning of files
4.3. Data versioning of Log-Structured Tables
5. Research Challenges and Opportunities
6. Conclusions and Future Directions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).