Combination of Geostatistical and Geovisualisation Techniques for Analysing 120 Year Earthquake Events in Indonesia Using Open- Source Software

Significant earthquakes frequently occur in Indonesia. Indonesia is situated over three active tectonic plates, resulting in the formation of faults and trenches on the land and ocean floor. For the last 120 years since 1900, there have been more than 1,250 significant earthquake events in Indonesia. In this study, we analyse Indonesia's significant earthquake events using geostatistical and geovisualisation methods to produce an appropriate geospatial analysis platform using the RShiny package to build the WebGIS application. The results show that the earthquake events were spatially distributed from the Sumatera fault in the western part of Indonesia to the southern part of Indonesia, where the Java trench was located and the eastern part of Indonesia. The WebGIS application received a positive evaluation by respondents, with a mean value of 1.617 for pragmatic quality, 1.808 for hedonic quality, and 1.713 for overall quality. This means that the WebGIS application is of good quality based on respondents' impressions. The users also more easily gained insight into information as a result of geostatistical methods. The information gained by the Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 March 2021 doi:10.20944/preprints202103.0407.v1 © 2021 by the author(s). Distributed under a Creative Commons CC BY license. 2 users during the user interaction with the WebGIS platform overlapped with the information that the researcher started with, that is, the spatial cluster of significant earthquakes in Indonesia.


Introduction
Indonesia is an archipelago country situated between three tectonic plates of the world, namely, the Eurasian plate located in the north, the Indo-Australian plate in the south, and the Pacific Ocean plate located in the east. The collision of these three tectonic plates led to creating a subduction zone of the Indo-Australian plate moving northward with the Eurasian plate, which tends to move southward (Hamilton, 1970). This subduction process causes the formation of faults on the land and ocean floor. These activities can trigger earthquakes, so it is not surprising that Indonesia is an area prone to earthquakes.
According to the United States Geological Survey (USGS), Indonesia has more earthquakes than Japan because almost the whole country is situated between active seismic zones (USGS, 2021b).
Based on USGS data, from 1900 to 2020, there were 1,250 earthquakes with magnitudes greater than 6.0. The earthquakes occurred on almost all islands in Indonesia, except Kalimantan Island, which recorded only four earthquakes during a period of more than 100 years, namely, in 1923 and 1957, with an average magnitude scale of 6.2.
Earthquakes occur mostly in eastern and western parts of Indonesia, such as the Flores Sea, Banda Sea, Maluku, and Sulawesi, as well as along the coast and oceans of Sumatra Island.
On February 1, 1938, a great earthquake in the Banda Sea with a magnitude of 8.5 caused a tsunami, but no casualties were recorded. In December 2004, an earthquake occurred in Aceh Province with the largest magnitude scale of 9.1, which caused a tsunami that caused considerable damage and casualties (USGS, 2021a).
The phenomenon of earthquakes in Indonesia can be illustrated through geospatial analysis to determine the relationship between geographical features on the Earth's surface (Ramdani, 2017). An illustration of the relationship between these geographic features can be obtained from spatial data analysis, which is explored by specific methods to understand 4 better what is presented. The spatial autocorrelation technique can be used as a method of analysing data. This technique aims to measure how a distance can affect certain variables to determine the level of similarity of an object to its nearest object (Maroko et al., 2011).
In the geospatial analysis, visualisation is an important aspect that cannot be avoided.
Visualisation is needed to represent spatial data in the form of creations and a series of maps that can be combined with pictures, graphs, diagrams, tables, and so on (Smith et al., 2018). Correctly interpreting spatial data into a map is important because geovisualisation can provide a complex picture of a phenomenon and the relationships within that complexity .
Geovisualisation refers to digital representations of real-world places that are geographically accurate and built with high degrees of realism (Newell & Canessa, 2017) or generally refers to visual depictions of geospatial data (Hutchison & Mitchell, 2007).
According to reference , geovisualisation is an integrated approach from six different domains of science, including computing, cartography, image analysis, information visualisation, exploratory data analysis and geographic information systems, to visually explore, analyse, synthesise, and present geospatial data.
The traditional and conventional focus of earthquake research has been within the hard sciences in fields such as geology, engineering, and disaster science. This has provided opportunities for the geovisualisation domain in providing techniques and technologies to unearth through visual approach the hidden spatial and temporal dimensions of earthquake datasets. This paper is divided into six sections. First, the paper provides the introduction and background of the study. The second section presents the related works that have been done on the topics and methods of geovisualisation. The third and fourth sections detail the study 5 area, data, and methods used in the study, respectively. The fifth section presents the results, and the final section provides the conclusion of the study.

Related works
Various visualisation methods and tools have been rapidly developed and are widely used in various studies. For instance, research on the geovisualisation of spatial databases on settlements in Hungary produced interactive web-based maps using the open-source WebGIS tool and Google application programming interfaces (APIs) (Balla et al., 2020).
The geovisualisation module in QGIS and KML used in that research can be applied to present geospatial information via the internet. The result is web-based maps that provide a detailed picture of the level of contamination, the spatial distribution of the groundwater supply of the investigated settlements, and the changes that have occurred following the sewage system's establishment. Other studies have also shown that the use of tools such as ArcGIS, Tableau, RShiny, and Leaflet can support the presentation of attractive spatial data visualisation (Dharmawan et al., 2017;Jahangiri et al., 2020;Sang et al., 2021;Tate et al., 2011;Zichar, 2020).
Some studies use three-dimensional visualisation for user learning preferences for disaster education purposes and cadaster visualisation (Wahyudi et al., 2020;Wang et al., 2017). Their results show that the visualisation helps the user easily understand the disasterprone areas of landslides and cadaster maps of property units.
Another study used a geovisualisation approach with retail location decision support (Hernandez, 2007). The study examined four different scales of analysis: national, regional, market, and micro-level and outlined the benefits of geovisualisation, such as the ability to dynamically explore spatial-temporal data, the multi-dimensional display of complex datasets, and the sequencing and animation of spatial-temporal data to visually uncover 6 trends and identify anomalies. Reference (Newell & Canessa, 2017) used the geovisualisation approach in coastal environments, and some recommendations for geovisualisation emerged, such as full navigability, dynamic elements, and flexibility.
Furthermore, reference (Hamad & Quiroga, 2016) presented a geovisualisation approach that was applied to transportation system archived data and included the incident detection rate, false alarm rate, quality control flags, and data completeness rate. The analysis was performed at a detailed segment-by-segment level of road networks. They found that geovisualisation helps the transportation management centre (TMC) official identify spots with any abnormal behaviour, whether at the corridor level or at the segment level. Reference (Cominelli et al., 2019) used a geovisualisation approach to inform the management of vessel noise in support of species conservation. Using hotspot mapping, they suggested that small changes in shipping routes can reduce noise exposure levels for Cetacean species.
Studies using geocomputation and geovisualisation have also been performed in the domains of human activity pattern analysis (Mei-Po Kwan, 2004), sediment contamination assessment (Forsythe et al., 2016), and social media analysis (Croitoru et al., 2017).
However, a very limited number of studies assess earthquake events using the combination of geostatistical and geovisualisation approaches using open-source software and measure the user experience of the system. This study aims to perform an integrated geospatial analysis by presenting a webbased interactive map created using the open-source WebGIS tool. In this study, we analyse spatiotemporal data on earthquakes in Indonesia that occurred during a period of 120 years with a scale of more than 6 magnitudes. Geostatistical and geovisualisation methods are used to produce geospatial analysis and RShiny as a support in building WebGIS, which is 7 the open-source of the R package. Finally, the user experience was measured using the User Experience Questionnaire (UEQ).

Study Area
Indonesia is situated between three active tectonic plates: the Eurasian plate in the northern part, the Indo-Australian plate in the southern part, and the Pacific Ocean plate in the eastern part. The confluence of these tectonic plates creates a subduction zone.
This subduction zone then causes the formation of faults and trenches on the land and ocean floor. The activities of these faults and trenches lead to earthquake events.
The active Sumatera fault and Java trench have created many large and destructive earthquakes in the western part of Indonesia. The active trenches in Flores, Wetar, Sulawesi, and Seram were responsible for the earthquake event in the eastern part of Indonesia ( Figure 1).

According to the Global Significant Earthquake Database by National Centers for
Environmental Information (NOAA), for the last 120 years since 1900, there have been more than 290 significant earthquake events in Indonesia (NOAA, 2021). A significant earthquake means that the earthquake led to damage of approximately $1 million or more, casualties of more than 10, and magnitudes of 7.5 or greater.

a. Data source
In this study, the earthquake event dataset was retrieved from two different earthquake databases, i.e., USGS Search Earthquake Catalogue (USGS, 2021a) and NOAA Global Significant Earthquake Database (NOAA, 2021). The earthquakes used in this study are only significant earthquakes with magnitudes of 6 or more that occurred from 1900 to 2020.

9
The earthquake datasets were then divided into three different periods: (1)
Using spatial distribution analysis, we plot the earthquake events for the last 120 years in Indonesia based on magnitude and depth. Spatial autocorrelation measures the degree to which earthquake events are similar to nearby earthquake events. Positive spatial autocorrelation is determined when similar values tend to be closer together than dissimilar values. In the case of earthquake data, earthquakes with similar characteristics tend to reside in similar neighbourhoods due to various reasons, including depth, magnitude, or tsunami events generated from the earthquake. In this study, we evaluate the spatial autocorrelation of variable depth and magnitude to tsunami events.
Several packages need to be installed and activated when using geostatistical methods within RStudio. For instance, we need to install and activate the "raster" and "adehabitatHR" for kernel density estimation. Furthermore, we need to install and activate the "deldir" and "spdep" packages for spatial autocorrelation analysis.

c. Geovisualisation using RStudio
For geovisualisation purposes, we divided the earthquake events of each period into three different classes, i.e., 25% earthquake concentration, 50% earthquake concentration, and 75% earthquake concentration. Furthermore, we use dots of different sizes and colours to represent earthquake events.
When using RStudio, we have two options for geovisualisation: "Plot" and "Viewer". The "Plot" option is used for statis map visualisation, while the "Viewer" option is used for dynamic map visualisation. Some packages need to be installed and activated when working using spatial datasets within RStudio, such as "rgdal", "sp" and "rgeos" to import and "tmap" to visualise the spatial data. Furthermore, for the base map, we use an open topography map available from https://leaflet-extras.github.io/leaflet-providers/preview/ d. RShiny RShiny is a package within RStudio that makes it easy to build interactive web apps straight from the RStudio environment. It provides an elegant and powerful web framework for building web applications using R.
RShiny is one of many tools with a stronger focus on facilitating reproducible workflows or standardised working environments (Palomino et al., 2017). In this study, RShiny is used to transform the code into interactive web applications of WebGIS.

e. User Experience Questionnaire (UEQ)
The User Experience Questionnaire (UEQ) is used to measure the subjective impression of users efficiently and reliably (Schrepp et al., 2014) regarding the user experience of produced interactive WebGIS. In this study, we used the short version of UEQ (Schrepp et al., 2017b). For high precision and lower error probability results, the number of respondents should be more than 20 (Schrepp et al., 2017a).
There are eight items of questions used in the short version of the UEQ, as shown in Table 1, where item numbers 1 to 4 are for pragmatic quality assessment and item numbers 5 to 8 are for hedonic quality assessment of the system. Pragmatic qualities refer to efficiency, perspicuity, and dependability (goal-directed), while hedonic qualities take into account the stimulation and novelty (not goal-directed) (Schrepp, 2019) generated by the use of a WebGIS.
The respondent needs to decide whether the WebGIS is good or not based on the item list. There are seven quality levels for each item. For instance, if the respondent impression of WebGIS is "very easy" to use, then he/she fills in level 7. In contrast, if the respondent impression of the WebGIS is "very boring", he/she fills in level 1. The negative term of an item is always left, and the positive term is always right.  The respondent data were then rescaled to the range of -3 to 3, and the scale values for pragmatic and hedonic quality per respondent were calculated. The result of means, variance, and standard deviation per item are also calculated.
Values greater than 0.8 represent a positive evaluation, values less than -0.8 represent a negative evaluation, and the range of the scales is between -3 (horribly bad) and +3 (extremely good).
The 5% confidence interval was then measured for the precision of the estimation; the smaller the confidence interval was, the higher the precision of the estimation. Furthermore, the Cronbach-Alpha coefficient per scale was calculated to check the correlations of the items per scale. An alpha value greater than 0.7 was considered sufficiently consistent. To detect random or not serious answers by the respondent, inconsistencies were then measured.
The flowchart of the methodology is summarised in Figure 2. There are three main stages. The first is the geostatistical analysis. The second is the geovisualisation process, and the last stage is WebGIS development and testing.  We can therefore determine that our earthquake variable is positively autocorrelated in Indonesia but with very low values. In other words, the earthquake data with depth and magnitude variables that triggered tsunamis in Indonesia do spatially cluster. also share high values. The Gi Statistic is represented as a Z-score. A higher Zscore means a higher intensity of clustering, while positive or negative direction indicates high or low clusters. Figure 3 shows the map of Gi Statistic result.
Gi Statistics of earthquakes with depth and magnitude that triggered tsunamis show the same Z-score. A higher intensity cluster was found in the northwestern part of Sumatera Island, with a positive Z-score ranging between 5 and 10 (represented with a red dot on the map). The lower intensity cluster was found in the western part of Sumatera Island and the southern part of Java Island, with a negative Z-score ranging between -15 and -10.  The results of the kernel density analysis are shown in Figure 5. From three different periods, we found that 25% of earthquakes were concentrated within the eastern part of Indonesia, situated between the Flores, Wetar, Sulawesi, and Seram trenches ( Figure 5A). Fifty per cent of earthquakes were concentrated in two different areas, along the Sumatera trench and between the Flores, Wetar, Sulawesi, and Seram trenches ( Figure 5B). Finally, 75% of earthquakes were evenly concentrated from the Sumatera trench and Java trench to the Flores, Wetar, Sulawesi, and Seram trenches ( Figure 5C).  From Figure 6, we can interpret that there seems to be a geographic pattern of autocorrelation. Figure 6A visualises the earthquake event with the magnitude variable that triggered the tsunami, while Figure 6B visualises the earthquake event with the depth variable that triggered the tsunami. When Moran's value becomes higher or positive, there is autocorrelation between nearby points of earthquake events. In contrast, when Moran's value becomes lower or negative, there is no autocorrelation between nearby points of earthquake events. Figure 6A shows that there is spatial autocorrelation between earthquake events and the magnitude variable that triggered tsunamis along the Sumatera and Java trenches. In contrast, Figure 6B shows that there is spatial autocorrelation between earthquake events and the depth variable that triggered tsunamis around the Flores, Wetar, Sulawesi, and Seram trenches. However, it is not possible to understand whether these are clusters of high or low values. Therefore, we then produce a map of the p-value to observe variances in the significance of earthquakes across Indonesia, which labels the features based on the types of relationships they share with their neighbours (i.e., high and high, high and low, low and high, low and low, and insignificant). From Figure 7, we found a statistically significant geographic pattern in the clustering of earthquake variables in Indonesia. The earthquakes with magnitude variables that triggered tsunamis highly clustered in the western and northwestern parts of Sumatera Island, as shown in Figure 7A. Figure 7B shows that the earthquakes with magnitude variables that did not trigger tsunamis were highly clustered in the eastern part of Indonesia.
However, even though the earthquake with a magnitude variable does not cluster spatially and significantly in the southern part of Java Island, we cannot underestimate it. As suggested by reference (Widiyantoro et al., 2020)

c. Interactive map of WebGIS
The result of the interactive map of WebGIS produced using the RShiny package is shown in Figure 8, where there are interactive panels and buttons for users to interact with the system.
In the top right of the main window, there are three panels, namely, "Date", "Magnitude", and "Depth". The user can easily make queries based on the date, magnitude, or depth of the earthquake event (slider type). The system then automatically visualises the earthquake information with user queried attributes.
There are zoom-in and zoom-out buttons in the top left, represented with "plus" and "minus" symbols.
There   From eight items of the UE question, the hedonic quality received a higher scale than pragmatic quality. Item nos. 6 and 8 received the highest mean values of 2.1 and 2.0, respectively. Item no. 5 received the lowest mean value of 1.3. This means that the respondents feel that the WebGIS application is interesting and leading-edge information technology but not too exciting. Table 4 summarises the results of the respondents' responses. As shown in Table 4, all of the items received values greater than 0.8, which means that the WebGIS has a positive evaluation by respondents. While the mean value of the scales for pragmatic quality is 1.617, for hedonic quality is 1.808, and overall quality is 1.713, which means the WebGIS application is in good quality based on respondent's impression (Figure 9).  in Indonesia over the last 120 years. The WebGIS platform provides information using geostatistical methods. The combination of spatial distribution analysis, kernel density estimation, and spatial autocorrelation using local Moran statistics, local indicators of spatial association, and Getis-Ord (Gi Statistic) help users better understand the spatial information of significant earthquakes in Indonesia.
We found that the earthquake variable is positively autocorrelated in Indonesia but with a very low value. In other words, the earthquake data with depth and magnitude variables that triggered tsunamis in Indonesia clustered locally. A higher intensity cluster was found in the northwestern part of Sumatera Island, with a positive Z-score ranging between 5 and 10. The lower intensity cluster was found in the western part of Sumatera Island and the southern part of Java Island, with a negative Z-score ranging between -15 and -10.
The earthquake events were spatially distributed from the Sumatera fault in the western part of Indonesia, to the southern part of Indonesia, where the Java trench is located and to the eastern part of Indonesia, where the Flores, Wetar, Sulawesi, and Seram trenches are located. Twenty-five per cent of earthquakes were concentrated within the eastern part of Indonesia, situated between the Flores, Wetar, Sulawesi, and Seram trenches. Fifty per cent of earthquakes were concentrated in two different areas, along the Sumatera trench and between the Flores, Wetar, Sulawesi, and Seram trenches, and 75% of earthquakes were evenly concentrated from the Sumatera trench and Java trench to the Flores, Wetar, Sulawesi, and Seram trenches.
Furthermore, we found spatial autocorrelation between earthquake events and magnitude variables that triggered tsunamis along the Sumatera and Java trenches, while there was spatial autocorrelation between earthquake events and depth variables that triggered tsunamis around the Flores, Wetar, Sulawesi, and Seram trenches. The earthquakes with magnitude variables that triggered tsunamis were highly clustered in the western and northwestern parts of Sumatera Island, while the earthquakes with magnitude variables that did not trigger tsunamis were highly clustered in the eastern part of Indonesia.
The WebGIS application received a positive evaluation by respondents, with a mean value of 1.663 for pragmatic quality, 1.837 for hedonic quality, and 1.75 for overall quality. This means that the WebGIS application is of good quality based on respondents' impressions.
The dynamic web maps produced with the support of information technologies applied over traditional static maps is a new approach, which allows the user to view the temporal and spatial information of earthquakes through interactive user interfaces and/or contents directly in the most convenient way. The users also more easily could gain insight into information as a result of geostatistical methods. The information gained by the users during the user interaction with the WebGIS platform overlapped with the information that the researcher started with, that is, the spatial cluster of significant earthquakes in Indonesia.