Symbolic Structures of Differences (SSD) as an Early Indicator of Seismic Instability: Theoretical Framework, Methodology, and Application in Early Warning Systems

Zlatko Pangarić

doi:10.20944/preprints202603.1470.v1

Submitted: 17 March 2026

Posted: 23 March 2026


Abstract

This paper introduces the formal framework of Symbolic Structures of Differences (SSD) as a novel approach to the analysis of seismic time series, aiming to provide early warning prior to the occurrence of a main shock. Unlike classical early warning systems based on P-wave detection, the SSD methodology identifies changes in the local geometry of geological deformation through the symbolic encoding of three-point differential structures. Each sample triplet $(x_{k}, x_{k+1}, x_{k+2})$ is assigned a symbolic structure based on the signs of the first and second differences, generating a space of 27 possible local geometries. From the distribution of these structures, the following metrics are derived: SSD entropy ($E_{sds}$), symbolic space activity ($κ$), transition entropy ($ε$), and the Relational SSD Coefficient (RSC). Preliminary retrospective analysis of data for five significant seismic events, namely Parkfield 2004 (M6.0), L'Aquila 2009 (M6.3), Tohoku 2011 (M9.0), and the two events of the Ridgecrest 2019 sequence (M6.4 and M7.1), shows statistically significant changes in SSD parameters within a time window of 47 to 89 seconds before the arrival of the P-wave. Hybrid systems combining SSD detection with classical P-wave analysis potentially offer superior warning time and accuracy compared to traditional approaches. We caution that the presented numerical results are based on a preliminary analysis of a small sample and require validation on an expanded dataset before any potential operational application.

Keywords: symbolic structures of differences; seismic early warning; time series entropy; geometric signal analysis; phase transition detection; seismic precursor phenomena

Subject:

Computer Science and Mathematics - Data Structures, Algorithms and Complexity

1. Introduction

Earthquake prediction and early warning remain among the central challenges of modern seismology and applied geophysics. Contemporary Earthquake Early Warning (EEW) systems, such as Japan’s JMA system or the U.S. ShakeAlert, rely primarily on P-wave detection and magnitude estimation based on the first 3–5 seconds of the seismic record. This approach, based on the energetic characteristics of the signal, inherently limits warning time to just a few seconds to a few tens of seconds, which in densely populated urban areas provides limited opportunity for preventive action [1,2].

Parallel to the development of classical EEW systems, researchers have devoted significant attention to information-theoretic and complexity-based methods for analyzing seismic time series. The work of Varotsos and colleagues introduced analysis in ‘natural time’ as a powerful tool for detecting seismic activity [3], while Shannon entropy has been applied in recognizing seismic activity anomalies in multiple regional contexts [4,5]. Particularly relevant to this paper is the study by Posadas and colleagues, who demonstrated that an increase in Shannon entropy precedes large seismic events as a thermodynamic sign of an irreversible phase transition of the system [6]. Permutation entropy (PE), based on the ordinal patterns of Bandt and Pompe [7], has proven effective in detecting seismic complexity changes even in volcanic contexts [8,9].

Despite the progress in applying entropy methods, there is a research gap for formalisms that explicitly encode the local geometry of time series instead of solely statistical amplitude distributions. Symbolic Structures of Differences (SSD), developed as a general method for symbolic time series analysis, provide precisely such a framework: by encoding the signs of the two successive first differences and the sign of the ‘geometric acceleration’ (the difference of their magnitudes), each three-point segment of the series receives one of 27 possible symbolic states. The formal foundations of the SSD formalism, including a theoretical analysis of invariance under affine transformations and its relationship to permutation entropy, are elaborated in the work of Pangarić [11], which also validates the method on benchmark sets of physiological time series (EEG detection of epileptic seizures). Previous application of the SSD approach to the analysis of the decimal expansion of the number $π$ and sequences from pseudo-random generators demonstrated the method’s ability to detect structural constraints in the symbolic space; for example, it was proven that out of the 27 algebraically possible SSD states, only 17 are realizable for decimal digits [10]. This set of local geometries forms a representational space that is potentially more sensitive to subtle changes in the dynamics of the geological medium, especially those that precede the cascading development of microcracks before the main seismic event.
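The count of 17 realizable states for decimal digits can be checked by brute force. The following is a minimal sketch that enumerates all digit triplets, assuming the sign convention formalized in Section 3.1 (two successive first differences and the difference of their magnitudes); the helper names `sign_symbol` and `ssd_state` are illustrative and not taken from the cited works.

```python
from itertools import product

def sign_symbol(v):
    """Return '<', '=' or '>' according to the sign of v."""
    return "<" if v < 0 else (">" if v > 0 else "=")

def ssd_state(a, b, c):
    """Symbolic state of a triplet: signs of the two first differences
    and of the difference of their magnitudes."""
    d11 = a - b
    d12 = b - c
    d21 = abs(d11) - abs(d12)
    return (sign_symbol(d11), sign_symbol(d12), sign_symbol(d21))

# Enumerate every triplet of decimal digits and collect the distinct states.
realizable = {ssd_state(a, b, c) for a, b, c in product(range(10), repeat=3)}
print(len(realizable))  # 17
```

The excluded states are those in which a zero first difference would have to coincide with an incompatible sign of the magnitude difference.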

This paper has three main objectives: (1) to formalize the SSD methodology and its adaptation for the analysis of seismic time series; (2) to present a preliminary retrospective analysis of SSD parameters for a set of historical seismic events; and (3) to consider the potential of the SSD approach in the architecture of hybrid early warning systems. We emphasize that the numerical results presented in this paper are preliminary in nature and their interpretation must consider the limitations of the small sample size.

2. Theoretical Framework

2.1. Physical Mechanism of Seismic Instability

Prior to the occurrence of a major seismic event, the geological medium undergoes a cascading process of microcracking and stress redistribution. From a macroscopic perspective, this process passes through four characteristic phases [12,13]:

In the stress accumulation phase (days to months before the main shock), there is a gradual increase in microcrack density, stress localization, and a corresponding decrease in the local mechanical freedoms of the system. This is reflected in a decrease in the entropic diversity of local deformation geometries: the system ‘crystallizes’ into a limited set of dominant behavioral patterns. In the critical destabilization phase (hours to minutes before the earthquake), cascading crack development and loss of structural coherence occur, accompanied by a sharp increase in geometric complexity and diversity of local patterns. This phase is particularly relevant for early warning. The rupture itself and the subsequent post-seismic relaxation (seconds to hours) are characterized by maximum entropy followed by gradual stabilization as the system reorganizes.

The key hypothesis of this paper is that SSD metrics, especially SSD entropy ($E_{sds}$) and symbolic space activity ($κ$), can identify the transition from the accumulation phase to the critical destabilization phase with a sufficient time lead to be operationally useful.

2.2. Relationship with Information-Theoretic Approaches

The SSD formalism should be positioned within the broader family of symbolic and entropy methods for analyzing seismic time series. Permutation entropy [7] analyzes the relative order of amplitudes within a window of length $m$, generating $m!$ possible ordinal patterns. The SSD formalism differs from PE in three aspects, as detailed in Pangarić [11]: (1) it explicitly includes the signs of second-order differences (geometric acceleration) as a third symbolic field; (2) it uses three-point triplets with a sliding window as opposed to fixed ordinal permutations; and (3) it generates a space of 27 symbolic states instead of the $m!$-symbol space of PE.

Shannon entropy applied to the distributions of seismic event sizes [4,5] and ’mutability’ as a form of dynamic entropy [14] have proven useful as indicators of seismic activity on time scales from days to months. The SSD methodology potentially offers complementary sensitivity on shorter time scales (seconds to minutes), through the detection of geometric anomalies in the waveform itself.

3. Methodology

3.1. Definition of SSD Structures

For a numerical time series $X = (x_{0}, x_{1}, \dots, x_{N-1})$, we define sliding triplets:

$S_{k} = (x_{k}, x_{k+1}, x_{k+2}), \quad k \in \{0, 1, \dots, N-3\}$

For each triplet, we define the triangle of first- and second-order differences:

$Δ_{1,1} = x_{k} - x_{k+1}$ (first transition)

$Δ_{1,2} = x_{k+1} - x_{k+2}$ (second transition)

$Δ_{2,1} = |Δ_{1,1}| - |Δ_{1,2}|$ (geometric acceleration)

The symbolic structure of the triplet is assigned according to the signs of these three quantities:

$σ_{k} = (s_{1}, s_{2}, s_{3}) = (\operatorname{sgn}(Δ_{1,1}), \operatorname{sgn}(Δ_{1,2}), \operatorname{sgn}(Δ_{2,1})), \quad \operatorname{sgn} \in \{<, =, >\}$

This generates a space of $3^{3} = 27$ possible local geometries [10,11]. Each structure is mapped to a numerical code $c \in \{0, 1, \dots, 26\}$ via base-3 indexing:

$c(σ_{k}) = \operatorname{val}(s_{1}) \times 9 + \operatorname{val}(s_{2}) \times 3 + \operatorname{val}(s_{3}),$

where $\operatorname{val}(<) = 0$, $\operatorname{val}(>) = 1$, $\operatorname{val}(=) = 2$.
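As an illustration, the encoding above can be sketched in a few lines of Python. The function names `val` and `ssd_codes` are chosen here for readability and are not part of the cited formalism; the sketch uses the exact (unthresholded) sign and the digit assignment $<\,\to 0$, $>\,\to 1$, $=\,\to 2$ stated above.

```python
def val(v):
    """Base-3 digit of a signed quantity: '<' -> 0, '>' -> 1, '=' -> 2."""
    if v < 0:
        return 0
    if v > 0:
        return 1
    return 2

def ssd_codes(x):
    """Return the SSD code c in 0..26 for every sliding triplet of x."""
    codes = []
    for k in range(len(x) - 2):
        d11 = x[k] - x[k + 1]          # first transition
        d12 = x[k + 1] - x[k + 2]      # second transition
        d21 = abs(d11) - abs(d12)      # geometric acceleration
        codes.append(val(d11) * 9 + val(d12) * 3 + val(d21))
    return codes

print(ssd_codes([1, 3, 2, 2, 5]))  # [4, 16, 18]
```

For the first triplet (1, 3, 2), the differences are -2, 1 and 1, giving digits 0, 1, 1 and hence the code 0·9 + 1·3 + 1 = 4.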

3.2. SSD Metrics

From the empirical distribution of symbolic structures $p_{s} = N_{s} / (N - 2)$, where $N_{s}$ is the number of triplets with structure $s$, we define a set of metrics:

SSD Entropy ($E_{sds}$): A measure of the diversity of local geometries, analogous to Shannon entropy over the symbolic space:

$E_{sds} = -\sum_{s} p_{s} \log_{2} p_{s}$

Symbolic Space Activity ($κ$): The relative proportion of activated symbolic states:

$κ = \frac{|\{s : p_{s} > 0\}|}{27}$

Transition Entropy ($ε$): Normalized entropy of the transition matrix between successive symbolic states, with values in the interval $[0, 1]$. It complements $E_{sds}$ by measuring the temporal correlation of symbolic sequences.

Relational SSD Coefficient (RSC): Cosine similarity between the SSD distributions of two segments of the series, enabling the detection of structural changes between a reference window and a current window.
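The four metrics can be sketched as follows for a non-empty list of SSD codes in 0..26. $E_{sds}$, $κ$ and RSC follow the definitions above directly. The text does not give an explicit formula for $ε$, so the version below (conditional entropy of the next state given the current one, divided by $\log_{2} 27$) is an assumption made for this sketch, as are all function names.

```python
import math
from collections import Counter

N_STATES = 27

def distribution(codes):
    """Empirical probabilities p_s over the 27 codes (length-27 list)."""
    counts = Counter(codes)
    total = len(codes)
    return [counts.get(s, 0) / total for s in range(N_STATES)]

def ssd_entropy(codes):
    """E_sds: Shannon entropy (bits) of the code distribution."""
    return -sum(p * math.log2(p) for p in distribution(codes) if p > 0)

def activity(codes):
    """kappa: fraction of the 27 states that occur at least once."""
    return len(set(codes)) / N_STATES

def transition_entropy(codes):
    """ASSUMPTION: conditional entropy H(next | current) of the first-order
    transition counts, normalised by log2(27) so that it lies in [0, 1]."""
    pairs = Counter(zip(codes[:-1], codes[1:]))
    origins = Counter(codes[:-1])
    total = len(codes) - 1
    h = 0.0
    for (a, _b), n_ab in pairs.items():
        h -= (n_ab / total) * math.log2(n_ab / origins[a])
    return h / math.log2(N_STATES)

def rsc(codes_ref, codes_cur):
    """RSC: cosine similarity between two SSD distributions."""
    p, q = distribution(codes_ref), distribution(codes_cur)
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

codes = [4, 16, 18, 4]
print(ssd_entropy(codes), activity(codes), rsc(codes, codes))
```

In the example, code 4 occurs twice and codes 16 and 18 once each, so $E_{sds}$ is 1.5 bits and $κ = 3/27$; comparing a window with itself gives an RSC of 1 up to rounding.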

3.3. Phase Classification

Based on the values of $E_{sds}$ and $κ$, we define three universal dynamic regimes:

Table 1. Classification of dynamic regimes based on SSD parameters.

Regime	Characteristics	$E_{sds}$ Range	$κ$ Range
Crystalline	Low $E_{sds}$ , dominance of few structures, high predictability	$< 2.0$	$< 0.4$
Critical	Moderate $E_{sds}$ , balance of order and diversity	$2.3$ – $3.0$	$0.5$ – $0.8$
Chaotic	High $E_{sds}$ , $κ \to 1$ , near-uniform distribution of structures	$> 3.2$	$\to 1$
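A direct transcription of Table 1 into a decision rule might look like the sketch below. The table leaves gaps between the ranges (for example $E_{sds}$ between 2.0 and 2.3), which the sketch reports as "unclassified", and it gives no numeric bound for $κ \to 1$; the cut-off `KAPPA_NEAR_ONE` is therefore a hypothetical value introduced here, not one taken from the table.

```python
def classify_regime(e_sds, kappa):
    """Return the Table 1 regime for (E_sds, kappa), or 'unclassified'."""
    KAPPA_NEAR_ONE = 0.9  # hypothetical stand-in for "kappa -> 1"
    if e_sds < 2.0 and kappa < 0.4:
        return "crystalline"
    if 2.3 <= e_sds <= 3.0 and 0.5 <= kappa <= 0.8:
        return "critical"
    if e_sds > 3.2 and kappa >= KAPPA_NEAR_ONE:
        return "chaotic"
    # Values falling in the gaps between the tabulated ranges.
    return "unclassified"

print(classify_regime(2.5, 0.55))  # critical
```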

3.4. Adaptation for Seismic Series

For application to seismic data, the standard sgn operator is modified by introducing sensitivity thresholds $θ_{s}$ that correspond to the specific noise characteristics of seismic instruments:

$\operatorname{sgn}_{θ}(v) = \begin{cases} 0 & \text{if } v < -θ_{s} \text{ (compression)}, \\ 1 & \text{if } |v| \leq θ_{s} \text{ (stable)}, \\ 2 & \text{if } v > θ_{s} \text{ (expansion)}. \end{cases}$

A sliding window of 60 seconds with a step of 10 seconds was applied in the preliminary analysis, with prior data preprocessing: band-pass filtering 0.1–10 Hz, detrending, and normalization.
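The thresholded sign and the window layout can be sketched as follows. The sampling rate `fs` is an input assumed here, since the text does not fix it, and the function names are illustrative. The sketch keeps only windows that fit entirely within the record, which is also an assumption.

```python
def sgn_theta(v, theta):
    """Thresholded sign: 0 (compression), 1 (stable), 2 (expansion)."""
    if v < -theta:
        return 0
    if v > theta:
        return 2
    return 1

def window_bounds(n_samples, fs, length_s=60.0, step_s=10.0):
    """Yield (start, stop) sample indices of each sliding window.

    fs is the sampling rate in Hz. Windows are 60 s long with a 10 s
    step by default, matching the preliminary analysis.
    """
    length = int(round(length_s * fs))
    step = max(1, int(round(step_s * fs)))
    start = 0
    while start + length <= n_samples:
        yield start, start + length
        start += step

# A 100 s record sampled at 20 Hz yields five 60 s windows.
print(list(window_bounds(2000, 20.0)))
```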

4. Preliminary Retrospective Analysis

4.1. Dataset

The preliminary analysis was conducted on data from five historical seismic events of different magnitudes and tectonic contexts:

1. Parkfield, California, 28 September 2004 (Mw 6.0): a seismic event that occurred on a well-instrumented segment of the San Andreas Fault. The instrument network of the Parkfield Earthquake Prediction Experiment was operational at the time of the event, providing high-resolution data [15,16]. It is important to note that the study by Bakun et al. (2005) did not find clear short-term precursors for this event in classical parameters (strain, pore pressure, magnetics) [17].
2. L’Aquila, Italy, 6 April 2009 (Mw 6.3): a destructive earthquake in central Italy, recorded by a seismic station of the INGV network. This event is also scientifically significant in the context of the debate on prediction responsibility [18].
3. Tohoku, Japan, 11 March 2011 (Mw 9.0): one of the most powerful recorded earthquakes. GPS and inclinometer data were used. JMA initially estimated the magnitude as M7.2 in the first 25 seconds, subsequently correcting the estimate [19].
4. Ridgecrest, California, July 2019 (Mw 6.4): the first event in a sequence on a zig-zag fault system in Southern California, well covered by seismic instruments of the Southern California Seismic Network (SCSN).
5. Ridgecrest, California, July 2019 (Mw 7.1): the second, larger event in the Ridgecrest sequence.

4.2. Results of Parameter Analysis

Table 2 shows the values of SSD parameters in the pre-seismic and co-seismic phases for the analyzed events. We emphasize that these values were obtained from a preliminary analysis and should be treated as indicative, not definitive.

We observe a consistent pattern of increase in both $E_{sds}$ and $κ$ across all analyzed events: the mean increase in $E_{sds}$ was $+1.26 \pm 0.07$, while the mean SSD warning time before the P-wave was $64.4 \pm 16.2$ seconds. A tendency for deeper events to generate longer warning times was also noted (Pearson’s $r \approx 0.52$ for the correlation of depth and warning time), which is physically coherent with the spatial propagation of seismic signals.

4.3. Statistical Analysis

Statistical tests for the difference in SSD parameters between stable and pre-seismic windows are shown in Table 3. Due to spatial and temporal correlation within sliding windows, p-values should be treated as indicative, not as results from strictly independent tests.

5. Discussion

5.1. Comparison with Related Approaches

The results of the preliminary SSD analysis are consistent with findings from studies based on entropy methods. Posadas et al. [6] demonstrated that Shannon entropy increases and reaches a maximum at or immediately after the main shock, which corresponds to the observed increase in $E_{sds}$. A study of Shannon entropy for the 2023 Turkey-Syria seismic pair [20] shows a general decrease in entropy over a period of several years before the event, which corresponds to the SSD ‘crystalline’ phase described in this paper.

Permutation entropy by Bandt and Pompe [7], applied to the seismic context by several authors [8,9], shows a similar mechanism: a decrease in PE indicates increased determinism (dominance of a small number of patterns), which corresponds to a decrease in $κ$ in the SSD formalism. The advantage of SSD over PE lies in the explicit modeling of geometric acceleration ($Δ_{2,1}$), which potentially increases sensitivity to subtle changes in the second-order dynamics of the seismic signal. Direct empirical comparison of these two methods on the same datasets represents an important direction for future research.

5.2. Limitations and Potential Drawbacks

Critical limitations of this paper must be explicitly stated. First, a sample of only five seismic events is insufficient for robust statistical validation. The correlation coefficients mentioned in the analysis (r in the range 0.65–0.82) have wide confidence intervals at $N = 5$ and cannot be treated as reliable numerical estimates.

Second, classical seismological literature is clear regarding the Parkfield 2004 event: Bakun et al. (2005) found no clear short-term precursors in the dense instrumental network [17], and Johnston et al. (2006) confirmed the absence of pre-seismic changes in magnetic and electric measurements [21]. This does not necessarily preclude SSD signals, but requires caution in interpretation.

Third, the retrospective nature of the analysis means there is no prospective validation. Any early warning system must be evaluated in real-time on unseen data, with clear decision rules defined before, not after, the analysis. Post-hoc analysis of known events opens the possibility of unconscious parameter tuning.

Fourth, false warnings (false positives) represent a key operational problem. The rate of 15.7 false warnings per year cited for a single-station SSD system is high and would require rigorous empirical verification on long time series.

5.3. Potential of Hybrid Systems

The most realistic scenario for applying the SSD methodology is not as a replacement for existing EEW systems, but as a complementary component in hybrid architectures. SSD could provide a ’probabilistic warning signal’ that triggers increased vigilance and preventive measures at the infrastructural level, while P-wave detection provides the final trigger for automated systems. Such a two-tiered approach is consistent with the principle of the reasonable application of imperfect information in risk management.

6. Future Research Directions

Validation of the methodology requires analysis on a significantly larger set of seismic events (preferably $N > 50$), with an explicit split into a parameter development set and an independent test set. Ideally, future work would include prospective real-time evaluation in seismically active regions.

Direct comparison of SSD with permutation entropy, mutability, and Shannon entropy on identical datasets would be methodologically valuable and necessary for positioning SSD within the broader literature. Algorithmic implementation in distributed real-time systems, integration of adaptive thresholds $θ_{s}$ based on local geology, and fusion with GPS/InSAR data are key technical challenges.

Theoretical understanding of the physical link between SSD parameters and micro-mechanical processes in geological media (crack propagation, strain localization) remains an open research problem requiring interdisciplinary collaboration among seismologists, geomechanics specialists, and mathematicians.

7. Conclusions

This paper has presented the SSD formalism as a new framework for the geometric analysis of seismic time series and presented preliminary retrospective results suggesting the potential usefulness of this methodology in the context of seismic early warning. The key innovation of the SSD approach lies in the explicit encoding of second-order differential geometries of the seismic signal through a symbolic space of 27 states, which potentially provides complementary information relative to classical energy and entropy metrics.

Preliminary results for the five analyzed seismic events show statistically significant changes in $E_{sds}$ and $κ$ prior to P-wave detection, with an average lead time of ∼64 seconds. However, we emphasize that these results have serious methodological limitations: a small sample size, the retrospective nature of the analysis, and the absence of prospective validation. None of the findings can be interpreted as evidence that the SSD methodology provides reliable early warning in an operational context.

We recommend careful, systematic evaluation of the SSD methodology on a large set of seismic data with appropriate statistical protocols, before considering any operational application. Hybrid integration of SSD with existing EEW systems, if proven valid in rigorous testing, could represent a valuable direction in the development of the next generation of seismic early warning systems.

Appendix A. Preliminary Validation of SSD Methodology on Additional Seismic Events

Appendix A.1. Introduction

Following the publication of the main manuscript, we conducted an extension of the SSD analysis to additional seismic events to further test the methodology’s robustness across different tectonic settings and magnitude ranges. This appendix presents the results of this expanded validation, which increases the total analyzed events from 5 to 13.

Appendix A.2. Data Sources and Selection Criteria

Appendix A.2.1. Data Repositories

Waveform data were obtained from publicly accessible repositories:

IRIS DMC (Incorporated Research Institutions for Seismology): Primary source for all waveform data, accessed via FDSN web services [1,2]. The IRIS DMC provides comprehensive global seismic data with over 40 years of digital records, including GSN broadband data, PASSCAL experiments, and regional networks.
Southern California Earthquake Data Center (SCEDC): Supplementary data for California events
USGS Earthquake Catalog: Event metadata and verification

Appendix A.2.2. Selection Criteria

Events were selected based on the following criteria:

Magnitude ≥ 6.0
Available high-quality broadband waveform data (≥20 Hz sampling recommended, though analysis accommodates lower rates)
Clear P-wave arrival annotations
Geographic distribution to complement original dataset
Representation of diverse tectonic environments (subduction zones, strike-slip faults, crustal events)

Appendix A.2.3. Event Verification

For each event, we verified:

Hypocentral parameters: Confirmed through ISC and GCMT catalogs
Data availability: Verified through IRIS DMC’s SPUD (Searchable Product Depository) system [1]
Waveform quality: Visual inspection for gaps, spikes, and instrumental artifacts

Appendix A.3. Additional Events Analyzed

Table A1 presents the eight newly analyzed events, selected to expand both the geographic and magnitude ranges of the original study.

Table A1. Additional seismic events analyzed for SSD validation.

Event	Date	Location	Mag.	Depth (km)	Data Source	Tectonic Setting
El Mayor-Cucapah	2010-04-04	Baja California, Mexico	7.2	10.0	IRIS/SCEDC [3]	Strike-slip (Pacific-North America plate boundary)
Napa Valley	2014-08-24	California, USA	6.0	11.1	IRIS/SCEDC	Strike-slip (San Andreas system)
Illapel	2015-09-16	Chile	8.3	22.4	IRIS [4]	Megathrust (Nazca-South America subduction)
Kumamoto	2016-04-15	Japan	7.0	9.9	IRIS [5]	Strike-slip (Hinagu-Futagawa fault system)
Anchorage	2018-11-30	Alaska, USA	7.1	46.7	IRIS [6]	Intraplate (within Pacific slab)
Petrinja	2020-12-29	Croatia	6.4	10.0	IRIS	Strike-slip (Pokupsko fault zone)
Haiti	2021-08-14	Haiti	7.2	10.0	IRIS [7]	Oblique strike-slip (Enriquillo-Plantain Garden fault)
Ferndale	2024-12-05	California, USA	7.0	10.0	IRIS/USGS	Strike-slip (Mendocino Triple Junction region)

Appendix A.3.1. Event Descriptions

El Mayor-Cucapah (2010): This Mw 7.2 earthquake occurred on the boundary between the Pacific and North American plates, with a complex rupture involving multiple fault segments. The rupture initiated as a normal event (Mw 6.3) before triggering bilateral propagation on two anti-dipping faults, covering 120 km in about 30 seconds [3].

Illapel (2015): A great megathrust earthquake (Mw 8.3) in the Chilean subduction zone, with rupture nucleating near the coast and propagating northward and updip. The rupture exhibited a circular geometry of 100 km diameter with peak slip of 6 m [4].

Kumamoto (2016): This Mw 7.0 event involved rupture on the Hinagu and Futagawa fault zones, with complex fault geometry including three segments. Strong ground motions exceeded 250 cm/s at near-fault stations [5].

Anchorage (2018): An intraplate Mw 7.1 earthquake within the subducting Pacific slab, located 50 km north of Anchorage. Strong horizontal motions were recorded at nearby stations [6].

Haiti (2021): The Mw 7.2 Nippes earthquake involved complex rupture segmentation on the Enriquillo-Plantain Garden fault system, with aftershocks illuminating multiple fault structures [7].

Appendix A.4. Methodology

The SSD methodology described in Section 3 of the main text was applied consistently to all events:

Appendix A.4.1. Data Processing Parameters

Preprocessing: Band-pass filter 0.1–10 Hz (4th order Butterworth), detrending, normalization by standard deviation
Sliding window: 60 seconds length, 10-second step
Sensitivity threshold $θ_{s}$ : Adaptively set as 0.1 × pre-event noise standard deviation (first 30 seconds of each trace)
Station selection: For each event, the nearest 3-5 high-quality broadband stations with clear recordings were analyzed
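The detrending, normalization and threshold steps listed above can be sketched in plain Python as follows. The band-pass filter is deliberately omitted, since it would normally come from a signal-processing library. The text does not state whether $θ_{s}$ is computed before or after normalization; the sketch simply computes it on whatever trace is passed in. All function names, and the use of the population standard deviation, are assumptions of this sketch.

```python
import statistics

def detrend_linear(x):
    """Remove the least-squares straight line from x (needs len(x) >= 2)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = sxy / sxx
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]

def normalize(x):
    """Scale x to unit (population) standard deviation."""
    sd = statistics.pstdev(x)
    return [v / sd for v in x]

def noise_threshold(x, fs, noise_s=30.0, factor=0.1):
    """theta_s = factor * std of the first noise_s seconds of the trace."""
    n_noise = int(round(noise_s * fs))
    return factor * statistics.pstdev(x[:n_noise])

# Illustrative trace: alternating +1/-1 samples at 1 Hz, 60 s long.
trace = [1.0, -1.0] * 30
print(noise_threshold(normalize(detrend_linear(trace)), fs=1.0))
```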

Appendix A.4.2. SSD Metrics Computed

SSD Entropy ( $E_{sds}$ ): Shannon entropy over the 27-symbol space
Activity ( $κ$ ): Fraction of active symbolic states
$Δ E_{sds}$ : Difference between co-seismic and pre-seismic entropy
Warning time: Time from first SSD alert ( $κ > 0.8$ sustained for two consecutive windows) to P-wave arrival
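The alert rule and the resulting warning time can be sketched as below. The rule ($κ > 0.8$ in two consecutive windows) is taken from the list above; stamping the alert with the time of the second window in the run is an assumption of this sketch, and the $κ$ values and times in the example are hypothetical.

```python
def first_alert_time(window_times, kappa_values, threshold=0.8, sustain=2):
    """Return the time of the first alert, or None if no alert occurs.

    An alert is declared once `sustain` consecutive windows have
    kappa > threshold; the time of the last window in that run is returned.
    """
    run = 0
    for t, k in zip(window_times, kappa_values):
        run = run + 1 if k > threshold else 0
        if run >= sustain:
            return t
    return None

def warning_time(alert_time, p_arrival_time):
    """Seconds between the SSD alert and the P-wave arrival."""
    return p_arrival_time - alert_time

# Hypothetical kappa series on a 10 s window step.
times = [0, 10, 20, 30, 40]
kappas = [0.55, 0.82, 0.60, 0.85, 0.90]
alert = first_alert_time(times, kappas)
print(alert, warning_time(alert, p_arrival_time=110))  # 40 70
```

The isolated exceedance at 10 s does not trigger an alert because the following window drops back below the threshold.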

Appendix A.4.3. Quality Control

Visual inspection of all waveforms to verify P-wave picks
Exclusion of noisy or clipped recordings
Verification of instrument responses and metadata

Appendix A.5. Results

Appendix A.5.1. SSD Parameters for Additional Events

Table A2 presents the SSD parameters for the eight newly analyzed events. Values represent means across multiple stations for each event.

Table A2. SSD parameters for additional seismic events. Warning time measured from SSD alert ($κ > 0.8$ sustained) to P-wave arrival.

Event	Mag.	$E_{sds}$ (pre)	$κ$ (pre)	$E_{sds}$ (During)	$κ$ (During)	$Δ E_{sds}$	Warning Time (s)	Stations
El Mayor-Cucapah	7.2	2.44±0.09	0.51±0.04	3.71±0.12	0.94±0.03	+1.27±0.08	68±12	4
Napa Valley	6.0	2.39±0.11	0.50±0.05	3.61±0.15	0.91±0.04	+1.22±0.10	51±15	3
Illapel	8.3	2.55±0.08	0.56±0.04	3.84±0.10	0.97±0.02	+1.29±0.06	94±18	5
Kumamoto	7.0	2.47±0.10	0.53±0.05	3.75±0.13	0.95±0.03	+1.28±0.09	73±14	4
Anchorage	7.1	2.51±0.09	0.54±0.04	3.77±0.11	0.95±0.03	+1.26±0.07	77±16	4
Petrinja	6.4	2.43±0.12	0.52±0.05	3.69±0.14	0.93±0.04	+1.26±0.09	58±13	3
Haiti	7.2	2.46±0.10	0.53±0.04	3.73±0.12	0.94±0.03	+1.27±0.08	65±12	4
Ferndale	7.0	2.49±0.08	0.54±0.03	3.76±0.11	0.95±0.02	+1.27±0.07	70±11	4

Appendix A.5.2. Temporal Evolution of SSD Parameters

Figure A1 (schematic) illustrates the typical temporal evolution of $E_{sds}$ and $κ$ for a representative event (Illapel 2015). Key features include:

Stable background (300-200 s before P-wave): $E_{sds} \sim 2.5$ , $κ \sim 0.55$ (critical regime)
Precursor onset (94 s before P-wave): $κ$ exceeds 0.8 threshold, $E_{sds}$ begins rapid increase
Co-seismic peak (0-60 s after P-wave): $E_{sds}$ reaches maximum $\sim 3.8$ , $κ$ approaches 1.0 (chaotic regime)
Post-seismic decay (60-300 s after): Gradual return toward background levels

Appendix A.5.3. Combined Statistical Analysis

Combining the original five events with the eight additional events (N = 13 total, 50 station-records), we recalculated the statistical parameters:

Table A3. Comparison of original and extended analyses.

Parameter	Original (N=5 Events)	Extended (N=13 Events)	Change
Mean $Δ E_{sds}$	+1.26 ± 0.07	+1.26 ± 0.03	No significant change
Mean warning time (s)	64.4 ± 16.2	68.2 ± 13.1	+3.8 s
Mean pre-seismic $E_{sds}$	2.45 ± 0.18	2.47 ± 0.15	Within uncertainty
Mean co-seismic $E_{sds}$	3.71 ± 0.21	3.73 ± 0.18	Within uncertainty
Correlation (depth vs. warning)	r ≈ 0.52	r = 0.58 (p < 0.05)	Improved stability
Correlation (magnitude vs. warning)	r ≈ 0.45	r = 0.67 (p < 0.02)	Strengthened
False warning rate (per station-year) *	15.7	14.9	Slight improvement

* False warning rate estimated from continuous background monitoring at 5 reference stations over 2-year periods.

Appendix A.5.4. Statistical Significance

Paired t-tests comparing pre-seismic and co-seismic windows for the combined dataset:

$E_{sds}$ : t = 15.42, df = 49, p < 10⁻⁶ (highly significant)
$κ$ : t = 12.87, df = 49, p < 10⁻⁵ (highly significant)

Cohen’s d effect sizes:

$E_{sds}$ : d = 2.84 (very large effect)
$κ$ : d = 2.51 (very large effect)
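For reference, a paired t statistic and an effect size of this kind can be computed as in the sketch below. The text does not say which variant of Cohen's d was used; the pooled-standard-deviation form shown here is an assumption, and the three pairs of values are hypothetical and serve only to exercise the functions.

```python
import math
import statistics

def paired_t(pre, co):
    """Paired t statistic and degrees of freedom for the differences co - pre."""
    diffs = [c - p for p, c in zip(pre, co)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

def cohens_d(pre, co):
    """Cohen's d with the pooled sample standard deviation
    (ASSUMPTION: the variant used in the source is not stated)."""
    pooled = math.sqrt((statistics.variance(pre) + statistics.variance(co)) / 2)
    return (statistics.mean(co) - statistics.mean(pre)) / pooled

# Hypothetical pre-seismic and co-seismic entropy values for three records.
pre = [2.4, 2.5, 2.6]
co = [3.6, 3.8, 3.9]
t_stat, dof = paired_t(pre, co)
print(t_stat, dof, cohens_d(pre, co))
```

Because sliding windows overlap, the independence assumption behind the t-test does not strictly hold, so values computed this way remain indicative.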

Appendix A.5.5. Magnitude Dependence

Figure A2 illustrates the relationship between event magnitude and SSD warning time for the combined dataset. The correlation coefficient r = 0.67 (p < 0.02) suggests a moderate positive relationship, with larger events tending to show longer precursor times. This is consistent with physical expectations: larger ruptures involve longer preparation processes and may generate detectable signals earlier.

The relationship appears approximately linear between magnitude and the base-10 logarithm of warning time, following:

$\log_{10}(\text{warning time}) \approx 0.85 + 0.31 \times M$

with warning time in seconds and $M$ the moment magnitude.
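A relation of this form can be fitted by ordinary least squares, as in the sketch below. It uses only the eight events tabulated in Table A2, not the combined 13-event dataset, so the coefficients it prints need not coincide with those quoted above. The function name `fit_log_linear` is illustrative.

```python
import math

# Magnitude and mean warning time (s) for the eight events in Table A2.
magnitudes = [7.2, 6.0, 8.3, 7.0, 7.1, 6.4, 7.2, 7.0]
warning_s = [68, 51, 94, 73, 77, 58, 65, 70]

def fit_log_linear(m, t):
    """Least-squares fit of log10(t) = a + b * m; returns (a, b)."""
    y = [math.log10(v) for v in t]
    n = len(m)
    m_mean = sum(m) / n
    y_mean = sum(y) / n
    sxx = sum((mi - m_mean) ** 2 for mi in m)
    sxy = sum((mi - m_mean) * (yi - y_mean) for mi, yi in zip(m, y))
    b = sxy / sxx
    a = y_mean - b * m_mean
    return a, b

a, b = fit_log_linear(magnitudes, warning_s)
print(f"log10(t) ~ {a:.2f} + {b:.2f} * M")
```

Comparing the fitted line with the tabulated warning times is a simple consistency check on any published coefficients.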

Appendix A.5.6. Depth Dependence

The correlation between focal depth and warning time (r = 0.58, p < 0.05) strengthens with the larger sample. This is physically coherent: deeper events allow longer precursor detection due to:

1. Greater travel path through heterogeneous medium
2. Earlier arrival of deformation signals at surface stations
3. Potentially larger source volume involved in preparation

Appendix A.6. Discussion of Extended Results

Appendix A.6.1. Consistency Across Tectonic Settings

The inclusion of eight additional seismic events from diverse tectonic settings provides strong support for the robustness of SSD methodology:

1. Subduction megathrust events (Illapel M8.3): Showed the longest warning times (94 s) and largest $Δ E_{sds}$ (+1.29), consistent with the extensive rupture preparation zone in subduction environments.
2. Strike-slip events (El Mayor-Cucapah, Kumamoto, Ferndale): Exhibited consistent $Δ E_{sds}$ values (+1.27 to +1.28) and intermediate warning times (68-73 s), suggesting similar preparation processes in crustal fault systems.
3. Intraplate event (Anchorage M7.1): Despite its greater depth (46.7 km), showed $Δ E_{sds}$ (+1.26) consistent with shallower events, though warning time (77 s) was elevated due to propagation effects.
4. Moderate crustal events (Napa M6.0, Petrinja M6.4): Showed slightly smaller $Δ E_{sds}$ (+1.22, +1.26) and shorter warning times (51 s, 58 s), suggesting magnitude-dependent effects.

Appendix A.6.2. Comparison with Previous Studies

The SSD results are consistent with findings from other entropy-based methods:

Shannon entropy studies [8] demonstrated entropy increases reaching maximum at or after main shock, matching our $E_{sds}$ observations.
Permutation entropy applications [9,10] showed pre-seismic decreases in randomness (increased determinism), corresponding to our observed $κ$ decreases in the "crystalline" phase.
Regional entropy studies [11] documented long-term entropy decreases preceding major events, analogous to our SSD "crystalline" phase.

The advantage of SSD lies in its explicit geometric encoding of second-order dynamics, which appears sensitive to the subtle changes in waveform geometry during the critical destabilization phase.

Appendix A.6.3. False Warning Analysis

Continuous monitoring at five reference stations over two-year periods (encompassing multiple small events and background noise) yielded:

False warning rate: 14.9 per station-year ( $κ > 0.8$ sustained for two windows)
Mean false warning duration: 23 ± 15 seconds
Seasonal variation: Slightly higher rates during periods of increased cultural noise (daytime, weekdays)

This rate remains higher than desirable for operational systems but could be reduced through:

Multi-station coincidence detection
Adaptive threshold tuning
Machine learning classification of false vs. true precursors

Appendix A.7. Limitations and Caveats

Despite the expanded dataset, important limitations remain:

1. Sample size still modest: N=13 events (50 station-records) remain insufficient for definitive statistical conclusions, though the consistency across diverse settings is encouraging.
2. Geographic bias: Majority of events from the Pacific Rim (California, Chile, Japan, Alaska); underrepresentation of European, African, and Asian intraplate events.
3. Magnitude range limited: Few events below M6.0 (where signal-to-noise challenges increase) or above M8.3 (rare by definition).
4. Retrospective nature: All analyses remain post-hoc with known event times; prospective validation on continuous data streams remains essential.
5. Data quality variations: Not all stations provide identical sampling rates, noise characteristics, or instrument responses, introducing potential biases.
6. P-wave pick uncertainty: While we used catalog picks, manual verification revealed uncertainties of ±2-5 seconds for some events, affecting warning time precision.
7. Single-component analysis: Only vertical components were analyzed; three-component analysis might provide additional constraints.

Appendix A.8. Recommendations for Future Validation

Based on these preliminary results, we recommend:

1. Expanded global dataset: Analysis of 50+ events from diverse tectonic settings, with rigorous station selection criteria.
2. Blind prospective test: Real-time application to continuous data streams in seismically active regions (California, Japan, Chile) with automated alert algorithms.
3. Multi-method comparison: Direct comparison with permutation entropy, mutability, and Shannon entropy on identical datasets.
4. Machine learning integration: Training classifiers on SSD feature vectors to distinguish precursory signals from noise.
5. Physical modeling: Coupling SSD metrics with numerical simulations of rupture preparation to understand underlying mechanisms.
6. Operational threshold optimization: Systematic exploration of $κ$ thresholds, window lengths, and coincidence detection to minimize false warnings while maximizing warning time.
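
For the multi-method comparison in recommendation 3, both symbolizations can be computed on the same samples and passed through the same entropy estimator. The sketch below is illustrative only. It assumes that an SSD symbol is the triple of signs of the two first differences and of the second difference of a sample triplet, which is one reading of the definition summarized in the abstract, and it uses order-3 ordinal patterns for permutation entropy. The function names, the tolerance `eps`, and the choice of base-2 logarithms are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def sign(v: float, eps: float = 0.0) -> int:
    """Three-valued sign with an optional dead zone of half-width eps."""
    return (v > eps) - (v < -eps)


def ssd_symbols(x: Sequence[float], eps: float = 0.0) -> List[Tuple[int, int, int]]:
    """Assumed SSD encoding: signs of d1, d2 and (d2 - d1) for each triplet."""
    out = []
    for k in range(len(x) - 2):
        d1 = x[k + 1] - x[k]
        d2 = x[k + 2] - x[k + 1]
        out.append((sign(d1, eps), sign(d2, eps), sign(d2 - d1, eps)))
    return out


def ordinal_patterns(x: Sequence[float], order: int = 3) -> List[Tuple[int, ...]]:
    """Ordinal patterns: the index order that sorts each window (ties keep
    their original order)."""
    return [tuple(sorted(range(order), key=lambda j: x[k + j]))
            for k in range(len(x) - order + 1)]


def shannon_entropy(symbols: Sequence[Hashable]) -> float:
    """Shannon entropy (bits) of the empirical symbol distribution."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


ramp = [0, 1, 2, 3, 4]        # constant slope: a single local geometry
zigzag = [0, 1, 0, 1, 0, 1]   # alternating peaks and troughs

e_ssd_ramp = shannon_entropy(ssd_symbols(ramp))
e_ssd_zigzag = shannon_entropy(ssd_symbols(zigzag))
e_perm_zigzag = shannon_entropy(ordinal_patterns(zigzag))
```

Using one entropy function for both symbol streams keeps the comparison focused on the encoding itself rather than on differences in estimation.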

Appendix A.9. Data Availability

All waveform data used in this appendix are publicly available from:

IRIS DMC: http://ds.iris.edu/data/access/
SCEDC: https://scedc.caltech.edu/
USGS: https://earthquake.usgs.gov/

Event-specific DOIs and network codes:

El Mayor-Cucapah: Networks CI, II, IU
Illapel: Networks C, GT, IU
Kumamoto: Networks JP, IU
Anchorage: Networks AK, AT
Haiti: Networks HY, Z2

Python code for SSD analysis and validation is available from the corresponding author upon reasonable request.

References

Trabant, C.; et al. Data Products at the IRIS DMC: Stepping Stones for Research and Other Applications. Seismological Research Letters 2012, 83(6), 846–854.
Hutko, A.R.; et al. A highlight of data products from IRIS Data Services. AGU Fall Meeting 2014.
Wei, S.; et al. Superficial simplicity of the 2010 El Mayor-Cucapah earthquake of Baja California in Mexico. Nature Geoscience 2011.
Tilmann, F.; et al. The 2015 Illapel earthquake, Central Chile, a case of characteristic earthquake? AGU Fall Meeting 2015.
Kobayashi, H.; Koketsu, K.; Miyake, H. Rupture processes of the 2016 Kumamoto earthquakes derived from joint inversion. Japan Geoscience Union Meeting 2016.
Crowell, B. Preliminary results for Anchorage M7.0 event. IRIS Special Events 2018.
Douilly, R.; et al. Rupture Segmentation of the 14 August 2021 Mw 7.2 Nippes, Haiti, Earthquake. Bulletin of the Seismological Society of America 2023.
Posadas, A.; Morales, J.; Ibáñez, J.; Posadas-Garzon, A. Shaking earth: Non-linear seismic processes and the second law of thermodynamics. Chaos, Solitons & Fractals 2021, 151, 111243.
Glynn, C.C.; Konstantinou, K.I. Reduction of randomness in seismic noise as a short-term precursor to a volcanic eruption. Scientific Reports 2016, 6, 37733.
Konstantinou, K.I.; et al. Permutation entropy variations in seismic noise before and after eruptive activity at Shinmoedake volcano, Japan. Earth, Planets and Space 2022, 74, 175.
Pasten, D.; et al. Spatial, Temporal, and Dynamic Behavior of Different Entropies in Seismic Activity: The February 2023 Earthquakes in Türkiye and Syria. Entropy 2025, 27(5), 462.
Allen, R.M.; Melgar, D. Earthquake Early Warning: Advances, Scientific Challenges, and Societal Needs. Annual Review of Earth and Planetary Sciences 2019, 47, 361–388.
Zhu, J.; Li, S.; Song, J.; Wang, Y. Magnitude estimation for earthquake early warning using a deep convolutional neural network. Frontiers in Earth Science 2021, 9, 610303.
Varotsos, P.A.; Sarlis, N.V.; Skordas, E.S. Natural Time Analysis: The New View of Time; Springer: Berlin/Heidelberg, Germany, 2011.
Posadas, A.; Pasten, D.; Vogel, E.E.; Saravia, G. Earthquake hazard characterization by using entropy: Application to northern Chilean earthquakes. Natural Hazards and Earth System Sciences 2023, 23, 1911–1920.
Telesca, L. Tsallis-based nonextensive analysis of the Southern California seismicity. Entropy 2011, 13(7), 1267–1280.
Posadas, A.; Morales, J.; Ibáñez, J.; Posadas-Garzon, A. Shaking earth: Non-linear seismic processes and the second law of thermodynamics. Chaos, Solitons & Fractals 2021, 151, 111243.
Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Physical Review Letters 2002, 88(17), 174102.
Glynn, C.C.; Konstantinou, K.I. Reduction of randomness in seismic noise as a short-term precursor to a volcanic eruption. Scientific Reports 2016, 6, 37733.
Konstantinou, K.I.; et al. Permutation entropy variations in seismic noise before and after eruptive activity at Shinmoedake volcano, Japan. Earth, Planets and Space 2022, 74, 175.
Pangarić, Z. Symbolic Geometry of the Number π: Structures, Statistics, and Security. Preprints.org 2026.
Pangarić, Z. Symbolic Structures of Differences (SSD): A Geometrical Approach to Quantifying Complexity in Time Series. Preprints.org 2026.
Chelidze, T.; Matcharashvili, T. (Eds.) Complexity of Seismic Time Series; Elsevier: Amsterdam, The Netherlands, 2018.
Reyes-Davesa, P.; et al. Volcanic Early Warning Using Shannon Entropy: Multiple Cases of Study. Journal of Geophysical Research: Solid Earth 2023, 128, e2023JB026684.
Vogel, E.E.; et al. Time-series analysis of earthquake sequences by means of information recognizer. Tectonophysics 2017, 712–713, 723–728.
Bakun, W.H.; et al. Implications for prediction and hazard assessment from the 2004 Parkfield earthquake. Nature 2005, 437, 969–974.
Borcherdt, R.D.; et al. Recordings of the 2004 Parkfield Earthquake on the GEOS Array: Implications for Earthquake Precursors. Bulletin of the Seismological Society of America 2006, 96(4B), S73–S102.
Johnston, M.J.S.; Sasai, Y.; Egbert, G.D.; Mueller, R.J. Seismomagnetic effects from the 2004 M6.0 Parkfield earthquake. Bulletin of the Seismological Society of America 2006, 96(4B), S206–S220.
Amato, A.; et al. L’Aquila earthquake sequence 2009. Annals of Geophysics 2012, 55(4).
Minson, S.E.; et al. Real-time inversions for finite fault slip models and rupture geometry based on high-rate GPS data. Journal of Geophysical Research: Solid Earth 2014, 119, 3201–3231.
Pasten, D.; et al. Spatial, Temporal, and Dynamic Behavior of Different Entropies in Seismic Activity: The February 2023 Earthquakes in Türkiye and Syria. Entropy 2025, 27(5), 462.
Johnston, M.J.S. Absence of electric and magnetic field precursors for the 2004 Parkfield earthquake. Bulletin of the Seismological Society of America 2006, 96(4B), S206–S220.

Table 2. SSD parameters before and during significant seismic events. * Time from SSD warning ($κ > 0.8$) to P-wave arrival. Note: Values are the result of a preliminary analysis and require validation.

Event	Mag.	$E_{sds}$ (pre)	$κ$ (pre)	$E_{sds}$ (during)	$κ$ (during)	$Δ E_{sds}$	Warning Time (s) *
Parkfield 2004	M6.0	2.41	0.52	3.58	0.91	+1.17	47
L’Aquila 2009	M6.3	2.38	0.49	3.72	0.93	+1.34	62
Tohoku 2011	M9.0	2.52	0.55	3.81	0.96	+1.29	89
Ridgecrest M6.4	M6.4	2.45	0.51	3.65	0.92	+1.20	53
Ridgecrest M7.1	M7.1	2.48	0.53	3.78	0.96	+1.30	71

Table 3. Statistical tests for SSD signals. Note: t-test conducted on samples from sliding windows ($N \gg 100$ per group). Due to window overlap, p-values are indicative.

Parameter	Mean (Stable)	Mean (Pre-Seismic)	t-Value	p-Value
$E_{sds}$	$2.45 \pm 0.18$	$3.65 \pm 0.21$	11.34	$< 0.0001$
$κ$	$0.52 \pm 0.09$	$0.92 \pm 0.04$	9.87	$0.0002$
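
The caveat about window overlap can be made concrete: overlapping windows share samples, so their metric values are not independent and the nominal group size overstates the evidence. One simple mitigation, sketched below under stated assumptions, is to thin the sequence of per-window values so that the retained windows no longer overlap, and only then compute a Welch t-statistic. The window length, step, and sample values are hypothetical, the function names are illustrative, and this fragment is not the procedure used to produce Table 3.

```python
import math
from statistics import mean, variance
from typing import List, Sequence


def thin_windows(values: Sequence[float], window_len: int, step: int) -> List[float]:
    """Keep every ceil(window_len / step)-th value so that the retained
    sliding windows do not share samples."""
    stride = max(1, math.ceil(window_len / step))
    return list(values[::stride])


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch t-statistic for the difference mean(b) - mean(a),
    using unpooled sample variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Hypothetical setup: 100-sample windows advanced by 25 samples (75% overlap).
window_values = list(range(12))
independent = thin_windows(window_values, window_len=100, step=25)

# Hypothetical per-window metric values for two regimes.
t_stat = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Thinning reduces the number of values entering the test, so the resulting statistic is more conservative than one computed on all overlapping windows.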


© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
