Submitted: 14 May 2026
Posted: 15 May 2026
Abstract

Keywords:
1. Introduction
1. A hybrid S-NODE-ANF-RRC architecture that is, to our knowledge, the first to combine continuous-time stochastic neural dynamics with fuzzy inference and explicit false alarm rate optimisation.
2. Identification and correction of a systematic kurtosis measurement artefact in regime clustering that affects all Gaussian mixture-based benchmarks on heavy-tailed data (Section 8).
3. A unified stability and power framework (Section 4) that bridges classical forecasting tests (CUSUM, Neyman-Pearson), SDE Lyapunov stability, and neural network training diagnostics within a single theoretical argument.
4. A dual operational profile: the N-ODE-ANF-RRC as the primary early-warning forecaster with positive crisis lead time, and the S-NODE-ANF-RRC as the low-false-alarm crisis classifier (Section 8, Table 5).
5. Open data, code, and trained model weights at https://doi.org/10.5281/zenodo.19787658.
2. Related Work
2.1. Parametric and Neural Architectures for Regime Detection
2.2. Continuous-Time Models for Financial Volatility
2.3. Fuzzy and Hybrid Systems for Regime Classification
2.4. Synthesis: Convergences, Gaps, and Research Motivation
3. Data and Diagnostics
3.1. Dataset
| Stylised Fact | JSE Evidence | Architectural Response |
|---|---|---|
| Extreme kurtosis | Excess kurtosis: 54.8; 824.0 | Log-transform before modelling; GMM on raw features inflates ARI by 1.3× |
| Long memory in vol | Hurst (), (VIX) | Continuous-time SDE naturally models persistent path-dependent dynamics; discrete-time RNNs discretise this artificially |
| Near-random-walk residuals | Hurst () | Diffusion term captures stochastic component without imposing directional persistence |
| Regime persistence | Self-transition: Normal 0.934, Crisis 0.923 | Early-warning objective: rare transitions make advance detection operationally valuable |
3.2. Feature Construction
3.3. Data Diagnostics
| From ∖ To | Normal | Stressed | Crisis |
|---|---|---|---|
| Normal | 0.934 | 0.065 | 0.001 |
| Stressed | 0.109 | 0.826 | 0.065 |
| Crisis | 0.005 | 0.072 | 0.923 |
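
The transition matrix above implies two quantities that are useful when reading the regime-persistence claims: the mean time spent in each regime and the long-run share of each regime. The following is a minimal sketch that computes both from the printed probabilities. It assumes a time-homogeneous first-order Markov chain, and the function names are illustrative rather than taken from the released code.

```python
import numpy as np

# Regime transition matrix copied from the table above (rows: from, columns: to).
REGIMES = ["Normal", "Stressed", "Crisis"]
P = np.array([
    [0.934, 0.065, 0.001],
    [0.109, 0.826, 0.065],
    [0.005, 0.072, 0.923],
])


def expected_durations(P):
    """Mean sojourn time (in steps) of each state: 1 / (1 - p_ii)."""
    return 1.0 / (1.0 - np.diag(P))


def stationary_distribution(P):
    """Solve pi = pi P subject to sum(pi) = 1 via least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


durations = expected_durations(P)
pi = stationary_distribution(P)
for name, d, share in zip(REGIMES, durations, pi):
    print(f"{name:9s} mean duration = {d:5.1f} steps, long-run share = {share:.3f}")
```

If the matrix is estimated on daily data, a step corresponds to one trading day, so the self-transition probabilities of 0.934 and 0.923 translate into mean spells of roughly 15 and 13 days for the Normal and Crisis regimes.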
| Model | ARI | MCC | BAC | LT (d) | FAR | FNR | Cost (bp) | LL |
|---|---|---|---|---|---|---|---|---|
| k-Means | 0.320 | 0.470 | 0.595 | 0.00 | 0.043 | 0.596 | 32,600 | – |
| GMM | 0.309 | 0.434 | 0.562 | 0.43 | 0.203 | 0.532 | 29,650 | – |
| HMM | 0.563 | 0.690 | 0.770 | 0.00 | 0.026 | 0.303 | 16,600 | – |
| N-ODE-ANF-RRC | 0.419 | 0.596 | 0.740 | 0.71 | 0.287 | 0.156 | 10,350 | 1.01 |
| S-NODE-ANF-RRC | 0.462 | 0.585 | 0.663 | 0.00 | 0.051 | 0.312 | 17,200 | 1.07 |

Note: N-ODE: , no diffusion, Euler integration. LL: log-loss (lower = better); –: hard-assignment models have no calibrated probabilities. Bootstrap 95% CI for cost difference (GMM minus S-NODE): bp; S-NODE cost-advantaged in 100.0% of resamples. GMM (raw) ARI = 0.389 on untransformed features vs GMM (log) ARI = 0.309 (1.3× inflation from kurtosis artefact).

| Configuration | S-NODE | N-ODE | GMM |
|---|---|---|---|
| False Positives (FP) | 4 | 37 | 13 |
| False Negatives (FN) | 34 | 17 | 58 |
| bp, | 3,500 | 2,625 | 6,125 |
| bp, | 10,400 | 6,950 | 18,050 |
| bp, | 17,200 | 10,350 | 29,650 |
| bp, | 34,400 | 20,700 | 59,300 |
| Break-even | (S-NODE vs GMM) | | |
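
The cost rows above are consistent with a linear misclassification cost, `c_FP * FP + c_FN * FN`. The sketch below reproduces them from the printed FP and FN counts. Two things in it are assumptions, because the corresponding labels did not survive in the table: the unit-cost pairs are back-solved from the printed totals, and the crisis-day count of 109 is inferred from FN divided by the reported FNR. The FAR definition used, FP / (FP + TP), is likewise the one that matches the reported FAR values, not one stated in the text.

```python
# Crisis-class error counts copied from the table above.
COUNTS = {
    "S-NODE": {"fp": 4, "fn": 34},
    "N-ODE": {"fp": 37, "fn": 17},
    "GMM": {"fp": 13, "fn": 58},
}

# ASSUMPTION: (bp per false positive, bp per false negative), back-solved from
# the printed cost totals because the row labels are not legible.
UNIT_COSTS = [(25, 100), (50, 300), (50, 500), (100, 1000)]

# ASSUMPTION: number of true crisis days, inferred from FN / FNR.
N_CRISIS = 109


def misclassification_cost(fp, fn, c_fp, c_fn):
    """Linear cost in basis points: c_fp per false alarm, c_fn per missed crisis."""
    return c_fp * fp + c_fn * fn


def far_fnr(fp, fn, n_crisis):
    """FAR as the share of crisis alarms that are false; FNR as the share of crises missed."""
    tp = n_crisis - fn
    return fp / (fp + tp), fn / n_crisis


cost_table = {
    (c_fp, c_fn): {
        model: misclassification_cost(c["fp"], c["fn"], c_fp, c_fn)
        for model, c in COUNTS.items()
    }
    for c_fp, c_fn in UNIT_COSTS
}
rates = {model: far_fnr(c["fp"], c["fn"], N_CRISIS) for model, c in COUNTS.items()}

for (c_fp, c_fn), row in cost_table.items():
    print(f"c_FP={c_fp:>3} bp, c_FN={c_fn:>4} bp -> {row}")
for model, (far, fnr) in rates.items():
    print(f"{model:7s} FAR={far:.3f} FNR={fnr:.3f}")
```

Under this linear form, S-NODE has fewer false positives and fewer false negatives than GMM, so it is cheaper for any positive unit costs. The N-ODE is cheaper than the S-NODE whenever `c_FN / c_FP` exceeds 33/17, about 1.94, which holds for every pair listed.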
4. Model Stability and Statistical Power: A Unified Framework
4.1. Classical Stability meets SDE Lyapunov Theory
4.2. Statistical Power meets Fisher Information
4.3. Threshold Sensitivity Analysis
5. Theoretical Framework and Hypotheses
5.1. Formal Definitions
5.2. Theoretical Results
5.3. Empirical Hypotheses
6. Architecture
6.1. System Overview, S-NODE Components, and Interpretability
| Component | Configuration | Parameters |
|---|---|---|
| Encoder | Linear() + LayerNorm | 128 |
| Drift | 4-layer residual MLP, SiLU, | 12,416 |
| Diffusion | 2-layer MLP + tanh, rank-4 Cholesky | 5,248 |
| Jump | 4-head MHA + gate, | 8,576 |
| Milstein integrator | , steps, MC paths | – |
| ANFRRC fuzzy layer | prototypes, TSK rules, | 11,427 |
| Classification head | Linear() + softmax | 99 |
| Total (S-NODE-ANF-RRC) | | 47,203 |
| Total (N-ODE ablation) | No , ; Euler ODE | 31,651 |

Note: TSK: Takagi-Sugeno-Kang fuzzy rules with Gaussian membership functions. MHA: multi-head self-attention. All reported for , , (diffusion rank).
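
The component table lists a Milstein integrator for the S-NODE and Euler integration for the N-ODE ablation. As a hedged illustration of what the Milstein correction buys, the sketch below compares Euler-Maruyama and Milstein on scalar geometric Brownian motion, where a closed-form solution exists. This is a toy SDE chosen for checkability, not the paper's learned drift and diffusion networks, and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_gbm(x0, mu, sigma, T, n_steps, n_paths, scheme, rng):
    """Integrate dX = mu*X dt + sigma*X dW with Euler-Maruyama or Milstein.

    Returns terminal values and the terminal Brownian motion W_T of each path,
    so the result can be compared with the closed-form solution.
    """
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    w = np.zeros(n_paths)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        step = mu * x * dt + sigma * x * dw
        if scheme == "milstein":
            # Milstein term 0.5 * b(x) * b'(x) * (dW^2 - dt) with b(x) = sigma * x.
            step += 0.5 * sigma**2 * x * (dw**2 - dt)
        x = x + step
        w += dw
    return x, w


def strong_error(scheme, seed=0, **params):
    """Mean absolute terminal error against the exact GBM solution."""
    rng = np.random.default_rng(seed)  # same seed gives both schemes the same paths
    x, w = simulate_gbm(scheme=scheme, rng=rng, **params)
    drift = (params["mu"] - 0.5 * params["sigma"] ** 2) * params["T"]
    exact = params["x0"] * np.exp(drift + params["sigma"] * w)
    return float(np.mean(np.abs(x - exact)))


params = dict(x0=1.0, mu=0.05, sigma=0.5, T=1.0, n_steps=64, n_paths=5000)
err_euler = strong_error("euler", **params)
err_milstein = strong_error("milstein", **params)
print(f"Euler-Maruyama strong error: {err_euler:.4f}")
print(f"Milstein strong error:       {err_milstein:.4f}")
```

Milstein has strong order 1.0 against 0.5 for Euler-Maruyama, so at the same step count its pathwise error is smaller whenever the diffusion depends on the state. With a state-independent diffusion the two schemes coincide.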
6.2. ANF-RRC Integration and Training
7. Experimental Setup
7.1. Regime Labelling, Baselines, and Evaluation Protocol
7.2. Software and Computational Environment
8. Results
8.1. Regime Classification and Forecasting Performance

8.2. Hypothesis Testing, Statistical Significance, and Noise Robustness
8.3. Ablation Study
8.4. Interpretability Analysis (KernelSHAP)
9. Discussion
9.1. Reframing Success: Two Operational Profiles
9.2. What the Stochastic Layer Adds: A Physics Argument
9.3. Frequency-Dependent Architecture Profile
9.4. Forecasting Interpretation
9.5. Limitations and Future Directions
9.6. Implications for Practice and Policy
9.7. Research Contributions
10. Conclusion
AI Statement
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Heston, S.L. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financ. Stud. 1993, 6, 327–343.
2. Gatheral, J.; Jaisson, T.; Rosenbaum, M. Volatility is rough. Quant. Financ. 2018, 18, 933–949.
3. Cont, R. Empirical properties of asset returns: Stylised facts and statistical issues. Quant. Financ. 2001, 1, 223–236.
4. Tzen, B.; Raginsky, M. Neural stochastic differential equations: Deep latent Gaussian models in the diffusion limit. arXiv 2019, arXiv:1905.09883.
5. Oh, D.J.; et al. Stable neural stochastic differential equations in analyzing irregular time series data. arXiv 2024, arXiv:2402.14989.
6. Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; Wiley: Hoboken, NJ, 2015.
7. Montgomery, D.C.; Jennings, C.L.; Kulahci, M. Introduction to Time Series Analysis and Forecasting, 2nd ed.; Wiley: Hoboken, NJ, 2015.
8. Su, W. Research on the Application of Data Mining Techniques in Early Warning Models for Financial Management. Appl. Math. Nonlinear Sci. 2024, 9.
9. Boyacioglu, M.; Avci, D. An adaptive network-based fuzzy inference system (ANFIS) for the prediction of stock market return: the case of the Istanbul stock exchange. Expert Syst. With Appl. 2010, 37, 7908–7912.
10. Øksendal, B. Stochastic Differential Equations: An Introduction with Applications, 6th ed.; Springer: Berlin, 2003.
11. Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, 1994.
12. Vincent, P.; Salleh, H. A systematic review of stochastic neural networks for stock market forecasting. J. Math. Sci. Inform. 2024.
13. Hamilton, J. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 1989, 57, 357–384.
14. Tsay, R. Testing and modeling threshold autoregressive processes. J. Am. Stat. Assoc. 1989, 84, 231–240.
15. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin, 2012.
16. Dey, R.; Salem, F. Gate-variants of gated recurrent unit (GRU) neural networks. IEEE 60th International Midwest Symposium on Circuits and Systems, 2017; pp. 1597–1600.
17. Vaswani, A.; et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
18. Chen, R.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D. Neural ordinary differential equations. Adv. Neural Inf. Process. Syst. 2018, 31, 6571–6583.
19. Jia, J.; Benson, A. Neural jump stochastic differential equations. Adv. Neural Inf. Process. Syst. 2019, 32, 9847–9858.
20. Kidger, P.; Morrill, J.; Foster, J.; Lyons, T. Neural controlled differential equations for irregular time series. Adv. Neural Inf. Process. Syst. 2020, 33, 6696–6707.
21. Wang, Y.; Peng, F.; Wang, F.; Li, J. Deep fuzzy cognitive maps for interpretable multivariate time series prediction. IEEE Trans. Fuzzy Syst. 2020, 29, 2350–2362.
22. Ardia, D.; Bluteau, K.; Boudt, K.; Catania, L.; Trottier, D.A. Markov-switching GARCH models in R: The MSGARCH package. J. Stat. Softw. 2019, 91, 1–38.
23. Lea, C.; Vidal, R.; Reiter, A.; Hager, G. Temporal convolutional networks: A unified approach to action segmentation. ECCV Work. 2016, 47–54.
24. Zhang, Z.; Zou, S.; Yang, Y.; Yang, L. Temporal fusion transformer for financial regime detection. Expert Syst. With Appl. 2022, 209, 118361.
25. Moroke, N.D.; Metsileng, L.D. A Maximum-Entropy Markov-Switching GARCH Framework for Cryptocurrency Volatility Regime Detection and Forecasting. Preprints 2026.
26. Al-Shboul, M.; et al. Adaptive Hierarchical Hidden Markov Models for Financial Regime Detection. Mathematics 2025, 13.
27. Shoko, C.; Moroke, N.; Sigauke, C.; Makatjane, K. Real-time forecasting of FTSE/JSE-Top40 using deep neural models: GPT-SNN-PPO vs. LSTM. Romanian J. Econ. 2026, 62, 28–44.
28. Yang, L.; Gao, T.; Lu, Y.; Duan, J.; Liu, T. Neural network stochastic differential equation models with applications to financial data forecasting. Appl. Math. Model. 2023, 115, 407–426; arXiv:2111.13164.
29. Anh, N.; Ha, T.; Thai, L. Phase Space Reconstructed Neural Ordinary Differential Equations Model for Stock Price Forecasting. In Proceedings of the Pacific Asia Conference on Information Systems (PACIS), 2024.
30. Pinna, P. Neural SDEs for financial market modeling: Implementation and performance analysis. Bachelor’s Thesis, University of Cagliari, 2025.
31. Mhlanga, D. Financial Inclusion and Sustainable Development in Sub-Saharan Africa; Routledge: London, 2025.
32. Kraevskiy, A.; Prokhorov, A.; Sokolovskiy, E. Early warning systems for financial markets of emerging economies. arXiv 2024, arXiv:2404.03319.
33. Vuong, N.; et al. VIX and financial market stress in emerging markets. Int. Rev. Financ. Anal. 2022, 82, 102168.
34. Dickey, D.; Fuller, W. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
35. Kwiatkowski, D.; Phillips, P.; Schmidt, P.; Shin, Y. Testing the null hypothesis of stationarity against the alternative of a unit root. J. Econom. 1992, 54, 159–178.
36. Hurst, H. Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 1951, 116, 770–799.
37. Kercheval, A.; Zhang, Y. Modelling high-frequency limit order book dynamics with support vector machines. Quant. Financ. 2015, 15, 1315–1329.
38. O’Hara, M. Market Microstructure Theory; Blackwell: Cambridge, MA, 1998.
39. Sirignano, J. Deep learning for limit order books. Quant. Financ. 2019, 19, 549–570.
40. Merton, R.C. Option pricing when underlying stock returns are discontinuous. J. Financ. Econ. 1976, 3, 125–144.
41. Kloeden, P.; Pearson, R. The numerical solution of stochastic differential equations. ANZIAM J. 1977, 20, 8–12.
42. Lundberg, S.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4765–4774.
43. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization (AdamW). International Conference on Learning Representations, 2019.
44. Tsang, E.; Chen, J. Regime change detection using directional change indicators in the foreign exchange market to chart Brexit. IEEE Trans. Emerg. Top. Comput. Intell. 2018, 2, 185–193.
45. Steinley, D. Properties of the Hubert-Arabie adjusted Rand index. Psychol. Methods 2004, 9, 386–396.
46. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
47. Diebold, F.; Mariano, R. Comparing predictive accuracy. J. Bus. Econ. Stat. 1995, 13, 253–263.
48. Yang, D.; Zhang, Q. Drift-independent volatility estimation based on high, low, open, and close prices. J. Bus. 2000, 73, 477–491.





| Study | Method | CT | Stoch | EW | FAR | EM | JSE | Data type | Metric |
|---|---|---|---|---|---|---|---|---|---|
| Hamilton (1989) [13] | MS-AR | × | × | × | × | × | × | Macro series | Log-likelihood |
| Tsay (1989) [14] | TAR | × | × | × | × | × | × | Macro series | AIC/BIC |
| Graves (2012) [15] | LSTM | × | × | ∼ | × | × | × | Sequence | Accuracy |
| Dey & Salem (2017) [16] | GRU | × | × | ∼ | × | × | × | Text/returns | F1 |
| Vaswani et al. (2017) [17] | Transformer | × | × | ∼ | × | × | × | Sequence | Accuracy |
| Chen et al. (2018) [18] | Neural ODE | ✓ | × | × | × | × | × | Latent series | MSE |
| Tzen & Raginsky (2019) [4] | Neural SDE | ✓ | ✓ | × | × | × | × | Latent series | ELBO |
| Jia & Benson (2019) [19] | NJ-SDE | ✓ | ✓ | ∼ | × | × | × | Irregular TS | MSE |
| Kaur (2019) [9] | ANFIS | × | × | × | × | × | × | Returns | RMSE |
| Kidger et al. (2020) [20] | Neural CDE | ✓ | × | × | × | × | × | Irregular TS | Accuracy |
| Wang et al. (2020) [21] | Deep FCM | × | × | × | × | × | × | Returns | ARI |
| Ardia et al. (2019) [22] | MS-GARCH | × | ✓ | × | × | × | × | Returns | Log-lik. |
| Lea et al. (2016) [23] | TCN | × | × | ∼ | × | × | × | Sequence | Accuracy |
| Zhang et al. (2022) [24] | TFT-Finance | × | × | ∼ | × | ∼ | × | Returns | MAE/RMSE |
| This paper | S-NODE-ANF-RRC | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | JSE equities | FAR, cost, LL |
| Prior coverage | | 29% | 21% | 0% | 0% | 7% | 0% | | |
| This paper | | 100% | 100% | 100% | 100% | 100% | 100% | | |

Note: ✓ = yes; × = no; ∼ = partial (lead time reported but not optimised against FAR). FAR 0% prior coverage: no prior study optimises or reports crisis false alarm rate. LL: log-loss.
| Test | | | |
|---|---|---|---|
| ADF statistic | | | |
| KPSS statistic | 0.333 | 0.224 | 0.983 * |
| JB statistic | 322,037 *** | 75,916,564 *** | 22,408 *** |
| Skewness | | | |
| Excess kurtosis | 54.830 | 823.953 | 16.125 |
| Hurst H [95% CI] | 0.909 * [0.89, 0.93] | [0.50, 0.55] | 0.941 * [0.92, 0.96] |
| LB(10) | 16,922.3 *** | 36.6 *** | 19,469.6 *** |

***; **; *; †.
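
The diagnostics table reports Jarque-Bera (JB) statistics and excess kurtosis, and the stylised-facts table prescribes a log-transform before modelling. The sketch below shows how these moment diagnostics are computed and how a log-transform can collapse kurtosis on a positive, heavy-tailed series. The lognormal sample is a synthetic stand-in, not the JSE data, and `moment_diagnostics` is an illustrative helper name.

```python
import numpy as np


def moment_diagnostics(x):
    """Return (skewness, excess kurtosis, Jarque-Bera statistic) of a 1-D sample.

    Uses population (biased) central moments, the usual convention for JB:
    JB = n/6 * (S^2 + K^2 / 4), with S the skewness and K the excess kurtosis.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)
    skew = np.mean(d**3) / m2**1.5
    excess_kurt = np.mean(d**4) / m2**2 - 3.0
    jb = n / 6.0 * (skew**2 + excess_kurt**2 / 4.0)
    return skew, excess_kurt, jb


# ASSUMPTION: synthetic positive heavy-tailed series standing in for a volatility feature.
rng = np.random.default_rng(42)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

results = {name: moment_diagnostics(series)
           for name, series in [("raw", raw), ("log", np.log(raw))]}
for name, (skew, kurt, jb) in results.items():
    print(f"{name}: skew={skew:8.3f}  excess kurtosis={kurt:9.3f}  JB={jb:12.1f}")
```

Under normality JB is asymptotically chi-squared with two degrees of freedom, so values above about 5.99 reject at the 5% level. Statistics in the tens of thousands or more, as in the table, therefore indicate a decisive departure from Gaussianity.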
| Multiplier | Crisis days (%) | S-NODE FAR | N-ODE FAR |
|---|---|---|---|
| | 412 (15.3%) | 0.063 | 0.291 |
| (baseline) | 269 (10.0%) | 0.051 | 0.287 |
| | 118 (4.4%) | 0.044 | 0.283 |

Note: The S-NODE FAR advantage over N-ODE is preserved at all thresholds, confirming the robustness of the dual operational profile finding.
| Model | ARI | FAR | LT (d) |
|---|---|---|---|
| k-Means | 0.255 | 0.815 | 1.33 |
| GMM | 0.183 | 0.750 | 1.00 |
| HMM | 0.050 | 0.805 | 1.00 |
| S-NODE-ANF-RRC | 0.216 | 0.771 | 1.33 |

Note: ARI between labelling schemes: 0.057.
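
The ARI values above, including the 0.057 agreement between the two labelling schemes, follow the Hubert-Arabie adjusted Rand index. A minimal pure-Python sketch of that index is given below for readers who want to check agreement between two label sequences. It is a generic implementation, not the authors' evaluation code, and the example labels are made up.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Hubert-Arabie adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the partitions agree trivially

    # Pairs of items co-clustered in both labelings, in A only, and in B only.
    sum_ab = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # chance-level agreement
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are a single cluster
    return (sum_ab - expected) / (max_index - expected)


# Hypothetical regime labels: 0 = Normal, 1 = Stressed, 2 = Crisis.
scheme_1 = [0, 0, 0, 1, 1, 2, 2, 2]
scheme_2 = [0, 0, 1, 1, 1, 2, 2, 0]
print(f"ARI = {adjusted_rand_index(scheme_1, scheme_2):.3f}")
```

The index is 1 for identical partitions regardless of how the clusters are named and is close to 0 when agreement is no better than chance, which is how an ARI of 0.057 between labelling schemes should be read.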
| | GMM=Non-Crisis | GMM=Crisis (FP) |
|---|---|---|
| S-NODE=Non-Crisis | | |
| S-NODE=Crisis (FP) | | |

Note: FP_SN; FP_GMM; FN_SN; FN_GMM.
| Model | | | | |
|---|---|---|---|---|
| GMM | 0.397 | -0.009 | -0.024 | -0.022 |
| N-ODE-ANF-RRC | 0.428 | 0.049 | -0.023 | -0.031 |
| S-NODE-ANF-RRC | 0.489 | 0.122 | 0.041 | 0.022 |

Note: verified below at all levels; Proposition 2 condition met.
| Feature | Normal | Stressed | Crisis |
|---|---|---|---|
| | 0.1321 | 0.1352 | 0.1363 |
| | 0.0400 | 0.0422 | 0.0438 |
| | 0.0922 | 0.0930 | 0.0925 |

Note: Realised volatility dominates Crisis detection; standardised residuals contribute most to Stressed classification. These patterns are consistent with the regime threshold construction and economic theory.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).