Submitted: 20 May 2025
Posted: 20 May 2025
Abstract
Keywords:
1. Introduction
- We propose Bi-TimeSSM, a novel time-series forecasting model that utilizes bidirectional learning to capture both forward and backward temporal dependencies. Low-rank decomposition is applied to dynamically assess channel correlations, enabling flexible processing, while our parallel Mamba framework supports concurrent channel-independent operations and mixing, optimizing feature extraction and computational efficiency.
- The model captures complementary temporal dependencies through bidirectional learning and adapts to channel relationships via low-rank decomposition, which improves forecasting accuracy and efficiency. Hybrid channel processing (channel-independent operations combined with channel mixing) adds flexibility, reducing the risk of overfitting and improving performance on complex, high-dimensional data.
- Extensive experiments on benchmark datasets, including ETTh1, ETTh2, Weather, and Electricity, demonstrate that Bi-TimeSSM surpasses state-of-the-art models in both accuracy and scalability, highlighting its robustness across diverse time-series tasks.
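The bidirectional idea in the first contribution can be illustrated with a toy sketch. This is not the Bi-TimeSSM implementation: a fixed scalar linear recurrence stands in for the Mamba block, and the names `linear_scan` and `bidirectional_scan`, the coefficients `a` and `b`, and the choice to combine the two directions by summation are all assumptions made for illustration.

```python
def linear_scan(xs, a=0.5, b=0.5):
    """Toy scalar state-space recurrence h_t = a * h_{t-1} + b * x_t.

    Assumption: a fixed, non-selective recurrence used only as a stand-in
    for a Mamba block, whose parameters would be input-dependent.
    """
    h = 0.0
    out = []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out


def bidirectional_scan(xs, a=0.5, b=0.5):
    """Run the scan forward and on the reversed sequence, then combine.

    The backward output is flipped back so that position t in both
    outputs refers to the same time step before the two are summed.
    """
    forward = linear_scan(xs, a, b)
    backward = linear_scan(xs[::-1], a, b)[::-1]
    return [f + g for f, g in zip(forward, backward)]


# An impulse at the first step: the forward pass spreads it to later
# steps, while the backward pass only sees it at the first position.
impulse_at_start = bidirectional_scan([1.0, 0.0, 0.0])
impulse_at_end = bidirectional_scan([0.0, 0.0, 1.0])
print(impulse_at_start)
print(impulse_at_end)
```

With a forward-only scan, an impulse at the last step would influence no earlier position; the backward pass is what lets earlier positions see it.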
2. Materials and Methods
2.1. Problem Statement
2.2. Architecture of Bi-TimeSSM
2.2.1. Normalization
2.2.2. Low-Rank Decomposition (LRD)
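As a rough illustration of why a low-rank factorization is attractive for channel mixing, the sketch below applies a channel-mixing matrix in factored form, W = U Vᵀ, without ever building the full matrix. The factor names `u` and `v`, the helper names, and the rank value 8 are hypothetical and are not taken from the Bi-TimeSSM architecture; only the channel count 321 comes from the Electricity dataset description.

```python
def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose a list-of-rows matrix."""
    return [list(col) for col in zip(*matrix)]


def low_rank_mix(u, v, x):
    """Apply W = U V^T to x in two steps.

    u and v both have shape (channels, rank). The first product compresses
    the channels into `rank` values, the second expands them back.
    """
    compressed = matvec(transpose(v), x)  # length: rank
    return matvec(u, compressed)          # length: channels


def param_counts(channels, rank):
    """Return (dense, factored) parameter counts for a channel-mixing map."""
    return channels * channels, 2 * channels * rank


# Rank-1 example with three channels.
u = [[1.0], [2.0], [3.0]]
v = [[1.0], [0.0], [1.0]]
mixed = low_rank_mix(u, v, [1.0, 5.0, 2.0])
print(mixed)

# 321 channels as in the Electricity dataset; rank 8 is an arbitrary choice.
dense, factored = param_counts(321, 8)
print(dense, factored)
```

The factored form needs 2·C·r parameters instead of C², which is the usual motivation for low-rank structure when the number of channels C is large.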
2.2.3. Embedded Layer
2.2.4. Mamba
2.2.5. Encoder
2.2.6. Mapping
2.3. Benchmarks
- ETTh1 and ETTh2: These datasets consist of electricity transformer load time series, representing distinct regional load patterns and serving as benchmarks for short-term load forecasting.
- ETTm1 and ETTm2: Similar to the ETTh datasets, but sampled every 15 minutes rather than hourly, offering higher temporal resolution for detailed analysis.
- Weather: A dataset comprising multiple meteorological variables, utilized for forecasting future weather parameters.
- Traffic: This dataset includes traffic flow time series collected from sensor devices and is employed for traffic flow prediction.
- Electricity: A dataset documenting electricity consumption time series for multiple users, used for power consumption forecasting.
2.4. Experiment Setup
2.5. Evaluation Metrics
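The two metrics listed in the abbreviations, MSE and MAE, can be sketched as follows. This assumes the standard definitions averaged over all predicted values; the function names and the flat-list input format are illustrative choices, not the authors' evaluation code.

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Four forecast steps with errors of 1, -1, 2 and -2.
targets = [0.0, 0.0, 0.0, 0.0]
forecasts = [1.0, -1.0, 2.0, -2.0]
print(mse(targets, forecasts), mae(targets, forecasts))
```

MSE penalizes large errors more heavily than MAE, which is why the two metrics can rank models differently in the result tables.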
3. Results
3.1. Comparison with State-of-the-Art Methods
3.2. Baseline Comparison
3.3. Ablation Study
3.3.1. Impact of Encoder
3.3.2. Impact of Removing Parallel Mamba Modules
3.3.3. Hyperparameter Tuning
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LTSF | Long-term time-series forecasting |
| LSTM | Long Short-Term Memory |
| LRD | Low-Rank Decomposition |
| MSE | Mean Squared Error |
| MAE | Mean Absolute Error |
Appendix A
| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTh1 | 96 | MSE | 0.363 | 0.378 | 0.398 | 0.386 | 0.393 | 0.423 | 0.449 | 0.387 | 0.384 | 0.382 |
| ETTh1 | 96 | MAE | 0.388 | 0.395 | 0.402 | 0.405 | 0.408 | 0.448 | 0.459 | 0.406 | 0.402 | 0.398 |
| ETTh1 | 192 | MSE | 0.425 | 0.427 | 0.435 | 0.441 | 0.445 | 0.471 | 0.500 | 0.439 | 0.436 | 0.427 |
| ETTh1 | 192 | MAE | 0.419 | 0.428 | 0.440 | 0.436 | 0.434 | 0.474 | 0.482 | 0.435 | 0.429 | 0.425 |
| ETTh1 | 336 | MSE | 0.428 | 0.471 | 0.450 | 0.487 | 0.474 | 0.570 | 0.521 | 0.493 | 0.491 | 0.486 |
| ETTh1 | 336 | MAE | 0.423 | 0.445 | 0.448 | 0.458 | 0.451 | 0.546 | 0.496 | 0.457 | 0.469 | 0.487 |
| ETTh1 | 720 | MSE | 0.458 | 0.470 | 0.480 | 0.503 | 0.480 | 0.653 | 0.514 | 0.490 | 0.521 | 0.482 |
| ETTh1 | 720 | MAE | 0.452 | 0.457 | 0.465 | 0.491 | 0.471 | 0.621 | 0.512 | 0.478 | 0.500 | 0.477 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTh2 | 96 | MSE | 0.279 | 0.291 | 0.230 | 0.297 | 0.302 | 0.745 | 0.346 | 0.305 | 0.340 | 0.302 |
| ETTh2 | 96 | MAE | 0.334 | 0.342 | 0.349 | 0.349 | 0.348 | 0.584 | 0.388 | 0.352 | 0.374 | 0.349 |
| ETTh2 | 192 | MSE | 0.350 | 0.368 | 0.371 | 0.380 | 0.388 | 0.877 | 0.456 | 0.424 | 0.402 | 0.382 |
| ETTh2 | 192 | MAE | 0.381 | 0.392 | 0.400 | 0.400 | 0.400 | 0.656 | 0.452 | 0.439 | 0.414 | 0.400 |
| ETTh2 | 336 | MSE | 0.345 | 0.407 | 0.402 | 0.428 | 0.426 | 1.043 | 0.482 | 0.456 | 0.452 | 0.421 |
| ETTh2 | 336 | MAE | 0.383 | 0.424 | 0.449 | 0.432 | 0.433 | 0.731 | 0.486 | 0.473 | 0.452 | 0.439 |
| ETTh2 | 720 | MSE | 0.420 | 0.421 | 0.425 | 0.428 | 0.431 | 1.104 | 0.515 | 0.476 | 0.462 | 0.437 |
| ETTh2 | 720 | MAE | 0.439 | 0.439 | 0.438 | 0.432 | 0.446 | 0.763 | 0.511 | 0.493 | 0.468 | 0.458 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTm1 | 96 | MSE | 0.316 | 0.320 | 0.312 | 0.334 | 0.329 | 0.404 | 0.505 | 0.353 | 0.338 | 0.340 |
| ETTm1 | 96 | MAE | 0.354 | 0.360 | 0.371 | 0.368 | 0.367 | 0.426 | 0.475 | 0.374 | 0.375 | 0.374 |
| ETTm1 | 192 | MSE | 0.347 | 0.361 | 0.365 | 0.377 | 0.367 | 0.450 | 0.553 | 0.389 | 0.374 | 0.377 |
| ETTm1 | 192 | MAE | 0.378 | 0.383 | 0.409 | 0.391 | 0.385 | 0.451 | 0.496 | 0.391 | 0.387 | 0.390 |
| ETTm1 | 336 | MSE | 0.389 | 0.386 | 0.421 | 0.426 | 0.399 | 0.532 | 0.621 | 0.421 | 0.410 | 0.401 |
| ETTm1 | 336 | MAE | 0.401 | 0.402 | 0.410 | 0.420 | 0.419 | 0.515 | 0.537 | 0.413 | 0.411 | 0.407 |
| ETTm1 | 720 | MSE | 0.449 | 0.445 | 0.496 | 0.491 | 0.454 | 0.666 | 0.671 | 0.484 | 0.478 | 0.453 |
| ETTm1 | 720 | MAE | 0.436 | 0.437 | 0.437 | 0.459 | 0.439 | 0.589 | 0.561 | 0.448 | 0.450 | 0.442 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTm2 | 96 | MSE | 0.174 | 0.176 | 0.185 | 0.180 | 0.175 | 0.287 | 0.255 | 0.182 | 0.187 | 0.177 |
| ETTm2 | 96 | MAE | 0.255 | 0.263 | 0.290 | 0.264 | 0.259 | 0.366 | 0.339 | 0.264 | 0.267 | 0.261 |
| ETTm2 | 192 | MSE | 0.244 | 0.242 | 0.292 | 0.250 | 0.241 | 0.414 | 0.281 | 0.257 | 0.249 | 0.240 |
| ETTm2 | 192 | MAE | 0.301 | 0.304 | 0.309 | 0.309 | 0.302 | 0.492 | 0.340 | 0.315 | 0.309 | 0.298 |
| ETTm2 | 336 | MSE | 0.301 | 0.304 | 0.321 | 0.311 | 0.305 | 0.597 | 0.339 | 0.318 | 0.321 | 0.305 |
| ETTm2 | 336 | MAE | 0.339 | 0.344 | 0.367 | 0.348 | 0.343 | 0.542 | 0.372 | 0.353 | 0.351 | 0.345 |
| ETTm2 | 720 | MSE | 0.384 | 0.402 | 0.401 | 0.412 | 0.402 | 1.730 | 0.433 | 0.426 | 0.408 | 0.403 |
| ETTm2 | 720 | MAE | 0.389 | 0.402 | 0.400 | 0.407 | 0.400 | 1.042 | 0.432 | 0.419 | 0.403 | 0.400 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weather | 96 | MSE | 0.162 | 0.159 | 0.174 | 0.174 | 0.178 | 0.158 | 0.266 | 0.196 | 0.172 | 0.186 |
| Weather | 96 | MAE | 0.208 | 0.205 | 0.218 | 0.214 | 0.219 | 0.230 | 0.336 | 0.235 | 0.220 | 0.237 |
| Weather | 192 | MSE | 0.205 | 0.205 | 0.200 | 0.221 | 0.224 | 0.206 | 0.307 | 0.241 | 0.219 | 0.233 |
| Weather | 192 | MAE | 0.248 | 0.249 | 0.258 | 0.278 | 0.259 | 0.277 | 0.367 | 0.271 | 0.261 | 0.273 |
| Weather | 336 | MSE | 0.266 | 0.264 | 0.280 | 0.254 | 0.292 | 0.272 | 0.359 | 0.292 | 0.280 | 0.289 |
| Weather | 336 | MAE | 0.289 | 0.291 | 0.299 | 0.298 | 0.306 | 0.335 | 0.395 | 0.306 | 0.306 | 0.312 |
| Weather | 720 | MSE | 0.343 | 0.343 | 0.352 | 0.358 | 0.354 | 0.398 | 0.419 | 0.363 | 0.365 | 0.356 |
| Weather | 720 | MAE | 0.342 | 0.344 | 0.359 | 0.349 | 0.348 | 0.418 | 0.428 | 0.353 | 0.359 | 0.352 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Traffic | 96 | MSE | 0.401 | 0.375 | 0.398 | 0.395 | 0.457 | 0.522 | 0.613 | 0.647 | 0.593 | 0.676 |
| Traffic | 96 | MAE | 0.271 | 0.258 | 0.274 | 0.268 | 0.295 | 0.290 | 0.388 | 0.384 | 0.321 | 0.407 |
| Traffic | 192 | MSE | 0.418 | 0.394 | 0.393 | 0.417 | 0.471 | 0.530 | 0.616 | 0.596 | 0.617 | 0.631 |
| Traffic | 192 | MAE | 0.276 | 0.269 | 0.282 | 0.276 | 0.299 | 0.293 | 0.382 | 0.359 | 0.336 | 0.386 |
| Traffic | 336 | MSE | 0.432 | 0.406 | 0.443 | 0.433 | 0.482 | 0.558 | 0.616 | 0.601 | 0.629 | 0.640 |
| Traffic | 336 | MAE | 0.282 | 0.274 | 0.368 | 0.283 | 0.304 | 0.305 | 0.382 | 0.361 | 0.336 | 0.387 |
| Traffic | 720 | MSE | 0.464 | 0.440 | 0.470 | 0.467 | 0.514 | 0.589 | 0.622 | 0.642 | 0.640 | 0.681 |
| Traffic | 720 | MAE | 0.299 | 0.288 | 0.309 | 0.302 | 0.322 | 0.328 | 0.337 | 0.381 | 0.350 | 0.402 |

| Dataset | Window | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Electricity | 96 | MSE | 0.140 | 0.140 | 0.156 | 0.148 | 0.174 | 0.219 | 0.201 | 0.206 | 0.168 | 0.217 |
| Electricity | 96 | MAE | 0.233 | 0.238 | 0.240 | 0.240 | 0.259 | 0.314 | 0.317 | 0.288 | 0.272 | 0.304 |
| Electricity | 192 | MSE | 0.157 | 0.155 | 0.161 | 0.162 | 0.178 | 0.231 | 0.222 | 0.206 | 0.184 | 0.216 |
| Electricity | 192 | MAE | 0.252 | 0.253 | 0.268 | 0.253 | 0.265 | 0.322 | 0.334 | 0.290 | 0.289 | 0.306 |
| Electricity | 336 | MSE | 0.184 | 0.170 | 0.195 | 0.178 | 0.196 | 0.246 | 0.231 | 0.220 | 0.198 | 0.232 |
| Electricity | 336 | MAE | 0.266 | 0.269 | 0.272 | 0.269 | 0.282 | 0.337 | 0.338 | 0.305 | 0.300 | 0.321 |
| Electricity | 720 | MSE | 0.220 | 0.197 | 0.231 | 0.225 | 0.237 | 0.280 | 0.254 | 0.252 | 0.220 | 0.273 |
| Electricity | 720 | MAE | 0.284 | 0.293 | 0.307 | 0.317 | 0.316 | 0.363 | 0.361 | 0.337 | 0.320 | 0.352 |


| Dataset | Variables | Granularity | Samples |
|---|---|---|---|
| ETTh1 | 7 | 1 hour | 17,420 |
| ETTh2 | 7 | 1 hour | 17,420 |
| ETTm1 | 7 | 15 min | 69,680 |
| ETTm2 | 7 | 15 min | 69,680 |
| Weather | 21 | 10 min | 52,696 |
| Electricity | 321 | 1 hour | 26,304 |
| Traffic | 862 | 1 hour | 17,544 |

| Dataset | Metric | Bi-TimeSSM | Bi-Mamba+ | TimeMachine | iTransformer | PatchTST | Crossformer | Autoformer | DLinear | TimesNet | CrossGNN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTh1 | MSE | 0.419 | 0.437 | 0.441 | 0.454 | 0.448 | 0.529 | 0.496 | 0.452 | 0.458 | 0.444 |
| ETTh1 | MAE | 0.421 | 0.431 | 0.439 | 0.448 | 0.441 | 0.522 | 0.487 | 0.444 | 0.450 | 0.447 |
| ETTh2 | MSE | 0.349 | 0.372 | 0.357 | 0.383 | 0.387 | 0.942 | 0.450 | 0.415 | 0.414 | 0.386 |
| ETTh2 | MAE | 0.384 | 0.399 | 0.409 | 0.403 | 0.407 | 0.684 | 0.459 | 0.439 | 0.427 | 0.412 |
| ETTm1 | MSE | 0.375 | 0.378 | 0.399 | 0.407 | 0.387 | 0.513 | 0.588 | 0.412 | 0.400 | 0.393 |
| ETTm1 | MAE | 0.392 | 0.396 | 0.407 | 0.410 | 0.402 | 0.495 | 0.517 | 0.407 | 0.406 | 0.403 |
| ETTm2 | MSE | 0.276 | 0.281 | 0.300 | 0.288 | 0.281 | 0.757 | 0.327 | 0.296 | 0.291 | 0.281 |
| ETTm2 | MAE | 0.321 | 0.328 | 0.342 | 0.332 | 0.326 | 0.611 | 0.371 | 0.338 | 0.333 | 0.326 |
| Weather | MSE | 0.244 | 0.243 | 0.252 | 0.252 | 0.262 | 0.259 | 0.338 | 0.273 | 0.259 | 0.266 |
| Weather | MAE | 0.272 | 0.272 | 0.284 | 0.285 | 0.283 | 0.315 | 0.382 | 0.291 | 0.287 | 0.294 |
| Traffic | MSE | 0.429 | 0.404 | 0.426 | 0.428 | 0.481 | 0.550 | 0.617 | 0.622 | 0.620 | 0.657 |
| Traffic | MAE | 0.282 | 0.272 | 0.308 | 0.282 | 0.305 | 0.304 | 0.372 | 0.371 | 0.336 | 0.396 |
| Electricity | MSE | 0.175 | 0.166 | 0.186 | 0.178 | 0.196 | 0.244 | 0.227 | 0.221 | 0.193 | 0.235 |
| Electricity | MAE | 0.259 | 0.263 | 0.272 | 0.270 | 0.281 | 0.334 | 0.338 | 0.305 | 0.295 | 0.321 |
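
The per-dataset values in the table above are consistent with a plain mean over the four prediction windows (96, 192, 336, 720) reported in Appendix A. The short check below reproduces two Bi-TimeSSM entries under that assumption; the dictionary layout and the helper name `average_over_windows` are illustrative only.

```python
# Per-window MSE of Bi-TimeSSM, copied from the Appendix A tables.
per_window_mse = {
    "ETTm1": {96: 0.316, 192: 0.347, 336: 0.389, 720: 0.449},
    "Weather": {96: 0.162, 192: 0.205, 336: 0.266, 720: 0.343},
}


def average_over_windows(scores):
    """Mean of a metric over all prediction windows (assumed aggregation)."""
    return sum(scores.values()) / len(scores)


averages = {
    name: average_over_windows(scores)
    for name, scores in per_window_mse.items()
}
for name, value in averages.items():
    print(f"{name}: {value:.3f}")
```

The results agree with the averaged MSE values listed for ETTm1 (0.375) and Weather (0.244) to three decimals.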

| Dataset | Full Model (Bi-TimeSSM) MSE | Full Model (Bi-TimeSSM) MAE | Without Encoder MSE | Without Encoder MAE | Relative Change MSE | Relative Change MAE |
|---|---|---|---|---|---|---|
| ETTh1 | 0.419 | 0.421 | 0.420 | 0.422 | +0.2% | +0.2% |
| ETTh2 | 0.349 | 0.384 | 0.350 | 0.386 | +0.2% | +0.5% |
| ETTm1 | 0.375 | 0.392 | 0.376 | 0.394 | +0.2% | +0.5% |
| ETTm2 | 0.276 | 0.321 | 0.279 | 0.322 | +1.0% | +0.3% |

| Dataset | Full Model (Bi-TimeSSM) MSE | Full Model (Bi-TimeSSM) MAE | Without Mamba MSE | Without Mamba MAE | Relative Change MSE | Relative Change MAE |
|---|---|---|---|---|---|---|
| ETTh1 | 0.419 | 0.421 | 0.432 | 0.427 | +3.1% | +1.4% |
| ETTh2 | 0.349 | 0.384 | 0.351 | 0.386 | +0.5% | +0.5% |
| ETTm1 | 0.375 | 0.392 | 0.405 | 0.401 | +8.0% | +2.2% |
| ETTm2 | 0.276 | 0.321 | 0.287 | 0.329 | +3.9% | +2.4% |
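
The "Relative Change" columns in the ablation tables appear to follow the usual definition, (ablated − full) / full, expressed as a percentage. The sketch below recomputes two MSE entries from the "Without Mamba" table under that assumption; the function name `relative_change` is a hypothetical helper.

```python
def relative_change(full, ablated):
    """Percentage change of the ablated score relative to the full model.

    Positive values mean the error increased after removing the component.
    """
    return (ablated - full) / full * 100.0


# (full-model MSE, ablated MSE) pairs from the "Without Mamba" table.
mse_pairs = {
    "ETTh1": (0.419, 0.432),
    "ETTm1": (0.375, 0.405),
}

changes = {
    name: relative_change(full, ablated)
    for name, (full, ablated) in mse_pairs.items()
}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Both values match the reported +3.1% and +8.0% after rounding to one decimal.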
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).