Preprint
Article

This version is not peer-reviewed.

Mamba-LSTM-Attention (MLA): A Hybrid Architecture for Long-Term Time Series Forecasting with Cross-Scale Stability

Submitted: 22 January 2026

Posted: 22 January 2026


Abstract
Long-term time series forecasting (LTSF) requires a delicate balance between capturing global dependencies and preserving granular local dynamics. Although State Space Models (SSMs) and Transformers have both proven successful, a technical challenge remains: individual paradigms often fail to perform well over extended horizons because linear-time efficiency and high-fidelity local refinement are difficult to achieve simultaneously. This study introduces Mamba-LSTM-Attention (MLA), a novel hybrid architecture featuring a cascaded topology designed to bridge these dimensions. The core innovation is a hierarchical feature evolution mechanism: a Mamba module first serves as an efficient global encoder, capturing long-range periodic trends at linear complexity; gated LSTM units then refine the representation at the micro-scale, filtering noise and characterizing non-linear local fluctuations; finally, a multi-head attention mechanism performs dynamic feature re-weighting to focus on key historical signals. Systematic evaluations across four multivariate benchmark datasets demonstrate that MLA achieves exceptional cross-step forecasting stability. Most notably, on the ETTh1 dataset, MLA maintains a remarkably narrow Mean Squared Error (MSE) fluctuation range (0.127) as the forecasting horizon extends from T=96 to T=720. This empirical evidence confirms that the integrated Mamba module effectively mitigates the error accumulation typically encountered by vanilla LSTMs. While the current implementation faces an information bottleneck due to its single-point projection decoding strategy, ablation studies (revealing a 19.76% MSE surge upon LSTM removal) validate the proposed architectural combination. This work establishes a robust framework for hybrid SSM-RNN modeling and a clear path for future performance enhancements through sequence-to-sequence mechanisms.
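To make the cascaded topology described above concrete, the following minimal PyTorch sketch illustrates the three stages in order: global encoding, LSTM-based local refinement, attention re-weighting, and single-point projection decoding. It is an illustrative assumption, not the authors' implementation: the SimpleSSMBlock is only a linear-time stand-in for a true Mamba layer, and all module names (embed, local_refiner, reweight, head), layer sizes, and the projection head are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleSSMBlock(nn.Module):
    # Illustrative stand-in for the Mamba global encoder: a gated causal
    # depthwise convolution with linear cost in sequence length. The actual
    # model would use a selective state space (Mamba) layer here.
    def __init__(self, d_model):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=4, padding=3, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        u = self.conv(u.transpose(1, 2))[..., :x.size(1)].transpose(1, 2)
        return self.out_proj(F.silu(u) * torch.sigmoid(gate))


class MLA(nn.Module):
    # Cascade: global encoding -> gated LSTM local refinement ->
    # multi-head attention re-weighting -> single-point projection decoding.
    def __init__(self, n_vars, d_model=64, horizon=96, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_vars, d_model)
        self.global_encoder = SimpleSSMBlock(d_model)
        self.local_refiner = nn.LSTM(d_model, d_model, batch_first=True)
        self.reweight = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, n_vars * horizon)
        self.horizon, self.n_vars = horizon, n_vars

    def forward(self, x):                        # x: (batch, seq_len, n_vars)
        h = self.global_encoder(self.embed(x))   # long-range periodic trends
        h, _ = self.local_refiner(h)             # non-linear local fluctuations
        h, _ = self.reweight(h, h, h)            # focus on key historical signals
        last = h[:, -1]                          # the single-point bottleneck noted above
        return self.head(last).view(-1, self.horizon, self.n_vars)


# Example: forecast 720 steps of 7 variables from a 96-step input window.
model = MLA(n_vars=7, horizon=720)
y_hat = model(torch.randn(8, 96, 7))
print(y_hat.shape)                               # torch.Size([8, 720, 7])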
Keywords: 
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.