Submitted: 29 August 2024
Posted: 29 August 2024
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Traditional Methods
2.2. Deep Learning Approaches
2.3. Limitations of LSTM
3. Methodology
3.1. xLSTM
- Enhanced Memory Cells: xLSTM modifies the memory cell so that it operates with more flexibility and efficiency. It comprises two LSTM variants, sLSTM and mLSTM. The sLSTM variant uses scalar memory updates, which simplify the memory operations and allow faster, more efficient computation. The mLSTM variant instead employs a matrix memory that can capture more complex dependencies across longer sequences. The matrix memory also supports parallel computation and improves the model's ability to manage and retain information across extended periods, which is particularly beneficial in scenarios such as long-term stock price forecasting.
- Exponential Gating: Gating is central to the operation of LSTM networks, controlling the flow of information through input, forget, and output gates. In xLSTM, these gates are modified to include exponential gating and advanced normalization techniques, as shown in Equation (1): $$g_t = \exp\left(W x_t + U h_{t-1} + b\right) \tag{1}$$ where $g_t$ represents the input or forget gate at time $t$, $x_t$ is the current input, $h_{t-1}$ is the previous hidden state, $W$ and $U$ are weight matrices, and $b$ is the bias. The exponential function $\exp$ lets gate values adjust more flexibly and dynamically, which supports a more stable gradient flow during training. Exponential gating helps mitigate the vanishing gradient problem by dynamically adjusting how much past information is retained or forgotten. As a result, relevant information is preserved over long sequences while less important data is filtered out, making the model more efficient at learning and predicting complex patterns. This is crucial in financial applications, where retaining key market trends while discarding noise can significantly enhance prediction accuracy.
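The scalar memory update and exponential gating described in the two points above can be illustrated with a minimal sketch of one sLSTM-style step. It is not the authors' implementation. It assumes the stabilised formulation from the original xLSTM paper: a normaliser state `n` and a log-space stabiliser `m`, so the exponentials never overflow. The function name `slstm_step`, the `(w, r, b)` parameter layout, the `-inf` initial stabiliser, and all numeric values are hypothetical choices made for illustration.

```python
import math


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def slstm_step(x, state, p):
    """One scalar sLSTM-style step with stabilised exponential gates.

    state = (h, c, n, m): hidden state, cell state, normaliser, stabiliser.
    p maps each gate name ("i", "f", "z", "o") to (w, r, b): input weight,
    recurrent weight and bias. These names are illustrative assumptions.
    """
    h_prev, c_prev, n_prev, m_prev = state
    pre = {g: w * x + r * h_prev + b for g, (w, r, b) in p.items()}

    # The gates exp(pre["i"]) and exp(pre["f"]) are evaluated relative to a
    # running maximum m, so every exponent below is <= 0 and cannot overflow.
    m = max(pre["f"] + m_prev, pre["i"])
    i_gate = math.exp(pre["i"] - m)
    f_gate = math.exp(pre["f"] + m_prev - m)

    z = math.tanh(pre["z"])   # candidate cell input
    o = sigmoid(pre["o"])     # output gate stays a sigmoid

    c = f_gate * c_prev + i_gate * z   # scalar memory update
    n = f_gate * n_prev + i_gate       # normaliser accumulates gate mass
    h = o * c / n                      # normalised, gated hidden state
    return h, c, n, m


# Hypothetical parameters; m starts at -inf (an assumption of this sketch)
# so that the first step relies only on the input gate.
params = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.3, 0.1, 1.0),
    "z": (1.0, 0.2, 0.0),
    "o": (0.4, 0.1, 0.0),
}
state = (0.0, 0.0, 0.0, float("-inf"))
for x in [0.2, -0.1, 2000.0, 0.05]:
    state = slstm_step(x, state, params)
    print(f"x={x:>8}: h={state[0]:.4f}")
```

The input `2000.0` gives an input-gate pre-activation near 1000. A direct `math.exp` call on that value would raise `OverflowError`, whereas the stabilised step stays finite. Because `c / n` is a weighted average of `tanh` values, the hidden state also remains bounded in magnitude by 1.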
3.2. Dataset
4. Experiments
4.1. Experiment Setup
4.2. Evaluation Metrics
5. Results
5.1. Stock-Specific Performance
5.2. Model Performance across Prediction Horizons
5.3. Performance Gap across Time Horizons
5.4. Practical Implications
6. Conclusions


| STOCK | MODEL | RMSE↓ (T = 1) | R² (T = 1) | RMSE↓ (T = 3) | R² (T = 3) | RMSE↓ (T = 5) | R² (T = 5) | RMSE↓ (T = 10) | R² (T = 10) | RMSE↓ (T = 15) | R² (T = 15) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AAPL | LSTM | 4.341 | 0.887 | 6.513 | 0.768 | 7.839 | 0.665 | 10.416 | 0.464 | 12.433 | 0.261 |
| AAPL | xLSTM | 3.020 | 0.941 | 5.155 | 0.831 | 6.347 | 0.875 | 8.750 | 0.742 | 10.500 | 0.376 |
| JNJ | LSTM | 2.381 | 0.941 | 3.613 | 0.882 | 4.340 | 0.834 | 5.429 | 0.712 | 6.413 | 0.601 |
| JNJ | xLSTM | 1.716 | 0.972 | 2.915 | 0.917 | 3.605 | 0.875 | 4.670 | 0.568 | 5.616 | 0.659 |
| NKE | LSTM | 3.516 | 0.926 | 5.364 | 0.835 | 6.614 | 0.763 | 9.094 | 0.515 | 11.128 | 0.320 |
| NKE | xLSTM | 2.522 | 0.962 | 4.334 | 0.886 | 5.474 | 0.818 | 7.817 | 0.625 | 9.690 | 0.419 |
| AVG | LSTM | 3.413 | 0.918 | 5.163 | 0.828 | 6.264 | 0.754 | 8.313 | 0.564 | 9.991 | 0.394 |
| AVG | xLSTM | 2.419 | 0.958 | 4.135 | 0.878 | 5.142 | 0.812 | 7.079 | 0.652 | 8.602 | 0.484 |

| STOCK | T = 1 | T = 3 | T = 5 | T = 10 | T = 15 |
|---|---|---|---|---|---|
| AAPL | 1.321 | 1.358 | 1.492 | 1.666 | 1.933 |
| JNJ | 0.665 | 0.698 | 0.735 | 0.759 | 0.797 |
| NKE | 0.994 | 1.030 | 1.140 | 1.277 | 1.438 |
| AVG | 0.993 | 1.029 | 1.122 | 1.234 | 1.389 |
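The second table appears to report, for each stock and horizon, the RMSE of LSTM minus the RMSE of xLSTM. This reading is inferred from the numbers rather than stated in the surviving text. The sketch below recomputes that gap from the RMSE values in the first table. It also includes the textbook definitions of RMSE and R², on the assumption that the paper uses them. The helper names `rmse`, `r2`, and `rmse_gap` are hypothetical.

```python
import math

HORIZONS = (1, 3, 5, 10, 15)  # prediction horizons T from the tables

# RMSE values transcribed from the results table, ordered as HORIZONS.
RMSE = {
    "AAPL": {"LSTM": (4.341, 6.513, 7.839, 10.416, 12.433),
             "xLSTM": (3.020, 5.155, 6.347, 8.750, 10.500)},
    "JNJ": {"LSTM": (2.381, 3.613, 4.340, 5.429, 6.413),
            "xLSTM": (1.716, 2.915, 3.605, 4.670, 5.616)},
    "NKE": {"LSTM": (3.516, 5.364, 6.614, 9.094, 11.128),
            "xLSTM": (2.522, 4.334, 5.474, 7.817, 9.690)},
}


def rmse(y_true, y_pred):
    """Root mean squared error (standard definition, assumed here)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed here)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse_gap(stock):
    """RMSE(LSTM) - RMSE(xLSTM) per horizon, rounded to three decimals."""
    pair = RMSE[stock]
    return [round(a - b, 3) for a, b in zip(pair["LSTM"], pair["xLSTM"])]


gaps = {stock: rmse_gap(stock) for stock in RMSE}

# Average of the per-stock gaps at each horizon.
avg_gap = [round(sum(g[k] for g in gaps.values()) / len(gaps), 3)
           for k in range(len(HORIZONS))]

for stock, gap in gaps.items():
    print(stock, gap)
print("AVG", avg_gap)
```

Under this reading, the recomputed per-stock gaps and their averages match the second table at every horizon. The gap widens as the horizon grows for all three stocks, for example from 1.321 at T = 1 to 1.933 at T = 15 for AAPL.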
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).