Submitted:
12 August 2024
Posted:
12 August 2024
Abstract
Keywords:
1. Introduction
2. Literature Review
3. Methodology
3.1. Dataset and Features
3.2. Feature Characterization
3.2.1. Simple Moving Average (SMA)
3.2.2. Exponential Moving Average (EMA)
3.2.3. Moving Average Convergence Divergence (MACD)
3.2.4. Lagged Momentum
3.2.5. Bollinger Bands
3.2.6. Relative Strength Index (RSI)
3.2.7. Stochastic K
3.2.8. Commodity Channel Index (CCI)
3.2.9. Average True Range (ATR)
3.2.10. Accumulation/Distribution (Acc/Dist)
3.2.11. Volume Weighted Average Price (VWAP)
3.2.12. Lagged Returns
3.3. Feature Engineering
3.3.1. Correlation matrix with Heatmap



Analysis of Correlation Matrix
- Nasdaq-100 Index (NDX): The correlation matrix shows strong correlation among the moving-average features MA_5d, MA_10d, MA_20d, MA_50d, EMA_5d, EMA_10d, EMA20d, and EMA_50d. Consequently, only one feature from this group is retained for model construction. Bollinger Bands and VWAP exhibit a similarly strong correlation.
- EUR/USD FX Rate: The features MA_5d, MA_10d, MA_20d, MA_50d, EMA_5d, EMA_10d, EMA20d, and EMA_50d are again highly correlated with one another, so only one of them is retained for model construction. The same strong correlation is seen between Bollinger Bands and VWAP.
- Amazon (AMZN) Stock: The features MA_5d, MA_10d, MA_20d, MA_50d, EMA_5d, EMA_10d, EMA20d, and EMA_50d are highly correlated here as well, so only one of them is retained for model construction. The strong correlation between Bollinger Bands and VWAP is also observed.
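The pruning rule described in the bullets above (keep one representative from each group of highly correlated features) can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study: the 0.9 threshold, the toy series, and the helper names `pearson` and `drop_correlated` are assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(features, threshold=0.9):
    """Greedily keep a feature only if its |r| with every kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy data (hypothetical): two moving-average-like series that track the
# same price path, plus an oscillator that does not.
close = [100, 101, 103, 102, 105, 107, 106, 109, 111, 110]
features = {
    "MA_5d": [c - 0.5 for c in close],
    "EMA_5d": [c * 0.99 for c in close],
    "RSI": [55, 40, 62, 48, 70, 35, 58, 44, 66, 51],
}

selected = drop_correlated(features)
print(selected)
```

Because the rule is greedy, which member of a correlated group survives depends on the order of the features; the first one encountered is kept.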
3.3.2. K-Means Clustering
Elbow Plot



Analysis of Elbow Plot
- Nasdaq-100 Index (NDX): The Elbow plot shows a clear inflection point at k = 3, so we set the number of K-Means clusters to 3.
- EUR/USD FX Rate: The Elbow plot shows a discernible bend in the curve at k = 4, so we set the number of K-Means clusters to 4.
- Amazon (AMZN) Stock: The Elbow plot shows a marked change in the slope of the curve at k = 4, so we set the number of K-Means clusters to 4.
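The elbow reading above relies on the within-cluster sum of squares (inertia) computed for a range of cluster counts. The sketch below shows how such a curve can be produced. It is a small pure-Python K-Means with a deterministic farthest-point initialisation, written for illustration only: the toy points and the function names are assumptions, and the study itself may have used a library implementation.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_inertia(points, k, iters=50):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(dist2(p, c) for c in centroids) for p in points)

# Toy data (hypothetical): three well-separated groups of four points each.
points = [
    (0, 0), (1, 0), (0, 1), (1, 1),
    (10, 10), (11, 10), (10, 11), (11, 11),
    (20, 0), (21, 0), (20, 1), (21, 1),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this toy data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the kind of bend the bullets above identify at k = 3 or k = 4 for the real series.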
3.3.3. Self-Organizing Maps (SOMs)



3.4. Modelling Approach
3.4.1. Long Short-Term Memory (LSTM)
- Input Weights: These weights are used to give importance to the input for the current time step.
- Output Weights: These weights are used to give importance to the output from the previous time step.
- Internal State: The internal state of the memory cell is utilized in computing the output for the current time step.
- Forget Gate: Determines which information should be discarded from the cell.
- Input Gate: Determines which values from the input should be used to update the memory state.
- Output Gate: Decides what to output based on the input and the memory of the cell.
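The components listed above correspond to the standard LSTM cell equations, which are given here for reference. The notation is an assumption of this sketch rather than the paper's own: $x_t$ is the input at time step $t$, $h_{t-1}$ the output of the previous step, $c_t$ the internal (cell) state, $W$ the input weights, $U$ the weights applied to the previous output, $b$ the biases, $\sigma$ the logistic sigmoid, and $\odot$ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(internal state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(output)}
\end{aligned}
```

The forget gate scales the previous internal state, the input gate scales the new candidate values, and the output gate determines how much of the updated state is exposed as the output for the current time step.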

3.4.2. LSTM Input
3.4.3. Data Splitting and Lookback Periods
3.4.4. LSTM Model Design
4. Model Evaluation and Results
4.1. Train Data Accuracy
| Model - Nasdaq-100 Index (NDX) | Loss | Accuracy |
|---|---|---|
| LSTM with 5-Day Lookback Period | 0.6885 | 0.5507 |
| LSTM with 10-Day Lookback Period | 0.6893 | 0.5518 |
| LSTM with 21-Day Lookback Period | 0.6879 | 0.5495 |
| LSTM with 50-Day Lookback Period | 0.6893 | 0.5525 |
| LSTM with 200-Day Lookback Period | 0.6907 | 0.5488 |
| Model - EUR/USD FX Rate | Loss | Accuracy |
|---|---|---|
| LSTM with 5-Day Lookback Period | 0.6937 | 0.5012 |
| LSTM with 10-Day Lookback Period | 0.6941 | 0.4950 |
| LSTM with 21-Day Lookback Period | 0.6925 | 0.5190 |
| LSTM with 50-Day Lookback Period | 0.6951 | 0.4973 |
| LSTM with 200-Day Lookback Period | 0.6939 | 0.4990 |
| Model - Amazon (AMZN) Stock | Loss | Accuracy |
|---|---|---|
| LSTM with 5-Day Lookback Period | 0.6919 | 0.5365 |
| LSTM with 10-Day Lookback Period | 0.6917 | 0.5353 |
| LSTM with 21-Day Lookback Period | 0.6924 | 0.5389 |
| LSTM with 50-Day Lookback Period | 0.6923 | 0.5288 |
| LSTM with 200-Day Lookback Period | 0.6899 | 0.5365 |
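A useful reference point for reading the Loss column is the loss of a model that outputs a probability of 0.5 for every sample. The sketch below assumes the reported loss is binary cross-entropy, which this section does not state explicitly; under that assumption the chance-level loss is ln 2 ≈ 0.6931. The labels in the example are hypothetical, while the loss values are copied from the three tables above.

```python
import math

def binary_cross_entropy(y_true, p_pred):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels; a constant 0.5 prediction gives ln 2 whatever the labels are.
labels = [1, 0, 1, 1, 0, 1]
chance_loss = binary_cross_entropy(labels, [0.5] * len(labels))

# Training losses as reported in the tables above (5, 10, 21, 50, 200-day lookbacks).
train_losses = {
    "NDX": [0.6885, 0.6893, 0.6879, 0.6893, 0.6907],
    "EUR/USD": [0.6937, 0.6941, 0.6925, 0.6951, 0.6939],
    "AMZN": [0.6919, 0.6917, 0.6924, 0.6923, 0.6899],
}
max_gap = max(
    abs(loss - chance_loss) for losses in train_losses.values() for loss in losses
)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"largest gap to a reported loss: {max_gap:.4f}")
```

If the assumption holds, every reported training loss lies within about 0.006 of the chance-level value, regardless of the look-back period.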
4.2. Test Data Accuracy
| Model - Nasdaq-100 Index (NDX) | Prediction Accuracy |
|---|---|
| LSTM with 5-Day Lookback Period | 0.5809 |
| LSTM with 10-Day Lookback Period | 0.5816 |
| LSTM with 21-Day Lookback Period | 0.5825 |
| LSTM with 50-Day Lookback Period | 0.5816 |
| LSTM with 200-Day Lookback Period | 0.5855 |
| Model - EUR/USD FX Rate | Prediction Accuracy |
|---|---|
| LSTM with 5-Day Lookback Period | 0.4939 |
| LSTM with 10-Day Lookback Period | 0.5067 |
| LSTM with 21-Day Lookback Period | 0.4938 |
| LSTM with 50-Day Lookback Period | 0.5079 |
| LSTM with 200-Day Lookback Period | 0.4934 |
| Model - Amazon (AMZN) Stock | Prediction Accuracy |
|---|---|
| LSTM with 5-Day Lookback Period | 0.5310 |
| LSTM with 10-Day Lookback Period | 0.5323 |
| LSTM with 21-Day Lookback Period | 0.5318 |
| LSTM with 50-Day Lookback Period | 0.5315 |
| LSTM with 200-Day Lookback Period | 0.5302 |
5. Visualization
6. Limitations
7. Final Conclusion
- Nasdaq-100 Index (NDX): The performance metrics included in the code suggest that the chosen features are not effective predictors of the short-term (1-day) directional movement of the Nasdaq-100 index. The area under the curve is less than or equal to 50% for each class, indicating that the model cannot differentiate between them. Furthermore, the model never predicts class ‘0’: the confusion matrix contains no class ‘0’ predictions, so, taking class ‘1’ as the positive class, both the true negative and false negative counts are zero. The highest training accuracy (55.25%) was observed for the 50-day look-back period and the highest test accuracy (58.55%) for the 200-day look-back period.
- EUR/USD FX Rate: The model showed some ability to predict both classes with a 200-day look-back period. Nevertheless, for most look-back periods the area under the curve is less than or equal to 50% for each class, implying that the model cannot differentiate between them. The chosen features are therefore inadequate predictors of the short-term (1-day) directional movement of the EUR/USD FX rate. The highest training accuracy (51.90%) was observed for the 21-day look-back period, whereas the highest test accuracy (50.79%) was observed for the 50-day look-back period.
- Amazon (AMZN) Stock: The outcomes for AMZN are similar to those for NDX. The performance metrics included in the code show that the selected features are insufficient predictors of the short-term (1-day) directional movement of the AMZN stock. The area under the curve is less than or equal to 50% for each class, indicating that the model cannot distinguish between them. Furthermore, the model never predicts class ‘0’: the confusion matrix contains no class ‘0’ predictions, so, taking class ‘1’ as the positive class, both the true negative and false negative counts are zero. The highest training accuracy (53.89%) was observed for the 21-day look-back period and the highest test accuracy (53.23%) for the 10-day look-back period.
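The pattern described for NDX and AMZN, a model that only ever predicts one class, can be reproduced on toy data to show why accuracy above 50% can coexist with an uninformative area under the curve. The sketch below is illustrative only: the labels and scores are hypothetical, class ‘1’ is taken as the positive class, and the helper names are assumptions of this example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) with class 1 as the positive class."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels: 7 "up" days (class 1) and 5 "down" days (class 0).
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
# A model that emits the same score for every day, hence always predicts class 1.
scores = [0.55] * len(y_true)
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
auc = roc_auc(y_true, scores)

print("tn, fp, fn, tp =", tn, fp, fn, tp)
print("accuracy =", round(accuracy, 4))
print("auc =", auc)
```

In this toy case the accuracy equals the share of class ‘1’ days in the sample, while the AUC stays at 0.5, so the accuracy figure alone says nothing about predictive skill.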
8. Recommendation for Future Work
Disclaimer
Appendix
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).