Preprint
Article

This version is not peer-reviewed.

Predicting Financial Enterprise Stocks and Economic Data Trends Using Machine Learning Time Series Analysis

A peer-reviewed version of this preprint was published in:
Applied and Computational Engineering 2024, 87(1), 26-32. https://doi.org/10.54254/2755-2721/87/20241562

Submitted: 10 July 2024
Posted: 11 July 2024

Abstract
This paper explores the application of machine learning in financial time series analysis, focusing on predicting trends in financial enterprise stocks and economic data. It begins by distinguishing stocks from bonds and outlining risk management strategies in the stock market. Traditional statistical methods such as ARIMA and exponential smoothing are discussed in terms of their advantages and limitations in economic forecasting. The effectiveness of machine learning techniques in financial market prediction, particularly LSTM and CNN-BiLSTM hybrid models, is then detailed, highlighting their ability to capture nonlinear patterns in dynamic markets. Through empirical analysis and model validation, the study demonstrates the gains in predictive accuracy and robustness achieved by deep learning methods. The findings contribute to academic discourse and offer practical insights for investors, financial analysts, and policymakers navigating market volatility and optimizing investment strategies. Finally, the paper outlines prospects for machine learning in financial forecasting, laying a theoretical foundation and methodological framework for more precise and reliable economic predictions.

1. Introduction

Stocks represent ownership stakes in corporations issued to raise capital, entitling holders to residual profits and assets after debt obligations are fulfilled, alongside voting rights proportional to their shareholdings. Unlike bonds, which have fixed maturity dates, stocks do not expire as long as the issuing company remains solvent. Bonds guarantee fixed returns specified in the contract, whereas stock returns are variable and contingent upon corporate profitability and asset value [1]. This distinction underscores the inherent risk associated with stocks, where investors face uncertainty regarding dividends and capital gains that depend on the company’s financial performance. In contrast to bonds, which promise predictable returns albeit with default risk, there are no risk-free stocks. Stockholders assume greater risk because dividends and market prices fluctuate with broader economic conditions and company-specific factors. Managing stock market risk therefore involves strategies to mitigate volatility and uncertainty, aligning investment goals with risk tolerance and market dynamics.
This paper therefore applies machine learning time series forecasting to predicting trends in financial enterprise stocks and economic data. By leveraging algorithms such as recurrent neural networks (RNNs) [2] and long short-term memory networks (LSTMs) [3], this study aims to enhance predictive accuracy and robustness in forecasting stock market movements and economic indicators. Integrating AI into time series analysis offers opportunities to capture complex patterns and dependencies in financial data, enabling more informed decision-making and risk management strategies.
Through empirical analysis and model validation, this research explores the effectiveness of machine learning techniques in capturing the nonlinear and dynamic nature of financial markets. Insights gained from this study contribute to academic discourse and have practical implications for investors, financial analysts, and policymakers seeking to navigate volatile market conditions and optimize investment strategies. By harnessing AI’s predictive capabilities, this paper aims to advance the understanding and application of machine learning in financial forecasting, paving the way for more accurate and reliable predictions in real-world economic scenarios.

3. Methodology

In recent years, applying advanced machine learning techniques to financial time series analysis has attracted significant attention because of their potential to uncover intricate patterns and improve prediction accuracy. Among these techniques, the combination of Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs), known as CNN-LSTM models, has proven particularly effective. This approach uses CNNs to extract spatial features from the input data and LSTMs to capture temporal dependencies, which makes it well suited to multivariate time series such as stock prices and economic indicators.

3.1. Model Discussion

The CNN-BiLSTM model combines the following neural network components:
  • Convolutional Neural Networks (CNNs):
    - CNNs are adept at learning spatial hierarchies of features through convolutional layers.
    - In the context of multivariate time series, CNNs can be applied to extract spatial patterns across different variables (e.g., multiple stock prices, economic indicators) at each time step.
  • Long Short-Term Memory networks (LSTMs):
    - LSTMs are well-suited for modeling temporal dependencies by maintaining long-term memory of sequential data.
  • BiLSTM: Building on the LSTM cell structure, an LSTM model can screen historical information and learn chronological order, using the input history to form a long-term memory of past data. This avoids the problem of useful historical information being lost as new data keeps arriving. Because an LSTM processes data only in the direction of its network connections, Bidirectional Long Short-Term Memory (BiLSTM) is introduced for cases where the influence of later data on earlier data must also be considered. The model can then draw on both earlier and later data in the sequence when forming its prediction.
Figure 3. BiLSTM expansion structure based on LSTM.
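To make the bidirectional idea concrete, the following is a minimal sketch in plain Python. It is not an LSTM: a simple tanh recurrent cell with hypothetical fixed weights (`w_in`, `w_rec`) stands in for the gated LSTM cell, purely to show how a forward pass and a backward pass over the same sequence can be paired at each time step.

```python
import math


def recurrent_pass(sequence, w_in=0.5, w_rec=0.8):
    """Run a toy tanh recurrent cell over a sequence and return every hidden state.

    The weights are hypothetical constants; a real LSTM cell has learned gates.
    """
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = math.tanh(w_in * x + w_rec * hidden)
        states.append(hidden)
    return states


def bidirectional_pass(sequence):
    """Pair the forward state and the backward state for each time step."""
    forward = recurrent_pass(sequence)
    # Process the reversed sequence, then flip the states back into time order.
    backward = recurrent_pass(sequence[::-1])[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    for step, (fwd, bwd) in enumerate(bidirectional_pass([0.1, 0.4, 0.3, 0.9])):
        print(f"t={step}: forward={fwd:.4f}, backward={bwd:.4f}")
```

In this sketch the forward state at a given step depends only on values up to that step, while the backward state depends only on values from that step onward, so the pair covers the whole sequence.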

The modeling workflow has four steps. The first step is data preprocessing: missing and duplicate values are handled, the data is normalized, and the result is divided into a training set and a test set. The second step is model construction: the training set is passed through the convolution and pooling layers of the CNN for feature extraction and then through the BiLSTM for sequence prediction, while model parameters such as the number of layers and the batch size are tuned. The third step is the weight update: an attention mechanism increases the weights of the extracted features. Finally, the resulting CNN-BiLSTM-Attention model is fed the test set data to verify its accuracy.
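The attention step in the workflow above can be illustrated with a minimal sketch in plain Python. The paper does not specify its scoring function, so the dot-product score against a hypothetical `query` vector is an assumption; only the softmax weighting followed by a weighted sum is the generic attention pattern.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Weight each time step's feature vector by its similarity to `query`.

    `features` is a list of per-time-step vectors (for example, BiLSTM outputs);
    `query` is a hypothetical learned vector of the same length.
    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * vec[i] for w, vec in zip(weights, features)) for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    steps = [[0.1, 0.2], [0.9, 0.7], [0.3, 0.1]]
    weights, context = attention_pool(steps, query=[1.0, 1.0])
    print("weights:", [round(w, 3) for w in weights])
    print("context:", [round(c, 3) for c in context])
```

Time steps whose features score higher against the query receive larger weights, which is the sense in which attention lets a model emphasize the more informative steps.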

3.2. Data Processing

In this experiment, stock data for the People’s Bank of China were studied, and the CNN-BiLSTM-Attention model was compared with an attention-based LSTM model (LSTM-Attention), a hybrid of a convolutional neural network and a bidirectional long short-term memory network (CNN-BiLSTM), and a single long short-term memory network (LSTM). The comparison indicates that the CNN-BiLSTM-Attention model performs well.
Because the obtained data contains missing and duplicate values, it must be preprocessed. Missing values are filled by averaging adjacent data points, and duplicate records are deleted. Because the stock data values differ widely in magnitude, the data is scaled with 0-1 normalization before being input into the neural network model. The calculation method is as follows:
$$x' = \frac{x - \min}{\max - \min}$$
where x' is the normalized value, x is the original sample data value, min is the minimum value in the sample data, and max is the maximum value in the sample data.
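The preprocessing described in this section can be sketched in plain Python as follows. The function names are hypothetical, and two details are assumptions not stated in the paper: a gap at the edge of the series is filled from its single known neighbor, and a constant series is mapped to zeros to avoid division by zero.

```python
def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row (a hashable tuple), preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def fill_missing_with_neighbor_mean(values):
    """Replace each None with the mean of the nearest known values on either side."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = next((v for v in reversed(values[:i]) if v is not None), None)
        after = next((v for v in values[i + 1:] if v is not None), None)
        known = [v for v in (before, after) if v is not None]
        if not known:
            raise ValueError("series has no known values to average")
        filled[i] = sum(known) / len(known)
    return filled


def min_max_normalize(values):
    """Scale values to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumption: constant series maps to zeros
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Illustrative rows of (date, close); the values are made up for the example.
    rows = [
        ("2021-01-04", 10.0),
        ("2021-01-04", 10.0),  # duplicate record
        ("2021-01-05", None),  # missing close
        ("2021-01-06", 14.0),
    ]
    closes = [close for _, close in drop_duplicate_rows(rows)]
    print(min_max_normalize(fill_missing_with_neighbor_mean(closes)))
```

In practice the minimum and maximum would usually be computed on the training set only and reused for the test set, so that no information from the test period leaks into training.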

3.3. Test Data and Methods

The input of the neural network is data closely related to the trading of the stock, and the output is the predicted closing price for the next trading day. The experimental data were downloaded from Tushare’s official website and cover the People’s Bank of China stock (stock code 000001) from January 1, 2005, to October 4, 2021; 80% of the data is used as the training set and 20% as the test set. The data set includes the following basic trading data: opening price (open), closing price (close), highest price (high), lowest price (low), previous day’s closing price (pre-close), rise and fall (amount), rise and fall (change), turnover rate (rate), volume (volume), transaction amount (business), and volume-weighted average price (price).
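The 80/20 split can be sketched as below. The paper does not say how the split was made; this sketch assumes the records are kept in time order and cut once, so that the test set is strictly later than the training set. The function name `chronological_split` is hypothetical.

```python
def chronological_split(records, train_fraction=0.8):
    """Split time-ordered records into a training prefix and a test suffix."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


if __name__ == "__main__":
    days = list(range(10))  # stand-in for ten trading days in date order
    train, test = chronological_split(days)
    print("train:", train)
    print("test:", test)
```

Shuffling before splitting is deliberately avoided here, because mixing later observations into the training set would let the model see the period it is later evaluated on.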
Because stock data is a time series, the length of the sliding window must be chosen. In this paper, the window size is n and the window rolls forward with a step of 1. MAE values for different window lengths are compared over 5 runs; see Table 1.
Table 1 shows that the MAE is relatively large when the window size is 5 and also when it is 15, and that the mean absolute error is smallest when the window size is 10, so a window size of 10 is selected.
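A rolling window of this kind can be sketched in plain Python as follows. For brevity the sketch assumes a single series (for example, normalized closing prices) rather than the paper's multivariate input, and the function name `make_windows` is hypothetical.

```python
def make_windows(series, window_size=10, step=1):
    """Build (input window, next value) pairs from a time-ordered series.

    Each input holds `window_size` consecutive values, the target is the value
    immediately after the window, and the window start advances by `step`.
    """
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive")
    inputs, targets = [], []
    for start in range(0, len(series) - window_size, step):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


if __name__ == "__main__":
    closes = [float(day) for day in range(13)]  # stand-in for 13 daily closes
    inputs, targets = make_windows(closes, window_size=10, step=1)
    for window, target in zip(inputs, targets):
        print(window, "->", target)
```

A series of length N yields N minus the window size samples at step 1, so a longer window gives each sample more history but leaves fewer samples to train on.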
To verify the accuracy of the model, it is compared with other algorithms; the comparison results are shown in Table 2.
As can be seen from Table 2, the new hybrid model shows a better overall trend than the LSTM-based models: the CNN-BiLSTM-Attention model has an MSE of 0.012864103 and a MAPE of 0.01984150. It is more reliable than the earlier models.
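For reference, the three error measures used in this section (MAE, MSE, and MAPE) can be computed as in the sketch below. MAPE is returned as a fraction rather than a percentage, which is an assumption about how the reported values are scaled; the function names are hypothetical.

```python
def _check(actual, predicted):
    """Validate that both sequences are non-empty and of equal length."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted) squared."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, returned as a fraction."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Made-up values purely to exercise the functions.
    actual = [100.0, 200.0]
    predicted = [110.0, 190.0]
    print("MAE:", mean_absolute_error(actual, predicted))
    print("MSE:", mean_squared_error(actual, predicted))
    print("MAPE:", mean_absolute_percentage_error(actual, predicted))
```

MSE penalizes large errors more heavily than MAE, while MAPE is scale-free, which is why the measures are often reported together.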
Figure 4. Evaluation of Model Accuracy: MSE vs MAPE.

3.4. Experimental Design

By comparing the performance of the attention-based hybrid of a convolutional neural network and a bidirectional long short-term memory network with traditional statistical methods, we draw the following conclusions:
  • Improved prediction accuracy: Experimental results show that the new hybrid model significantly improves the accuracy of predicting changes in stock prices. Compared with traditional statistical methods, the model performs better on several evaluation indicators, such as mean squared error and mean absolute percentage error.
  • Effectiveness of feature extraction: Using convolutional neural networks (CNNs) for feature extraction can effectively capture spatial information in the input data, which is particularly important for analyzing multivariate time series. These extracted features help improve the predictive power of the subsequent model (BiLSTM).
  • Temporal modeling with BiLSTM: Bidirectional long short-term memory networks (BiLSTM) perform well on time series data, effectively capturing both long-term and short-term temporal dependencies and thereby improving the robustness and accuracy of predictions.
  • Addition of attention mechanisms: The introduction of an attention mechanism further improves the model’s performance. It lets the model pay more attention to essential time steps or features during learning and prediction, which improves the accuracy and stability of prediction.
  • Technical support for quantitative trading: The improved hybrid model not only significantly improves prediction accuracy but is also feasible in practice and can provide more reliable technical support for financial applications such as quantitative trading.
The hybrid CNN-BiLSTM-Attention model demonstrates clear advantages in processing financial time series data and in prediction for quantitative trading.

4. Conclusions

Based on the considerations for long-term stability and reliable forecasting in stock markets, it is evident that short-term stock price predictions can inadvertently promote short-sighted investor behavior. This tendency undermines the market’s long-term stability and hampers its sustainable growth. Developing robust long-term forecasting models that incorporate multiple influencing factors is crucial to counteract this. These models should transcend the immediate fluctuations and provide insights contributing to a more stable and predictable market environment.
Furthermore, as China’s capital markets continue to undergo reforms and development, a significant imperative remains to refine market institutions through ongoing exploration. This includes comprehensive advancements in the registration system reform to enhance marketization levels effectively. The aim is to stabilize long-term expectations and foster a healthy and steady trajectory for capital market development. Achieving high-quality development in China’s distinctive modern capital market requires a sustained commitment to these principles, ensuring a balanced approach that supports long-term investor confidence and economic resilience.

References

  1. Colladon, Andrea Fronzetti, and Giacomo Scettri. "Look inside. Predicting stock prices by analyzing an enterprise intranet social network and using word co-occurrence networks." International Journal of Entrepreneurship and Small Business 36.4 (2019): 378-391. [CrossRef]
  2. Choudhury, M.; Li, G.; Li, J.; Zhao, K.; Dong, M.; Harfoush, K. (2021, September). Power Efficiency in Communication Networks with Power-Proportional Devices. 2021 IEEE Symposium on Computers and Communications (ISCC); pp. 1–6. IEEE. [CrossRef]
  3. Yang, T.; Xin, Q.; Zhan, X.; Zhuang, S.; Li, H. ENHANCING FINANCIAL SERVICES THROUGH BIG DATA AND AI-DRIVEN CUSTOMER INSIGHTS AND RISK ANALYSIS. J. Knowl. Learn. Sci. Technol. Issn: 2959-6386 2024, 3, 53–62. [CrossRef]
  4. Shi, Y.; Li, L.; Li, H.; Li, A.; Lin, Y. Aspect-Level Sentiment Analysis of Customer Reviews Based on Neural Multi-task Learning. J. Theory Pr. Eng. Sci. 2024, 4, 1–8. [CrossRef]
  5. Yuan, J., Lin, Y., Shi, Y., Yang, T., & Li, A. (2024). Applications of Artificial Intelligence Generative Adversarial Techniques in the Financial Sector. Academic Journal of Sociology and Management, 2(3), 59-66. [CrossRef]
  6. Jiang, W.; Qian, K.; Fan, C.; Ding, W.; Li, Z. Applications of generative AI-based financial robot advisors as investment consultants. Appl. Comput. Eng. 2024, 67, 28–33. [CrossRef]
  7. Ding, W., Zhou, H., Tan, H., Li, Z., & Fan, C. (2024). Automated Compatibility Testing Method for Distributed Software Systems in Cloud Computing.
  8. Fan, C.; Li, Z.; Ding, W.; Zhou, H.; Qian, K. Integrating artificial intelligence with SLAM technology for robotic navigation and localization in unknown environments. Appl. Comput. Eng. 2024, 67, 22–27. [CrossRef]
  9. Guo, L., Li, Z., Qian, K., Ding, W., & Chen, Z. (2024). Bank Credit Risk Early Warning Model Based on Machine Learning Decision Trees. Journal of Economic Theory and Business Management, 1(3), 24-30. [CrossRef]
  10. Li, Z.; Fan, C.; Ding, W.; Qian, K. Robot Navigation and Map Construction Based on SLAM Technology. World J. Innov. Mod. Technol. 2024, 7, 8–14. [CrossRef]
  11. Fan, C.; Ding, W.; Qian, K.; Tan, H.; Li, Z. Cueing Flight Object Trajectory and Safety Prediction Based on SLAM Technology. J. Theory Pr. Eng. Sci. 2024, 4, 1–8. [CrossRef]
  12. Ding, W.; Tan, H.; Zhou, H.; Li, Z.; Fan, C. Immediate traffic flow monitoring and management based on multimodal data in cloud computing. Appl. Comput. Eng. 2024, 71, 1–6. [CrossRef]
  13. Qian, K., Fan, C., Li, Z., Zhou, H., & Ding, W. (2024). Implementation of Artificial Intelligence in Investment Decision-making in the Chinese A-share Market. Journal of Economic Theory and Business Management, 1(2), 36-42. [CrossRef]
  14. Lin, Y.; Li, A.; Li, H.; Shi, Y.; Zhan, X. GPU-Optimized Image Processing and Generation Based on Deep Learning and Computer Vision. J. Artif. Intell. Gen. Sci. (JAIGS) ISSN:3006-4023 2024, 5, 39–49. [CrossRef]
  15. Shi, Y.; Yuan, J.; Yang, P.; Wang, Y.; Chen, Z. Implementing intelligent predictive models for patient disease risk in cloud data warehousing. Appl. Comput. Eng. 2024, 67, 34–40. [CrossRef]
  16. Cui, Z.; Lin, L.; Zong, Y.; Chen, Y.; Wang, S. Precision gene editing using deep learning: A case study of the CRISPR-Cas9 editor. Appl. Comput. Eng. 2024, 64, 134–141. [CrossRef]
  17. Haowei, M.; Ebrahimi, S.; Mansouri, S.; Abdullaev, S.S.; Alsaab, H.O.; Hassan, Z.F. CRISPR/Cas-based nanobiosensors: A reinforced approach for specific and sensitive recognition of mycotoxins. Food Biosci. 2023, 56. [CrossRef]
  18. Xu, Q., Xu, L., Jiang, G., & He, Y. (2024, June). Artificial Intelligence In Risk Protection For Financial Payment Systems. In The 24th International scientific and practical conference “Technologies of scientists and implementation of modern methods”(June 18–21, 2024) Copenhagen, Denmark. International Science Group. 2024. 431 p. (p. 344). [CrossRef]
  19. Huang, S., Diao, S., Zhao, H., & Xu, L. (2024, June). The Contribution Of Federated Learning To Ai Development. In The 24th International scientific and practical conference “Technologies of scientists and implementation of modern methods”(June 18–21, 2024) Copenhagen, Denmark. International Science Group. 2024. 431 p. (p. 358). [CrossRef]
  20. Zhan, X., Ling, Z., Xu, Z., Guo, L., & Zhuang, S. (2024). Driving Efficiency and Risk Management in Finance through AI and RPA. Unique Endeavor in Business & Social Sciences, 3(1), 189-197.
  21. Bao, W.; Xiao, J.; Deng, T.; Bi, S.; Wang, J. The Challenges and Opportunities of Financial Technology Innovation to Bank Financing Business and Risk Management. Financial Eng. Risk Manag. 2024, 7, 82–88. [CrossRef]
  22. Xu, L., Gong, C., Jiang, G., & Yang, H. (2024, June). A Study On Personalized Web Page Recommendation Based On Natural Language Processing And Its Impact On User Browsing Behavior. In The 24th International scientific and practical conference “Technologies of scientists and implementation of modern methods”(June 18–21, 2024) Copenhagen, Denmark. International Science Group. 2024. 431 p. (p. 317).
  23. Wang, B.; He, Y.; Shui, Z.; Xin, Q.; Lei, H. Predictive optimization of DDoS attack mitigation in distributed systems using machine learning. Appl. Comput. Eng. 2024, 64, 95–100. [CrossRef]
  24. Dhand, A.; Lang, C.E.; Luke, D.A.; Kim, A.; Li, K.; McCafferty, L.; Mu, Y.; Rosner, B.; Feske, S.K.; Lee, J.-M. Social Network Mapping and Functional Recovery Within 6 Months of Ischemic Stroke. Neurorehabilit. Neural Repair 2019, 33, 922–932. [CrossRef]
  25. Allman, R.; Mu, Y.; Dite, G.S.; Spaeth, E.; Hopper, J.L.; Rosner, B.A. Validation of a breast cancer risk prediction model based on the key risk factors: family history, mammographic density and polygenic risk. Breast Cancer Res. Treat. 2023, 198, 335–347. [CrossRef]
Figure 2. The Processing Framework.
Table 1. Comparison of results of different sliding window sizes.
Table 2. Comparison of predictions from different models.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.