Submitted: 17 February 2026
Posted: 26 February 2026
Abstract
Keywords:
Introduction
- Background of the Study
Statement of the Research Problem
Research Questions
- Does financial news sentiment have a measurable relationship with short-term stock price movements?
- Do sentiment-enhanced forecasting models outperform traditional price-only baseline models in short-term stock price prediction?
- How does the use of a decision-focused modeling framework influence model selection, evaluation, and interpretability in stock price forecasting?
Significance of the Study
Literature Review
- Behavioral Finance and Market Information
- Financial Markets and Sentiment Analysis
- Market Prediction, News and Social Media
- Problems of Decision Support, Big Data, and Modeling
- Research Gap
Research Design and Methodology
Research Design
- Data
- Data Source
- Data Components
- The dataset consists of two major components:
- Textual Data (News Headlines):
- Numerical Market Data:
- Daily stock market data are provided for each ticker, with the following variables:
- Opening price
- Closing price
- High price
- Low price
- Adjusted closing price
- Trading volume
Data Scope and Structure
Summary Statistics
Data Analytic Plan
- Text Data Cleaning and Preprocessing
- The textual data are preprocessed to support reliable sentiment extraction. The following steps are applied:
- Removal of special characters, punctuation marks, URLs, and stock ticker symbols.
- Removal of duplicate headlines.
- Conversion of all text to lowercase.
- Removal of non-English headlines.
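The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not the authors' pipeline: the ticker patterns (cashtags such as `$AAPL` and exchange prefixes such as `NASDAQ:TSLA`), the ASCII-ratio language heuristic, and all function names are assumptions introduced here.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Assumption: tickers appear as cashtags ($AAPL) or exchange tags (NASDAQ:AAPL).
TICKER_RE = re.compile(r"\$[A-Za-z]{1,5}\b|\b(?:NYSE|NASDAQ):\s?[A-Z]{1,5}\b")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def looks_english(text: str, min_ascii_ratio: float = 0.9) -> bool:
    """Crude language filter (an assumption): keep mostly-ASCII headlines."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(ch.isascii() for ch in letters)
    return ascii_letters / len(letters) >= min_ascii_ratio


def clean_headline(text: str) -> str:
    """Strip URLs and tickers, lowercase, then drop punctuation and symbols."""
    text = URL_RE.sub(" ", text)
    text = TICKER_RE.sub(" ", text)
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)
    return " ".join(text.split())  # collapse repeated whitespace


def preprocess(headlines: list[str]) -> list[str]:
    """Apply the language filter, cleaning, and de-duplication in order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in headlines:
        if not looks_english(raw):
            continue
        text = clean_headline(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Hypothetical headlines, for illustration only.
sample = [
    "Apple ($AAPL) beats estimates! https://example.com/a",
    "apple beats estimates",
    "Акции растут на фоне новостей",
    "NASDAQ:TSLA shares fall 5%",
]
result = preprocess(sample)
```

De-duplication is applied after cleaning so that headlines differing only in case, punctuation, or an attached link count as duplicates. A production pipeline would likely replace the ASCII heuristic with a dedicated language-identification tool.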
- Sentiment Extraction
- Numerical Data Transformation
- Data Integration and Alignment
- Exploratory Data Analysis
- Modeling Approach
- Model Evaluation
Purpose of Modeling
Modeling Framework Based on FPP3
Baseline Mathematical Models
- Naïve model equation:
- Written mathematically:
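In its standard form, the naïve method sets every forecast equal to the most recent observation. A sketch in the y(t) notation used for the regression model, where T (the last observed period) and h (the forecast horizon) are symbols assumed here for illustration:

```latex
\hat{y}(T+h) = y(T), \qquad h = 1, 2, \dots
```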
- Time-Series Forecasting Models
- General ARIMA model representation:
- φ represents autoregressive coefficients
- θ represents moving average coefficients
- ε represents random error
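For reference, the textbook non-seasonal ARIMA(p, d, q) form that matches these definitions can be written as below. The symbols c (a constant), p, d, q, and y'(t) (the series after d rounds of differencing) are assumptions introduced here and may differ from the notation of the original equation.

```latex
y'(t) = c + \phi_1\, y'(t-1) + \dots + \phi_p\, y'(t-p)
          + \theta_1\, \varepsilon(t-1) + \dots + \theta_q\, \varepsilon(t-q)
          + \varepsilon(t)
```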
- Regression and Dynamic Models with External Variables
- Regression model with external predictor:
- x(t) represents an external variable such as sentiment score
- β represents regression coefficients
- y(t) = β0 + β1x(t) + ε(t)
- ε(t) follows an ARIMA process
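As a rough illustration of the regression part of this model, the coefficients β0 and β1 can be estimated by ordinary least squares. The sketch below is a simplification under stated assumptions: it treats ε(t) as uncorrelated noise and so omits the ARIMA error structure, which a full dynamic regression would estimate jointly. The function name and the sample values are hypothetical.

```python
def fit_simple_regression(x, y):
    """Estimate (beta0, beta1) in y(t) = beta0 + beta1 * x(t) + e(t) by OLS.

    Simplification: e(t) is treated as uncorrelated noise, not an ARIMA process.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary to identify the slope")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical daily sentiment scores and returns, for illustration only.
sentiment = [-0.5, 0.0, 0.5, 1.0]
returns = [-0.01, 0.0, 0.01, 0.02]
beta0, beta1 = fit_simple_regression(sentiment, returns)

# Residuals that a dynamic regression would go on to model with ARIMA.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(sentiment, returns)]
```

In practice the residual series would be checked for autocorrelation. If any remains, the regression and the ARIMA error model should be fitted together rather than in two separate stages.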
Model Evaluation Metrics
- RMSE equation:
- MAE equation:
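Both metrics follow their standard definitions: RMSE is the square root of the mean squared forecast error, and MAE is the mean of the absolute forecast errors. A minimal Python sketch, with function names and sample values chosen here for illustration:

```python
import math


def _errors(actual, forecast):
    """Return forecast errors actual - forecast after a basic length check."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return [a - f for a, f in zip(actual, forecast)]


def rmse(actual, forecast):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    errors = _errors(actual, forecast)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, forecast):
    """Mean absolute error: mean(|y - y_hat|)."""
    errors = _errors(actual, forecast)
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical values: the errors are 3 and 4.
actual = [3.0, 5.0]
forecast = [0.0, 1.0]
example_rmse = rmse(actual, forecast)  # sqrt((9 + 16) / 2)
example_mae = mae(actual, forecast)    # (3 + 4) / 2
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE. Reporting both shows whether a model's errors are dominated by a few large deviations.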
- Assumptions and Transparency
Implementation: Seasonal Adjustment and Stationarization
Preliminary Data Examination and Trend Identification
Seasonal Adjustment by Decomposition
Trend Removal and Differencing
Variance Stabilization by Transformation
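Variance stabilization and trend removal are often combined for price series by taking logarithms and then first differences, which yields log returns. The sketch below illustrates that common combination. Whether the study used a log transform specifically is an assumption, and the function names and sample prices are hypothetical.

```python
import math


def log_transform(series):
    """Natural-log transform; defined only for strictly positive values."""
    if any(v <= 0 for v in series):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in series]


def difference(series, lag=1):
    """Lagged differences: series[t] - series[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical closing prices growing 10% per day.
prices = [100.0, 110.0, 121.0]
log_returns = difference(log_transform(prices))
```

A constant percentage growth rate becomes a constant value after this transformation, which is why log differencing tends to bring trending price series closer to stationarity.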
Statistical Checking of Stationarity
Diagnostic Evaluation and Validation
- Combination with Forecasting Models
Summary of Implementation Results
Data Visualization and Results
Role of Visualization in Decision-Focused Modeling
Gaps and Limitations
Discussion
Conclusion
References
- Akin, I.; Akin, M. Behavioral finance impacts on US stock market volatility: an analysis of market anomalies. In Behavioural Public Policy; 2024; pp. 1–25.
- Axtell, R. L.; Farmer, J. D. Agent-based modeling in economics and finance: Past, present, and future. Journal of Economic Literature 2025, 63(1), 197–287.
- Bryan-Smith, L.; Godsall, J.; George, F.; Egode, K.; Dethlefs, N.; Parsons, D. Real-time social media sentiment analysis for rapid impact assessment of floods. Computers & Geosciences 2023, 178, 105405.
- Goldstein, I. Information in financial markets and its real effects. Review of Finance 2023, 27(1), 1–32.
- Guan, C.; Liu, W.; Cheng, J. Y. C. Using social media to predict the stock market crash and rebound amid the pandemic: the digital ‘haves’ and ‘have-mores’. Annals of Data Science 2022, 9(1), 5–31.
- Hasselgren, B.; Chrysoulas, C.; Pitropakis, N.; Buchanan, W. J. Using social media & sentiment analysis to make investment decisions. Future Internet 2022, 15(1), 5.
- Kantha, P.; Thiyagarajan, N.; Sharma, V.; Logeshwaran, J.; Vishwakarma, P. Analyzing the growing factor of Financial Markets Using Sentimental Analysis Algorithms. 2023 Annual International Conference on Emerging Research Areas: International Conference on Intelligent Systems (AICERA/ICIS), 2023, November; IEEE; pp. 1–7.
- Liu, Q.; Lee, W. S.; Huang, M.; Wu, Q. Synergy between stock prices and investor sentiment in social media. Borsa Istanbul Review 2023, 23(1), 76–92.
- Patchipala, S. Tackling data and model drift in AI: Strategies for maintaining accuracy during ML model inference. International Journal of Science and Research Archive 2023, 10(2), 1198–1209.
- Peivandizadeh, A.; Hatami, S.; Nakhjavani, A.; Khoshsima, L.; Qazani, M. R. C.; Haleem, M.; Alizadehsani, R. Stock market prediction with transductive long short-term memory and social media sentiment analysis. IEEE Access 2024, 12, 87110–87130.
- Rita, P.; António, N.; Afonso, A. P. Social media discourse and voting decisions influence: sentiment analysis in tweets during an electoral period. Social Network Analysis and Mining 2023, 13(1), 46.
- Rodríguez-Ibánez, M.; Casánez-Ventura, A.; Castejón-Mateos, F.; Cuenca-Jiménez, P. M. A review on sentiment analysis from social media platforms. Expert Systems with Applications 2023, 223, 119862.
- Saravanos, C.; Kanavos, A. Forecasting stock market volatility using social media sentiment analysis. Neural Computing and Applications 2025, 37(17), 10771–10794.
- Zhang, H.; Zang, Z.; Zhu, H.; Uddin, M. I.; Amin, M. A. Big data-assisted social media analytics for business model for business decision making system competitive analysis. Information Processing & Management 2022, 59(1), 102762.


| Model Type | Key Assumptions | Strengths | Limitations | Evaluation Metrics |
|---|---|---|---|---|
| Naïve Forecast Model | Future equals most recent observation | Simple baseline; high transparency | Ignores trend and seasonality | RMSE, MAE |
| ARIMA Model | Stationary series; linear relationships | Captures trends and autocorrelation | Requires preprocessing and tuning | RMSE, MAE |
| Dynamic Regression (ARIMAX) | External predictors influence outcome | Incorporates external drivers; decision-focused | Higher complexity; data availability | RMSE, MAE |

| Model | RMSE | MAE |
|---|---|---|
| Naïve | 0.028 | 0.019 |
| ARIMA | 0.021 | 0.014 |
| ARIMAX | 0.017 | 0.011 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).