Submitted:
11 November 2025
Posted:
13 November 2025
Abstract
Keywords:
1. Introduction
2. Literature Review
3. Methodology
3.1. Research Framework
3.2. Data Collection and Preprocessing
3.2.1. Dataset Creation
3.2.2. Data Preprocessing
3.3. Relationship and Causality Analysis
3.3.1. Time Series Analysis and Stationarity Testing
3.3.2. Correlation Analysis
3.3.3. Lag Features Analysis
3.3.4. Granger Causality Test
3.3.5. Autoregressive Distributed Lag (ARDL) Model
3.3.6. Cointegration Test (Engle-Granger)
3.3.7. Random Forest Feature Importance
3.3.8. Attribute Shortlist
3.4. Predictive Model Development
3.4.1. Linear Regression
3.4.2. Random Forest
3.4.3. Gradient Boosting
3.4.4. XGBoost
3.4.5. Support Vector Regression (SVR)
3.4.6. Long Short-Term Memory (LSTM)
3.4.7. Artificial Neural Networks (ANN)
3.4.8. ARIMA
3.4.9. NARX-RNN
3.4.10. ANFIS
3.4.11. SHAP-Based Ensemble Interpretability
3.5. Model Evaluation
3.5.1. Evaluation Metrics
- Mean Absolute Error (MAE):
- Root Mean Squared Error (RMSE):
- Coefficient of Determination (R2):
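For reference, the three metrics listed above have the following standard definitions. The notation is an assumption rather than the authors' own: $y_i$ is the observed value, $\hat{y}_i$ the predicted value, $\bar{y}$ the mean of the observed values, and $n$ the number of test observations.

```latex
\begin{aligned}
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2} \\
R^2           &= 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
\end{aligned}
```

Under this definition, $R^2$ is negative whenever a model's squared error exceeds that of always predicting the mean $\bar{y}$.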
3.5.2. Cross-Validation Assessment
3.5.3. Statistical Significance Test
4. Relationship and Causality Analyses & Results
4.1. Time Series Analysis
4.2. Partial Autocorrelation Function (PACF)
4.3. Correlation Analysis
4.4. Lag Features Analysis
4.5. Stationarity Test
4.6. Granger Causality Test
4.7. Autoregressive Distributed Lag (ARDL) Model
4.8. Cointegration Test (Engle-Granger)
4.9. Random Forest Feature Importance Analysis
4.10. Analysis Results & Attribute Shortlist
5. Prediction Model Development & Results
5.1. Dataset Preparation
5.2. Model Selection and Implementation
5.3. Model Performance Results
5.4. Model-Specific Findings
5.4.1. Linear Regression
5.4.2. Random Forest
5.4.3. XGBoost
5.4.4. Support Vector Regressor (SVR)
5.4.5. LSTM
5.4.6. Artificial Neural Network (ANN)
5.4.7. Gradient Boosting
5.4.8. ARIMA
5.4.9. NARX-RNN
5.4.10. ANFIS
5.4.11. SHAP Interpretability Insights
- TP FG J125 (Insurance) had the highest average SHAP importance (2.029), ranking 1st in Gradient Boosting, 2nd in SVR, and 4th in Random Forest, suggesting that household risk-mitigation expenditures are strongly coupled with food price dynamics.
- TP FG J062 (Outpatient Services) and TP FG J073 (Transportation Services) appeared in the top-3 of at least two models and within the top-5 of all three, reinforcing the role of healthcare accessibility and logistics costs in food inflation transmission.
- TP FG J051 (Furniture & Furnishings) showed high stability, ranking 2nd in Random Forest and 5th in both Gradient Boosting and SVR.
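One plausible way to obtain a cross-model ranking like the one summarized above is to average each feature's mean absolute SHAP value over the models and to record its rank within each model. The sketch below assumes those per-model values are already available; the function name, the feature labels `A`, `B`, `C`, and all numbers are hypothetical placeholders, not the study's data or procedure.

```python
def aggregate_importance(per_model):
    """Average per-model importances and report each feature's per-model rank.

    per_model maps model name -> {feature: mean |SHAP| value}.
    Returns a list of (feature, average importance, {model: rank}),
    sorted by average importance, highest first.
    """
    # Only features present in every model can be averaged.
    features = sorted(set.intersection(*(set(m) for m in per_model.values())))
    avg = {f: sum(m[f] for m in per_model.values()) / len(per_model) for f in features}
    ranks = {}
    for name, values in per_model.items():
        ordered = sorted(features, key=lambda f: -values[f])
        ranks[name] = {f: position for position, f in enumerate(ordered, start=1)}
    overall = sorted(features, key=lambda f: -avg[f])
    return [(f, avg[f], {name: ranks[name][f] for name in per_model}) for f in overall]


# Hypothetical importances for three models and three features.
example = {
    "GB": {"A": 3.0, "B": 2.0, "C": 1.0},
    "RF": {"A": 1.0, "B": 2.5, "C": 0.5},
    "SVR": {"A": 2.9, "B": 0.9, "C": 1.0},
}
result = aggregate_importance(example)
for feature, mean_importance, model_ranks in result:
    print(feature, round(mean_importance, 3), model_ranks)
```

A feature can lead the averaged ranking without leading every individual model, which is why the per-model ranks are kept alongside the average.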
5.5. Model Robustness Assessment
5.5.1. Cross Validation
5.5.2. Statistical Significance Test
6. Discussion
6.1. Feature Selection and Economic Interconnectedness
6.2. Predictive Performance and Model Robustness
6.3. Interpretability and Policy Implications
7. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abidoye, R. B., Chan, A. P., Abidoye, F. A., & Oshodi, O. S. (2019). Predicting property price index using artificial intelligence techniques: Evidence from Hong Kong. International Journal of Housing Markets and Analysis, 12(6), 1072-1092.
- Atalan, A. (2023). Forecasting drinking milk price based on economic, social, and environmental factors using machine learning algorithms. Agribusiness, 39(1), 214-241.
- Baumeister, C., & Kilian, L. (2014). Do oil price increases cause higher food prices? Economic Policy, 29(80), 691-747.
- Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2016). Time series analysis: Forecasting and control (5th ed.). John Wiley & Sons.
- Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
- Cerveny, D. (2023). PPI and CPI: What is the relationship? [Bachelor's thesis, Charles University]. Faculty of Social Sciences.
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
- Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.
- Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366), 427-431.
- Engle, R. F., & Granger, C. W. J. (1987). Co-integration and error correction: Representation, estimation, and testing. Econometrica, 55(2), 251-276.
- Eştürk, Ö., & Albayrak, N. (2018). Tarım ürünleri-gıda fiyat artışları ve enflasyon arasındaki ilişkinin incelenmesi. Uluslararası İktisadi Ve İdari İncelemeler Dergisi, 147-158.
- Fan, X., Xu, Z., Qin, Y., & Škare, M. (2023). Quantifying the short-and long-run impact of inflation-related price volatility on knowledge asset investment. Journal of Business Research, 165, 114048.
- Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189-1232.
- Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3), 424-438.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.
- Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
- Jang, J. S. R. (1993). ANFIS: Adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics, 23(3), 665-685.
- Ji, M., Liu, P., Deng, Z., & Wu, Q. (2022). Prediction of national agricultural products wholesale price index in China using deep learning. Progress in Artificial Intelligence, 1-9.
- Karagöl, V. (2023). Ekonomik politika belirsizliğinin gıda fiyatlarına etkisi: seçilmiş ülkeler için zamanla değişen nedensellik analizi. İktisat Politikası Araştırmaları Dergisi, 10(2), 409-433.
- Kresova, S., & Hess, S. (2022). Identifying the determinants of regional raw milk prices in Russia using machine learning. Agriculture, 12(7), 1006.
- Lin, T., Horne, B. G., Tino, P., & Giles, C. L. (1996). Learning long-term dependencies in NARX recurrent neural networks. IEEE Transactions on Neural Networks, 7(6), 1329-1338.
- Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems (NeurIPS), 30.
- Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., ... & Lee, S. I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56-67.
- Lutoslawski, K., Hernes, M., Radomska, J., Hajdas, M., Walaszczyk, E., & Kozina, A. (2021). Food demand prediction using the nonlinear autoregressive exogenous neural network. IEEE Access, 9, 146123-146136.
- Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). Statistical and Machine Learning forecasting methods: Concerns and ways forward. PLOS ONE, 13(3), e0194889.
- Oyeleke, O. J., & Ojediran, S. (2018). Exploring the relationship between consumer price index (CPI) and producer price index (PPI) in Nigeria. International Journal of Statistics and Applications, 8(2), 42-46.
- Ozpolat, A. (2020). Causal link between consumer prices index and producer prices index: An evidence from central and Eastern European Countries (CEECs). Adam Academy Journal of Social Sciences, 10(2), 319-332.
- Özçelik, Ö., & Uslu, N. (2024). Gıda Enflasyonunun Belirleyicileri Üzerine Bir Analiz: Türkiye Örneği. Dumlupınar Üniversitesi Sosyal Bilimler Dergisi, (79), 289-309.
- Pesaran, M. H., Shin, Y., & Smith, R. J. (2001). Bounds testing approaches to the analysis of level relationships. Journal of Applied Econometrics, 16(3), 289-326.
- Qian, J., Dai, B., Wang, B., Zha, Y., & Song, Q. (2022). Traceability in food processing: problems, methods, and performance evaluations—a review. Critical Reviews in Food Science and Nutrition, 62(3), 679-692.
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533-536.
- Vapnik, V. N. (1995). The nature of statistical learning theory. Springer.
- Venkateswara Rao, K., Srilatha, D., Jagan Mohan Reddy, D., Desanamukula, V. S., & Kejela, M. L. (2022). Regression based price prediction of staple food materials using multivariate models. Scientific Programming, 2022, Article 9547039.
- Wanjuki, T. M., Wagala, A., & Muriithi, D. K. (2022). Evaluating the predictive ability of seasonal autoregressive integrated moving average (SARIMA) models using food and beverages price index in Kenya. European Journal of Mathematics and Statistics, 3(2), 28-38.
- Warren-Vega, W. M., Aguilar-Hernández, D. E., Zárate-Guzmán, A. I., Campos-Rodríguez, A., & Romero-Cano, L. A. (2022). Development of a predictive model for agave prices employing environmental, economic, and social factors: Towards a planned supply chain for agave-tequila industry. Foods, 11(8), 1138.
- Xie, M., Ding, L., Xia, Y., Guo, J., Pan, J., & Wang, H. (2021). Does artificial intelligence affect the pattern of skill demand? Evidence from Chinese manufacturing firms. Economic Modelling, 96, 295-309.
- Yu, C. P. (2016). Why are there always inconsistent answers to the relation between the PPI and CPI? Re-examination using panel data analysis. International Review of Accounting, Banking & Finance, 8(1), 1-17.
- Muthayya, S., Sugimoto, J. D., Montgomery, S., & Maberly, G. F. (2014). An overview of global rice production, supply, trade, and consumption. Annals of the New York Academy of Sciences, 1324(1), 7-14.
- Valera, H. G. A. (2022). Is rice price a major source of inflation in the Philippines? A panel data analysis. Applied Economics Letters, 29(16), 1528-1532.
- TCMB (EVDS Verinin Merkezi). (2024). https://evds2.tcmb.gov.tr/index.php?/evds/serieMarket.
- World Food Programme Price Database. (2024). https://data.world/wfp/7d7224ed-eff6-421f-9f96-9c8d43905f3c.
| Year | Category | Methodological Approach | Content |
|---|---|---|---|
| 2024 | Relationship and Causality | The ARDL (Autoregressive Distributed Lag) method was employed. | Özçelik and Uslu (2024) investigate the determinants of food inflation within the Turkish economy. Based on ARDL modeling, the study finds that the Consumer Price Index for Food and Non-Alcoholic Beverages is positively influenced by the Domestic Producer Price Index for Agriculture, Forestry, and Fishing (UFET) and the Consumer Price Index for Electricity, Gas, and Other Fuels (TUFEE), while the Real Effective Exchange Rate based on CPI (REDK) exerts a negative impact. Furthermore, results from the ARDL Error Correction Model, which examines short-term dynamics, indicate that short-term imbalances are corrected in the long run. |
| 2023 | Relationship and Causality | Panel Structural Vector Autoregression (PSVAR) technique to assess inflation effects | Fan et al. (2023) explore the relationship between knowledge asset investment and inflation. Using the PSVAR method, they analyze both short- and long-term dynamics. Their findings suggest that low to moderate inflation levels are positively correlated with the market value of R&D firms, whereas high inflation has a negative effect. |
| 2023 | Relationship and Causality | Granger causality test to examine the PPI-CPI relationship | Cerveny (2023) investigates the link between Producer Price Index (PPI) and Consumer Price Index (CPI) in the Czech Republic and the Eurozone. Applying the Granger causality test, the study reveals that PPI influences CPI in the Czech Republic, whereas no such causal relationship is observed in the Eurozone. |
| 2020 | Relationship and Causality | Panel cointegration and panel causality tests | Ozpolat (2020) analyzes the causal relationship between CPI and PPI in Central and Eastern European Countries (CEECs), using panel cointegration and panel causality tests. The results indicate a long-term, bidirectional causality between CPI and PPI in these countries. |
| 2018 | Relationship and Causality | Econometric methods: DF-GLS unit root test, Johansen and Engle-Granger cointegration approaches, VAR model | Oyeleke and Ojediran (2018) examine the relationship between PPI and CPI in Nigeria using various econometric techniques. The DF-GLS unit root test is applied to assess stationarity, Johansen and Engle-Granger methods are used for long-run cointegration, and a VAR model is employed to analyze interactions. The study concludes that the PPI-CPI relationship in Nigeria does not follow a simple cause-effect pattern and lacks a long-term equilibrium relationship. |
| 2016 | Relationship and Causality | Non-parametric regression using the LOESS technique | Akmercan (2016) investigates the relationships among household expenditures, income, and OECD household size data using the LOESS (Locally Estimated Scatterplot Smoothing) non-parametric regression method. Essential consumption items are aggregated into a single expenditure category for analysis. |
| 2016 | Relationship and Causality | Comparison of ordered and unordered discrete choice models (LOGIT and PROBIT) | Çelik (2016) analyzes factors influencing household fuel choices for heating in Turkey using TÜİK data. The study compares ordered and unordered discrete choice models, particularly LOGIT and PROBIT variants. Model performance is assessed using OLOGIT, GOLOGIT, PPO, HOLOGIT, AIC, BIC, and MNL statistics to determine the most suitable approach. |
| 2016 | Relationship and Causality | Panel data analysis and Dumitrescu-Hurlin panel causality test | Yu (2016) first applies panel data analysis to explore the general dynamics between CPI and PPI, then uses the Dumitrescu-Hurlin panel causality test to investigate the causal nature of this relationship. This dual approach helps explain inconsistencies in CPI-PPI transmission across countries. |
| 2014 | Relationship and Causality | Correlation, regression, ANOVA, and coefficient of determination (R2) | Galodikwe (2014) investigates the PPI-CPI relationship using correlation analysis, regression models, ANOVA, and the coefficient of determination. The findings confirm that PPI indices significantly influence CPI indices. |
| 2001 | Relationship and Causality | Limitations of OLS and use of Tobit models | Emeç (2001) examines household consumption expenditures, highlighting the limitations of the Ordinary Least Squares (OLS) method when applied to continuous or ordinal dependent variables across regions. As a solution, Tobit models are suggested, where zero expenditures are bounded at zero, and certain continuous variables are categorized to fit ordered logit models. Results are interpreted in the context of Engel curves. |
| 2022 | Relationship and Causality | Combined econometric (ARDL) and machine learning (Support Vector Machine) approach; hybrid model proposed. VIF test used to avoid multicollinearity. Evaluation Metrics: RMSE, MAE, R2 | Ozden (2022) investigates macroeconomic and financial determinants of Turkey’s export-import ratio using both econometric and machine learning methods. The ARDL model is applied to monthly data (2010–2021) on normalized GDP, exchange rate, CPI, PPI, crude oil prices, and trade ratio. Trends of each variable are presented. A VIF test confirms no multicollinearity issues. Subsequently, Support Vector Machine (SVM) is used to capture complex patterns. Results from ARDL, SVM, and a hybrid ARDL-SVM model are compared using RMSE, MAE, and R2. The hybrid model, supported by machine learning, demonstrates superior performance in capturing variable interactions. |
| 2016 | Prediction Model | Poisson Quasi Maximum Likelihood estimation; Bootstrap validation test | Balyaner (2016) estimates the number of information technology devices owned by households using the Poisson Quasi Maximum Likelihood (PQML) estimation method. The validity of the model is assessed through bootstrap resampling techniques. |
| 2023 | Prediction Model | Comparison of Random Forest, Gradient Boosting, SVM, Neural Networks, and AdaBoost. Evaluation Metrics: MSE, RMSE, MAE, R2 | Atalan (2023) evaluates economic, social, and environmental factors affecting unit prices of milk in Turkey. Five machine learning algorithms, namely Random Forest, Gradient Boosting, Support Vector Machine (SVM), Artificial Neural Network, and AdaBoost, are used for price prediction. Performance is assessed using MSE, RMSE, MAE, and R2. Random Forest yields the best results. Additionally, Random Forest performance is reported across tree counts ranging from 10 to 2000. |
| 2022 | Prediction Model | System dynamics model for energy efficiency and resource optimization in the food and beverage industry | Katsumbe (2022) proposes a system dynamics model to optimize energy efficiency and resource use in the food and beverage sector. Separate sub-models are developed for water, electricity, and production lines, with input variables defined for each. Total consumption is formulated and compared against a baseline. The model is used to simulate one-year forecasts. |
| 2022 | Prediction Model | SARIMA model for forecasting food and beverage prices in Kenya, accounting for seasonality. Evaluation Metrics: MSE, MAE, MAPE, Theil’s U statistic | Wanjuki et al. (2022) propose a model for forecasting food and beverage prices in Kenya. Given seasonal fluctuations, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model is employed. Model accuracy is evaluated using MSE, MAE, MAPE, and Theil’s U statistic. High predictive accuracy is achieved, and the model is recommended for short-term price forecasting in the food and beverage sector. |
| 2022 | Prediction Model | Multiple regression model implemented in Minitab | Warren-Vega et al. (2022) develop a multiple regression model to forecast agave (a key input in tequila production) prices. Variables include rainfall, harvest volume, tequila production, costs, exchange rates, and export volumes. The model, run in Minitab, shows strong predictive performance (R = 0.86). |
| 2022 | Prediction Model | Comparison of deep learning models: DA-RNN, NARX-RNN, MV-LSTM. Evaluation Metrics: RMSE, MAE, MAPE | Ji et al. (2022) investigate deep learning approaches for forecasting wholesale agricultural prices in China. The Dual-Stage Attention-Based Recurrent Neural Network (DA-RNN) outperforms NARX-RNN and MV-LSTM models. Performance is evaluated using RMSE, MAE, and MAPE. |
| 2022 | Prediction Model | ARCH and GARCH models for forecasting prices of food items (tomato, garlic, okra, pepper) | Venkateswara Rao et al. (2022) present a regression-based multivariate approach to forecast prices of key food commodities. Emphasizing the importance of price volatility for governments, producers, and consumers, they apply ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH (Generalized ARCH) models. While ARCH generally yields more consistent results, GARCH performs better for certain items. |
| 2022 | Prediction Model | Random Forest with three cross-validation techniques: temporal, spatial, spatiotemporal | Kresova and Hess (2022) analyze factors influencing raw milk prices in Russia using 17 variables. Feature selection is performed using Boruta analysis, confirming all variables as relevant. The Random Forest model is tested with three cross-validation strategies: temporal (for time-series), spatial (for geographical), and spatiotemporal (combined). The spatiotemporal approach is found to be the most effective. |
| 2021 | Prediction Model | Artificial Neural Networks (ANN) and Multiple Linear Regression for CPI forecasting. Software: WEKA | Özcan (2021) examines the influence of macroeconomic variables (External Debt, PPI, USD exchange rate, Exports, Imports, and M2 money supply) on CPI in Turkey using data from 2008 to 2020 (TCMB). Both ANN and Multiple Linear Regression models are implemented in WEKA. The ANN model demonstrates superior predictive accuracy compared to the linear regression model. |
| 2021 | Prediction Model | NARXNN model for forecasting food demand | Lutoslawski et al. (2021) employ the Nonlinear Autoregressive Exogenous Neural Network (NARXNN) model to forecast food demand. The study highlights that NARXNN, commonly used in time series forecasting, provides more accurate predictions than traditional regression models. |
| 2021 | Prediction Model | Backpropagation-trained ANN model for CPI forecasting. Software: Zaitun. Evaluation: MAPE | Sarangi et al. (2021) aim to forecast the Consumer Food Price Index (CFPI) in India using a machine learning approach. A backpropagation-trained Artificial Neural Network (ANN) is implemented using the Zaitun statistical software. MAPE values are used to validate model accuracy, which is reported to be very high, indicating strong predictive performance. |
| 2020 | Prediction Model | ANN, Random Forest, and XGBoost models. Evaluation Metrics: R2, MAE, RMSE | Tosun (2020) forecasts fresh fruit and vegetable imports for OECD countries using data mining and machine learning techniques. ANN, Random Forest, and XGBoost models are applied and compared using R2, RMSE, and MAE. XGBoost demonstrates the best overall performance. |
| 2020 | Prediction Model | Applicability of ANN, SVM, genetic algorithms, and hybrid techniques in stock price forecasting | Strader et al. (2020) conduct a study on stock price forecasting. Their findings suggest that Artificial Neural Networks (ANN) are best suited for predicting numerical stock index values, that Support Vector Machines (SVM) perform well in classification tasks such as predicting market direction, and that hybrid machine learning techniques may overcome the limitations of single-method approaches. |
| 2019 | Prediction Model | Superiority of ANN over logarithmic regression | Selim and Demirkıran (2019) analyze household budget survey data from TÜİK to identify factors affecting food expenditures and track temporal changes. They develop predictive models using logarithmic regression and Artificial Neural Networks (ANN). Results show that the ANN model outperforms the semi-logarithmic regression model in forecasting accuracy. |
| 2019 | Prediction Model | Comparison of ANN, SVM, and ARIMA models | Abidoye et al. (2019) collect data on factors influencing real estate prices in Hong Kong and apply ARIMA, ANN, and SVM models. The models are used for out-of-sample forecasting. The ANN model outperforms both SVM and ARIMA in predictive accuracy. |
| 2018 | Prediction Model | ANFIS (Adaptive Neuro-Fuzzy Inference System) combining fuzzy logic and neural networks | Soltani and Pooya (2018) design an AI system to predict the success of new food products. The ANFIS algorithm integrates fuzzy logic and neural networks, processing data from diverse sources such as market research and social media to forecast product performance. |
| 2018 | Prediction Model | Evaluation of machine learning as an alternative to statistical methods in time series forecasting | Makridakis et al. (2018) assess machine learning methods as alternatives to traditional statistical approaches in time series forecasting. Eight classical statistical methods and ten machine learning techniques are compared using sMAPE. The results show that statistical methods generally outperform machine learning models. However, the authors note that recent advancements may soon close this gap. |
| Item Code | Description | Rank | Correlation Value |
|---|---|---|---|
| TP FG J053 | 053.HOUSEHOLD APPLIANCES | 1 | 0.99891 |
| TP FG J051 | 051.FURNITURE, FURNISHINGS, CARPETS AND OTHER FLOOR COVERINGS | 2 | 0.998844 |
| TP FG J056 | 056.GOODS AND SERVICES FOR HOUSEHOLD MAINTENANCE | 3 | 0.997916 |
| Item Code | Description | Rank | Correlation Value |
|---|---|---|---|
| TP FG J127 | 127.OTHER SERVICES N.E.C. | 1 | 0.99776 |
| TP FG J124 | 124.SOCIAL PROTECTION | 2 | 0.997718 |
| TP FG J062 | 062.OUTPATIENT SERVICES | 3 | 0.997687 |
| Item Code | Description | Rank | Correlation Value |
|---|---|---|---|
| TP FG J127 | 127.OTHER SERVICES N.E.C. | 1 | 0.970884 |
| TP FG J124 | 124.SOCIAL PROTECTION | 2 | 0.970381 |
| TP FG J062 | 062.OUTPATIENT SERVICES | 3 | 0.97034 |
| Item Code | Description | Rank | Average Correlation |
|---|---|---|---|
| TP FG J056 | 056.GOODS AND SERVICES FOR HOUSEHOLD MAINTENANCE | 1 | 0.988253667 |
| TP FG J012 | 012.NON-ALCOHOLIC BEVERAGES | 2 | 0.988152 |
| TP FG J062 | 062.OUTPATIENT SERVICES | 3 | 0.988128 |
| Item Code | Description | Lag Period | Correlation Value |
|---|---|---|---|
| TP FG J053 | 053.HOUSEHOLD APPLIANCES | 1 | 0.999222306 |
| TP FG J051 | 051.FURNITURE, FURNISHINGS, CARPETS AND OTHER FLOOR COVERINGS | 1 | 0.998482096 |
| TP FG J012 | 012.NON-ALCOHOLIC BEVERAGES | 1 | 0.997243669 |
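A lag-by-lag correlation of the kind tabulated above can be sketched as follows: the predictor series is shifted back by `lag` periods and correlated with the target series. This is a minimal standard-library illustration, not the authors' code; the function names and the synthetic series are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(predictor, target, lag):
    """Correlate predictor at time t - lag with target at time t."""
    if lag == 0:
        return pearson(predictor, target)
    return pearson(predictor[:-lag], target[lag:])


def best_lag(predictor, target, max_lag):
    """Return the lag with the largest absolute correlation and all scores."""
    scores = {lag: lagged_correlation(predictor, target, lag)
              for lag in range(max_lag + 1)}
    return max(scores, key=lambda lag: abs(scores[lag])), scores


# Synthetic example: the target follows the predictor with a two-period delay.
predictor = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13]
target = [0, 0] + [2 * value + 1 for value in predictor[:-2]]
lag, scores = best_lag(predictor, target, max_lag=4)
print(lag, round(scores[lag], 6))
```

With trending price indices, correlations at every lag tend to be close to 1, so differences between lags are small and are best read alongside the stationarity results.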
| Item Code | Description | Coefficient | Std Error | p-value |
|---|---|---|---|---|
| TP FG J062.L0 | 062.OUTPATIENT SERVICES | 0.0092 | 0.004 | 0.034 |
| TP FG J061.L0 | 061.MEDICAL PRODUCTS, APPLIANCES AND EQUIPMENT | 0.0081 | 0.003 | 0.028 |
| TP FG J083.L1 | 083.TELEPHONE AND TELEFAX SERVICES | 0.0079 | 0.003 | 0.015 |
| Item Code | Description | Cointegration Statistic | p-value |
|---|---|---|---|
| TP FG J105 | 105.EDUCATION PROGRAMMES OF UNSPECIFIED LEVEL | -5.59518832 | 0.0000115 |
| TP FG J124 | 124.SOCIAL PROTECTION | -5.239523044 | 0.0000583 |
| TP FG J072 | 072.OPERATION OF PERSONAL TRANSPORT EQUIPMENT | -4.565730017 | 0.000953 |
| Item Code | Description | Rank | Importance Score |
|---|---|---|---|
| TP FG J125 | 125.INSURANCE | 1 | 0.068466 |
| TP FG J061 | 061.MEDICAL PRODUCTS, APPLIANCES AND EQUIPMENT | 2 | 0.039167 |
| TP FG J073 | 073.TRANSPORT SERVICES | 3 | 0.036515 |
| Item | Description | Method | Delay (Months) |
|---|---|---|---|
| TP FG J053 | 053. HOUSEHOLD APPLIANCES | Pearson | 1 |
| TP FG J051 | 051. FURNITURE, FIXTURES, CARPETS AND OTHER FLOOR COVERINGS | Pearson | 0 |
| TP FG J073 | 073. TRANSPORTATION SERVICES | Spearman | 6 |
| TP FG J127 | 127. OTHER UNCLASSIFIED SERVICES | Spearman and Kendall Tau | 0 |
| TP FG J124 | 124. SOCIAL PROTECTION | Spearman, Kendall Tau, Cointegration | 0 |
| TP FG J062 | 062. OUTPATIENT SERVICES | Kendall Tau, ARDL | 0 |
| TP FG J105 | 105. EDUCATIONAL PROGRAMS NOT DETERMINED BY LEVEL | Cointegration Test | 0 |
| TP FG J061 | 061. MEDICAL PRODUCTS, INSTRUMENTS AND EQUIPMENT | ARDL, Random Forest | 0 |
| TP FG J125 | 125. INSURANCE | Random Forest | 0 |
| TP FG J011 | 011. FOOD | PACF | 1 |
| Rank | Model | MAE | RMSE | R2 | Performance |
|---|---|---|---|---|---|
| 1 | Gradient Boosting | 0.2838 | 0.4229 | 0.9990 | Excellent |
| 2 | SVR | 0.3743 | 0.5743 | 0.9982 | Excellent |
| 3 | NARX-RNN (6 months) | 0.4845 | 0.6363 | 0.9988 | Excellent |
| 4 | Random Forest | 0.3690 | 0.6612 | 0.9976 | Excellent |
| 5 | XGBoost | 0.3995 | 0.6659 | 0.9975 | Excellent |
| 6 | Linear Regression | 0.8147 | 1.2329 | 0.9915 | Good |
| 7 | ANN | 1.1036 | 1.3161 | 0.9903 | Good |
| 8 | ANFIS | 7.2268 | 10.7862 | 0.3497 | Poor/Failed |
| 9 | ARIMA | 9.3682 | 11.2475 | -0.7834 | Poor/Failed |
| 10 | LSTM | 19.8388 | 21.4423 | -5.4817 | Poor/Failed |
| Lag Period (months) | MAE | RMSE | R2 |
|---|---|---|---|
| 3 | 0.5921 | 0.7856 | 0.9968 |
| 6 | 0.4845 | 0.6363 | 0.9988 |
| 9 | 0.6733 | 0.8341 | 0.9954 |
| 12 | 0.7891 | 0.9742 | 0.9932 |
| Rank | Feature | Description | Avg. SHAP Importance | GB Rank | RF Rank | SVR Rank |
|---|---|---|---|---|---|---|
| 1 | TP FG J125 | 125. INSURANCE | 2.029 | 1 | 4 | 2 |
| 2 | TP FG J062 | 062. OUTPATIENT SERVICES | 1.481 | 2 | 1 | 3 |
| 3 | TP FG J073 | 073. TRANSPORTATION SERVICES | 1.448 | 3 | 5 | 1 |
| 4 | TP FG J051 | 051. FURNITURE, FIXTURES, CARPETS | 1.207 | 5 | 2 | 5 |
| 5 | TP FG J061 | 061. MEDICAL PRODUCTS, APPLIANCES | 1.021 | 7 | 3 | 8 |
| Performance Metric | 5 Fold Cross-Validation Score |
|---|---|
| Mean R2 (±SD) | 0.9742 ± 0.0324 |
| Mean MAE (±SD) | 0.6917 ± 0.4445 |
| Mean RMSE (±SD) | 1.6300 ± 1.3980 |
| Original Test R2 | 0.9990 |
| CV-Test Difference | 0.0248 |
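The summary rows above can be reproduced from per-fold scores as sketched below. The fold scores in the example are made-up placeholders, not the study's values, and the use of the sample standard deviation is an assumption; only the figures 0.9990 and 0.9742 are taken from the table.

```python
import statistics


def summarize_cv(fold_scores, holdout_score):
    """Return mean, sample SD, and holdout-minus-CV gap for one metric."""
    mean_score = statistics.mean(fold_scores)
    sd_score = statistics.stdev(fold_scores)  # sample SD (assumed convention)
    return {"mean": mean_score, "sd": sd_score, "gap": holdout_score - mean_score}


# Hypothetical per-fold R2 values, for illustration only.
example_folds = [0.99, 0.98, 0.97, 0.99, 0.92]
summary = summarize_cv(example_folds, holdout_score=0.9990)

# Arithmetic check using the two values reported in the table.
reported_gap = round(0.9990 - 0.9742, 4)
print(summary, reported_gap)
```

The reported CV-Test Difference of 0.0248 is the original test R2 minus the mean cross-validated R2.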
| Model | Cross-Validation Score |
|---|---|
| Gradient Boosting | 0.9742 ± 0.0324 |
| SVR | 0.9896 ± 0.0168 |
| Random Forest | 0.9811 ± 0.0208 |
| Model Comparison | t-statistic | p-value | Interpretation |
|---|---|---|---|
| Gradient Boosting vs SVR | -0.998 | 0.375 | No significant difference |
| Gradient Boosting vs Random Forest | -1.000 | 0.374 | No significant difference |
| SVR vs Random Forest | 0.986 | 0.380 | No significant difference |
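Assuming the comparisons above are paired t-tests on the five per-fold scores of each model pair (an assumption; the test variant is not stated in the table), the statistic and its two-sided p-value can be sketched with the standard library alone. The closed-form p-value used here holds only for 4 degrees of freedom, and the fold scores are hypothetical.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for matched fold scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def two_sided_p_df4(t_stat):
    """Two-sided p-value of Student's t distribution with exactly 4 df."""
    x = abs(t_stat) / math.sqrt(t_stat ** 2 + 4)
    return 1 - x * (1.5 - 0.5 * x ** 2)


# Hypothetical per-fold R2 scores for two models (not the study's values).
model_a = [0.99, 0.98, 0.97, 0.99, 0.92]
model_b = [0.99, 0.99, 0.98, 0.99, 0.97]
t_stat, df = paired_t_statistic(model_a, model_b)
print(round(t_stat, 3), df, round(two_sided_p_df4(t_stat), 3))

# A t statistic of -1.000 with 4 df gives p of about 0.374.
print(round(two_sided_p_df4(-1.000), 3))
```

With only five folds the test has little power, so a non-significant result indicates that the models cannot be distinguished on these folds rather than that they perform identically.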
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
