Submitted: 25 June 2025
Posted: 26 June 2025
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Statistical Models
2.2. Deep Neural Network Models
2.3. Dropout
2.4. Hybrid Models
3. Methodology
3.1. Datasets
- M5 competition: This dataset contains monthly sales data for 3049 SKUs and 10 stores. It can be downloaded at https://www.kaggle.com/c/m5-forecasting-uncertainty.
- Stallion competition: This dataset contains monthly sales data for 24 SKUs and 58 agencies. It can be downloaded at https://www.kaggle.com/datasets/utathya/future-volume-prediction.
- Stock market: This dataset contains daily stock data (Volume, High, Low, and Closing Price) for all NASDAQ, S&P500, and NYSE listed companies. It can be downloaded at https://www.kaggle.com/datasets/paultimothymooney/stock-market-data.
- Synthetic data: This dataset contains 500 time series with 60 time steps each. Each series combines four components: (i) seasonality, modeled by a sine wave with random amplitude, phase, and frequency; (ii) trend, modeled by a random linear coefficient that is positive, negative, or null; (iii) noise, modeled by Gaussian white noise; and (iv) gain, a random scalar value that multiplies the entire series.
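The synthetic dataset described above can be sketched as follows. This is a minimal illustration rather than the generator used in the study: the sampling ranges for amplitude, period, slope, noise level, and gain are assumptions, as is the choice to apply the gain to the sum of the other three components. The function name `make_synthetic_dataset` is hypothetical.

```python
import numpy as np


def make_synthetic_dataset(n_series=500, n_steps=60, seed=0):
    """Return an array of shape (n_series, n_steps) of synthetic series.

    Each series is gain * (seasonality + trend + noise). All sampling
    ranges below are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    series = np.empty((n_series, n_steps))

    for i in range(n_series):
        # (i) Seasonality: sine wave with random amplitude, phase, and frequency.
        amplitude = rng.uniform(0.5, 2.0)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        period = rng.integers(4, 25)  # period in time steps (frequency = 1 / period)
        seasonality = amplitude * np.sin(2.0 * np.pi * t / period + phase)

        # (ii) Trend: random linear coefficient, positive, negative, or null.
        slope = rng.choice([-1.0, 0.0, 1.0]) * rng.uniform(0.01, 0.1)
        trend = slope * t

        # (iii) Noise: Gaussian white noise.
        noise = rng.normal(0.0, 0.1, size=n_steps)

        # (iv) Gain: random scalar that multiplies the entire series.
        gain = rng.uniform(1.0, 100.0)

        series[i] = gain * (seasonality + trend + noise)

    return series


data = make_synthetic_dataset()
print(data.shape)
```

Fixing the seed keeps the generated series reproducible across runs, which is convenient when several models are compared on the same data.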
3.2. Preprocessing
- forecast_horizon: The number of time steps ahead to predict;
- season_length: The expected length of the seasonal cycle; for instance, for a monthly aggregated series, a reasonable season length would be 12 time steps;
- date_freq: The pandas frequency string of the dataset, for instance, "MS" for a monthly aggregated time series;
- train_split: The portion of the data to use for training;
- models: A list of statistical models used to generate the covariates; any model that implements fit() and predict() methods can be used;
- fallback_model: The model the class falls back to if one of the models raises an error during prediction;
- verbose: Whether to report progress while processing.
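To illustrate how these parameters might fit together, the sketch below fits each statistical model on the training portion of a series and collects its forecasts as covariates, switching to the fallback model when a model fails. It is not the preprocessing class from the study: `CovariateConfig`, `statistical_covariates`, `NaiveModel`, and `SeasonalNaiveModel` are hypothetical names, and the assumed interface is `fit(y)` returning the model and `predict(h)` returning `h` forecasts.

```python
from dataclasses import dataclass

import numpy as np


class NaiveModel:
    """Hypothetical stand-in model: repeats the last observed value."""

    def fit(self, y):
        self.last_ = float(y[-1])
        return self

    def predict(self, h):
        return np.full(h, self.last_)


class SeasonalNaiveModel:
    """Hypothetical stand-in model: repeats the last observed season."""

    def __init__(self, season_length):
        self.season_length = season_length

    def fit(self, y):
        if len(y) < self.season_length:
            raise ValueError("series is shorter than one season")
        self.last_season_ = np.asarray(y[-self.season_length:], dtype=float)
        return self

    def predict(self, h):
        repeats = -(-h // self.season_length)  # ceiling division
        return np.tile(self.last_season_, repeats)[:h]


@dataclass
class CovariateConfig:
    forecast_horizon: int
    season_length: int
    date_freq: str  # pandas frequency string, e.g. "MS"; unused in this sketch
    train_split: float
    models: list
    fallback_model: object
    verbose: bool = False


def statistical_covariates(y, cfg):
    """Return an array of shape (forecast_horizon, n_models) of forecasts."""
    y = np.asarray(y, dtype=float)
    train = y[: int(len(y) * cfg.train_split)]
    columns = []
    for model in cfg.models:
        try:
            forecast = model.fit(train).predict(cfg.forecast_horizon)
        except Exception as exc:
            # Fall back when a model cannot be fitted or cannot predict.
            if cfg.verbose:
                print(f"{type(model).__name__} failed ({exc}); using fallback")
            forecast = cfg.fallback_model.fit(train).predict(cfg.forecast_horizon)
        columns.append(np.asarray(forecast, dtype=float))
    return np.column_stack(columns)


cfg = CovariateConfig(
    forecast_horizon=3,
    season_length=12,
    date_freq="MS",
    train_split=0.5,
    models=[NaiveModel(), SeasonalNaiveModel(12), SeasonalNaiveModel(50)],
    fallback_model=NaiveModel(),
)
covariates = statistical_covariates(np.arange(24.0), cfg)
print(covariates)
```

In this usage example the third model needs 50 observations but only 12 are available for training, so its column is produced by the fallback model instead.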
3.2.1. ARIMA
3.2.2. ETS
3.2.3. Linear Regression
3.3. Model Architecture
3.3.1. Dense Layer
3.3.2. Convolutional Layer
3.3.3. Long Short-Term Memory
3.4. Metrics
3.4.1. Mean Absolute Error (MAE)
3.4.2. Mean Squared Error (MSE)
3.4.3. Symmetric Mean Absolute Percentage Error (SMAPE)
3.5. Hypothesis Test
3.5.1. Significance Level
4. Discussion of Results
4.1. M5 Dataset
4.2. Stallion Dataset
4.3. Stock Market Dataset
4.4. Synthetic Dataset
4.5. Overall Implications
5. Conclusion
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest




| Metric | Sensitivity to outliers | Explainability | Interpretability |
|---|---|---|---|
| MAE | Low | Medium | Easy |
| MSE | Medium | Hard | Easy |
| SMAPE | High | Easy | Hard |

| Dataset | Metric | Model A | Model B | p-value | t-statistic |
|---|---|---|---|---|---|
| M5 | MAE | | | | |
| | MSE | | | | |
| | SMAPE | | | | |
| Stallion | MAE | | | | |
| | MSE | | | | |
| | SMAPE | | | | |
| Stock Market | MAE | | | | |
| | MSE | | | | |
| | SMAPE | | | | |
| Synthetic | MAE | | | | |
| | MSE | | | | |
| | SMAPE | | | | |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).