Submitted:
02 November 2023
Posted:
06 November 2023
Abstract
Keywords:
1. Introduction
2. Academic Background
2.1. Multiple Regression Analysis
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon \tag{1}$$

where

| Symbol | Meaning |
|---|---|
| $y$ | dependent variable (also called the outcome variable) |
| $x_1, \ldots, x_p$ | independent variables |
| $\beta_0$ | intercept, representing the value of $y$ when all independent variables are zero |
| $\beta_1, \ldots, \beta_p$ | coefficients, representing the change in $y$ associated with a one-unit change in the corresponding independent variable while all other independent variables are fixed |
| $\varepsilon$ | error term, representing the variability of $y$ which cannot be explained by the independent variables |
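As a minimal sketch of how the intercept and coefficients of a multiple regression model can be estimated, the following pure-Python example solves the ordinary least squares normal equations on toy data. The function names `fit_ols` and `predict` and the toy data are illustrative assumptions and are not taken from the study.

```python
def fit_ols(X, y):
    """Return [b0, b1, ..., bp] minimising the squared error of y = b0 + sum(bj * xj)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        piv = max(range(c, k), key=lambda idx: abs(M[idx][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(beta, x):
    """Evaluate the fitted regression for one observation x = [x1, ..., xp]."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


# Toy data generated from y = 1 + 2*x1 + 3*x2 with no error term (illustrative only).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
beta = fit_ols(X, y)
```

Because the toy data contain no noise, the estimated values in `beta` recover the generating intercept and coefficients up to floating-point error. With real data the error term is non-zero and the fit is only approximate.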
Linear regression analysis, including both simple and multiple regression, yields reliable results only when the following assumptions are met:
- Linearity: the relationship between the independent variables and the expected value of the outcome variable is linear.
- Independence: observations are independent of each other, so the errors of the observations are also independent.
- Normality: the errors are normally distributed.
- Homoscedasticity: the variance of the errors is the same for any value of the independent variables.
![]() |
(2) |
2.2. Random Forest
2.2.1. Decision Tree
- Maximum depth of the tree: setting a maximum depth for the decision tree restricts its growth, avoiding the creation of deep and complex trees.
- Minimum number of observations for a split: setting the minimum number of observations required for a node ensures that only nodes with sufficient observations are considered for further splitting.
- Minimum number of observations per leaf: setting the minimum number of observations required within each leaf node prevents the creation of small, isolated leaves.
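The three growth limits above can be sketched as a single check that a tree-building routine would run before accepting a candidate split. The function name `split_allowed` is hypothetical, and the default values simply reuse the settings listed in Section 3.2.2.

```python
def split_allowed(depth, n_node, n_left, n_right,
                  max_depth=10, min_split=20, min_leaf=20):
    """Return True if a candidate split satisfies all three growth limits."""
    if depth >= max_depth:                 # the tree is already as deep as permitted
        return False
    if n_node < min_split:                 # too few observations to consider a split
        return False
    if min(n_left, n_right) < min_leaf:    # one of the child leaves would be too small
        return False
    return True
```

For example, a node at depth 3 holding 30 observations may not be split into children of 25 and 5 observations, because the smaller child falls below the per-leaf minimum.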
2.2.2. Ensemble of Several Decision Trees
- Random subset of data: during the construction of each tree, a random sample of the original dataset is drawn with replacement, a process called bootstrapping. On average, each tree is trained on approximately two thirds of the distinct observations, and the remaining third (the out-of-bag observations) is used for validation.
- Random subset of independent variables: when building each decision tree, only a random subset of the independent variables is considered for each split. This introduces diversity among the individual trees.
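The two sources of randomness above can be illustrated with the standard library alone. This is a sketch under stated assumptions: the helper names, the sample size, and the choice of two candidate variables per split are all illustrative, and the variable names are borrowed from the variable table only as labels.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def feature_subset(features, k, rng):
    """Pick k distinct candidate variables for one split."""
    return rng.sample(features, k)


rng = random.Random(0)            # fixed seed so the sketch is repeatable
n = 10_000
in_bag = set(bootstrap_indices(n, rng))
in_bag_share = len(in_bag) / n    # theory: close to 1 - 1/e, roughly 0.632
oob_share = 1 - in_bag_share      # share left out of this tree's training sample

features = ["cGHI", "Szen", "Sazi", "ExtI", "Temp_nwp", "GHI_nwp"]
candidates = feature_subset(features, 2, rng)
```

The in-bag share of distinct observations approaches 1 - 1/e (about 63%) as the dataset grows, which is where the "approximately two thirds" figure comes from.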
2.3. GBM
![]() |
(3) |
3. Model Construction and Evaluation
3.1. Exploratory Data Analysis and Preprocessing
- Step 1: Datasets A and B are combined for all spots. Approximately 30 weather variables present only in dataset A are added to dataset B.
- Step 2: Observations with missing values are deleted.
- Step 3: The GHI_sat variable is modified. Because GHI_sat is an observed value and is therefore only available for the past, each value is replaced with the value recorded 48 h earlier.
- Step 4: Missing values are identified again and replaced with the value from the same time on the previous day.
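Steps 3 and 4 can be sketched on hourly records keyed by timestamp. This is an illustration only: the helper names, the dictionary layout, and the toy series are assumptions, and the study's actual implementation is not described in this section.

```python
from datetime import datetime, timedelta


def lag_variable(rows, name, hours):
    """Replace rows[t][name] with the value recorded `hours` earlier (None if absent)."""
    original = {t: r.get(name) for t, r in rows.items()}
    for t, r in rows.items():
        r[name] = original.get(t - timedelta(hours=hours))


def fill_from_previous_day(rows, name):
    """Fill missing values with the value at the same time one day earlier."""
    for t in sorted(rows):                      # chronological order
        if rows[t].get(name) is None:
            prev = rows.get(t - timedelta(days=1))
            if prev is not None:
                rows[t][name] = prev.get(name)


# Toy hourly series: the value equals the hour index; hour 40 was deleted in Step 2.
start = datetime(2023, 1, 1)
rows = {start + timedelta(hours=h): {"GHI_sat": float(h)}
        for h in range(120) if h != 40}

lag_variable(rows, "GHI_sat", 48)               # Step 3: use the value from 48 h earlier
fill_from_previous_day(rows, "GHI_sat")         # Step 4: fill remaining gaps
```

In this toy series, hour 88 has no lagged value because hour 40 is missing, so Step 4 fills it with the lagged value at hour 64. Records in the first 48 h have no earlier data to draw on and remain empty.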
3.2. Experiment
3.2.1. Multiple Regression Analysis
![]() |
(4) |
3.2.2. Random Forest and GBM
- Maximum depth of each tree: 10
- Minimum number of observations for a split: 20
- Minimum number of observations per leaf: 20
- Total number of trees in the ensemble: 100
- Learning rate: 0.05 (a hyperparameter for GBM only)
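The settings above can be written down as plain configuration dictionaries. The key names follow scikit-learn's keyword arguments, which is an assumption made for illustration; this section does not state which library was used.

```python
# Hyperparameters shared by both ensembles (values from the list above).
# Key names mirror scikit-learn keyword arguments; this naming is an assumption.
shared_params = {
    "max_depth": 10,           # maximum depth of each tree
    "min_samples_split": 20,   # minimum number of observations for a split
    "min_samples_leaf": 20,    # minimum number of observations per leaf
    "n_estimators": 100,       # total number of trees in the ensemble
}

rf_params = dict(shared_params)
gbm_params = {**shared_params, "learning_rate": 0.05}   # learning rate applies to GBM only
```

If scikit-learn were the library in use, these dictionaries could be unpacked into `RandomForestRegressor(**rf_params)` and `GradientBoostingRegressor(**gbm_params)`.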
3.3. Evaluating the Accuracy of Energy Generation Prediction Models
![]() |
(6) |
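The accuracy tables report RMSE and MAE as percentages, which implies that the errors are normalised by some reference quantity. The sketch below assumes normalisation by a single reference value such as installed capacity. That choice, the function names, and the toy numbers are assumptions rather than the study's stated definition.

```python
import math


def nrmse(actual, predicted, reference):
    """Root mean squared error as a percentage of `reference`."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return math.sqrt(mse) / reference * 100


def nmae(actual, predicted, reference):
    """Mean absolute error as a percentage of `reference`."""
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    return mae / reference * 100


# Toy values (illustrative only); errors are -2, 2, -3 and 4.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 36]
rmse_pct = nrmse(actual, predicted, reference=100)
mae_pct = nmae(actual, predicted, reference=100)
```

RMSE squares the errors before averaging and so penalises large errors more heavily than MAE, which is consistent with RMSE exceeding MAE in every row of the accuracy tables.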
4. Discussion
![]() |
(7) |
- p: order of the AR model. The observations at the $p$ preceding time points ($t-1, \ldots, t-p$) affect the value at time point $t$.
- d: order of differencing. The observation at time point $t-1$ is subtracted from the value at time point $t$, and this is repeated $d$ times, to make the data stationary.
- q: order of the MA model. The errors of the $q$ preceding observations affect the value at time point $t$.
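The roles of d and p can be sketched in a few lines: difference the series d times, then form a one-step AR(p) forecast on the differenced values. The coefficient values and function names are hypothetical. The MA part is omitted because the fitted orders in the ARIMA table all have q = 0.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: z_t = y_t - y_(t-1)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def ar_forecast(history, phi, const=0.0):
    """One-step AR(p) forecast: const + sum of phi_i * (value at t-i), i = 1..p."""
    p = len(phi)
    recent = history[-p:][::-1]              # most recent value first
    return const + sum(c * v for c, v in zip(phi, recent))


y = [3, 5, 8, 12, 17]                        # toy series (illustrative only)
z = difference(y, d=1)                       # differenced series
next_z = ar_forecast(z, phi=[0.5, 0.3])      # hypothetical AR(2) coefficients
next_y = y[-1] + next_z                      # undo the differencing for the forecast
```

In practice the AR coefficients are estimated from the data rather than chosen by hand. The sketch only shows how the three steps fit together.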
5. Conclusion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest




| Variable | Description | Dataset A | Dataset B | Unit |
|---|---|---|---|---|
| date | Date | O | O | |
| KPX | Energy generation quantity (outcome variable) | O | O | MWh |
| cGHI | Clear-sky global horizontal irradiance | O | O | Wh/m2 |
| Szen | Sun zenith angle | O | O | Degree |
| Sazi | Sun azimuth angle | O | O | Degree |
| ExtI | Extraterrestrial radiation | O | O | Wh/m2 |
| Temp_nwp | Predicted ground temperature | O | K | |
| GHI_nwp | Predicted global horizontal irradiance | O | Wh/m2 | |
| GHI_sat | Satellite-based global horizontal irradiance | O | Wh/m2 | |
| TMP_P0_L1_GLC0 * | Surface air temperature | O | | K |
| TMP_P0_L103_GLC0 * | Air temperature at 1.5 m altitude | O | | K |
| DPT_P0_L103_GLC0 * | Dew point at 1.5 m altitude | O | | K |
| TTDIA_P0_L103_GLC0 * | Temperature change rate at 1.5 m altitude | O | | K/s |
| RH_P0_L103_GLC0 * | Relative humidity at 1.5 m altitude | O | | % |
| UGRD_P0_L103_GLC0 * | U component of wind speed at 10 m altitude | O | | m/s |
| VGRD_P0_L103_GLC0 * | V component of wind speed at 10 m altitude | O | | m/s |
| PRES_P0_L1_GLC0 * | Surface barometric pressure | O | | Pa |
| PRMSL_P0_L101_GLC0 * | Sea level pressure | O | | Pa |
| DIST_P0_L1_GLC0 * | Altitude | O | | m |
| HPBL_P0_L220_GLC0 * | Boundary layer altitude | O | | m |
| LCDC_P0_L200_GLC0 * | Low-level cloud cover | O | | 0–1 |
| MCDC_P0_L200_GLC0 * | Mid-level cloud cover | O | | 0–1 |
| HCDC_P0_L200_GLC0 * | High-level cloud cover | O | | 0–1 |
| VIS_P0_L103_GLC0 * | Visibility | O | | m |
| TMAX_P8_L103_GLC0_max1h * | Maximum temperature in 1 h | O | | K |
| TMIN_P8_L103_GLC0_min1h * | Minimum temperature in 1 h | O | | K |
| LHTFL_P8_L1_GLC0_avg1h * | Latent heat flux 1-h average | O | | W/m2 |
| NCPCP_P8_L1_GLC0_acc1h * | Non-convective hourly accumulated precipitation | O | | kg/m2 |
| LSPRATE_P8_L1_GLC0_avg1h * | Non-convective hourly average precipitation rate | O | | kg/m2/s |
| LSSRATE_P8_L1_GLC0_avg1h * | Non-convective hourly average snowfall rate | O | | kg/m2/s |
| CPRAT_P8_L1_GLC0_acc1h * | Convective hourly accumulated precipitation | O | | kg/m2 |
| MCONV_P8_L1_GLC0_acc1h * | Moisture convergence rate | O | | kg/kg/s |
| FRICV_P8_L103_GLC0_max1h * | Maximum turbulent wind speed in 1 h | O | | m/s |
| FRICV_P8_L103_GLC0_min1h * | Minimum turbulent wind speed in 1 h | O | | m/s |
| DSWRF_P8_L1_GLC0_avg1h * | Total solar radiation 1-h average | O | | W/m2 |
| VBDSF_P8_L1_GLC0_avg1h * | Direct radiation 1-h average | O | | W/m2 |
| DLWRF_P8_L1_GLC0_avg1h * | Longwave radiation 1-h average | O | | W/m2 |
| Spot | MR RMSE | MR MAE | RF RMSE | RF MAE | GBM RMSE | GBM MAE |
|---|---|---|---|---|---|---|
| All | 12.00% | 9.06% | 11.69% | 8.61% | 11.34% | 8.18% |
| Spot 0 | 10.10% | 7.62% | 9.71% | 7.16% | 9.62% | 7.02% |
| Spot 1 | 9.89% | 7.25% | 9.56% | 6.76% | 9.51% | 6.68% |
| Spot 2 | 11.94% | 8.86% | 11.37% | 8.22% | 11.20% | 8.01% |
| Spot 3 | 14.13% | 10.94% | 13.56% | 10.18% | 13.43% | 9.92% |
| Spot 4 | 10.65% | 7.70% | 10.81% | 7.69% | 10.63% | 7.50% |
| Spot 5 | 13.09% | 10.07% | 12.73% | 9.40% | 12.70% | 9.20% |
| Spot 6 | 10.32% | 7.66% | 9.89% | 7.14% | 9.95% | 7.12% |
| Spot 7 | 11.48% | 8.39% | 11.45% | 8.02% | 11.27% | 7.82% |
| Spot 8 | 12.31% | 9.51% | 11.68% | 8.66% | 11.51% | 8.36% |
| Spot 9 | 13.94% | 9.57% | 11.32% | 8.36% | 11.15% | 8.15% |
| Spot 10 | 13.05% | 9.95% | 12.96% | 9.68% | 12.89% | 9.55% |
| Spot 11 | 12.04% | 8.50% | 10.98% | 7.76% | 10.96% | 7.57% |
| Spot 12 | 13.08% | 10.11% | 12.55% | 9.35% | 12.50% | 9.17% |
| Spot 13 | 11.49% | 8.92% | 11.34% | 8.60% | 11.32% | 8.50% |
| Spot 14 | 11.58% | 8.63% | 11.40% | 8.33% | 11.44% | 8.23% |
| Spot | ARIMA order (p, d, q) |
|---|---|
| Spot 0 | (5, 1, 0) |
| Spot 1 | (5, 1, 0) |
| Spot 2 | (5, 1, 0) |
| Spot 3 | (5, 1, 0) |
| Spot 4 | (5, 1, 0) |
| Spot 5 | (5, 1, 0) |
| Spot 6 | (5, 1, 0) |
| Spot 7 | (1, 1, 0) |
| Spot 8 | (5, 1, 0) |
| Spot 9 | (5, 1, 0) |
| Spot 10 | (5, 1, 0) |
| Spot 11 | (1, 1, 0) |
| Spot 12 | (5, 1, 0) |
| Spot 13 | (1, 1, 0) |
| Spot 14 | (1, 1, 0) |
| Spot | RMSE | MAE |
|---|---|---|
| Spot 0 | 25.90% | 13.92% |
| Spot 1 | 21.47% | 11.48% |
| Spot 2 | 29.07% | 15.57% |
| Spot 3 | 37.27% | 19.49% |
| Spot 4 | 24.61% | 13.00% |
| Spot 5 | 20.27% | 10.52% |
| Spot 6 | 35.30% | 18.87% |
| Spot 7 | 21.65% | 11.03% |
| Spot 8 | 46.56% | 24.71% |
| Spot 9 | 19.16% | 10.18% |
| Spot 10 | 52.69% | 27.76% |
| Spot 11 | 19.95% | 10.26% |
| Spot 12 | 51.29% | 26.88% |
| Spot 13 | 16.65% | 8.42% |
| Spot 14 | 35.33% | 18.08% |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).





