Parameter Optimization of Motion Estimation for Geograph- ically Distributed Photovoltaic Power Forecasting

The power-generation capacity of grid-connected photovoltaic (PV) power systems is increasing. As output power forecasting is required by electricity market participants and utility operators for the stable operation of power systems, several methods have been proposed using physical and statistical approaches for various time ranges. A short-term (30 min ahead) forecasting method has been previously proposed by our laboratory for geographically distributed PV systems using motion estimation. This study focuses on an important parameter for estimating the proposed motion and optimizing the parameter. This parameter is important because it is associated with the smoothness of the vector field, which is the result of motion estimation and influences the forecasting accuracy. In the periods with drastic power output changes, the evaluation was conducted on 101 PV systems located within a circle of 15-km radius in the Kanto region of Japan. The results indicate that the absolute mean error of the proposed method with the optimized parameter is 10.3%, whereas that of the persistent prediction method is 23.7%. Therefore, the proposed method is effective in forecasting for periods when PV output changes drastically in a short time.


Introduction
Reducing greenhouse gas emissions caused by power production can effectively address climate changes [1]. Therefore, the installation of photovoltaic (PV) power generation systems is significantly increasing with the decreasing installation costs [2], reaching 39 GW in 2010 and 760 GW in 2020 [3]. However, as opposed to the existing dispatched power generations, the power generation of PV relies on the weather conditions, thus varying drastically and uncertainly. The high penetration of PV power generation makes it difficult to maintain the balance between demand and supply in electric power systems. This uncertain variability in PV generation can significantly impact the stable and economical operation of power systems [4,5]. Therefore, power system operators require forecasting of PV generation output; particularly, short-term forecasting is required to determine the need for demand response, and a quick-start generator is required for the stable operation of the power grid [4]. To date, different forecasting methods have been proposed for various time scales using solar radiation forecasting or direct PV power forecasting. PV forecasting can be classified into persistent forecasting, physical approach, and statistical approach [6,7]. The physical approach forecasts solar irradiance data from a numeric weather predictor atmospheric model. A common approach for regional PV output using numeric weather predictor (NWP) data is to upscale a set of representative regional PV systems. Saint-Drenan et al. [8] proposed a method to probabilistically forecast the PV output of the entire forecast area from the reference PV power and meteorological data obtained from NWP. Ma et al. [9] proposed a method to correct the NWP irradiance using the measured irradiance and optimized the prediction model using the particle swarm optimization algorithm. Statistical approaches use historical data to identify relationships for forecasting future PV outputs. Statistical methods can be divided into two categories, namely time-series-based and artificial intelligence forecasting models [6]. The former is a forecasting method that evaluates past patterns based on periodically obtained data. Bacher et al. [10] reported that autoregression model forecasting can be effective for shortterm forecasting based on the results obtained using NWP, observed PV data, and data from both sources. Reikard [11] estimated the solar radiation intensity using the autoregressive integrated moving average (ARIMA) method and compared it with that obtained from the artificial neural network (ANN) method. In the future, it may be possible to evaluate past nonlinear patterns and capture sharper changes. Neural networks use different sets of input data to forecast future output patterns and improve the accuracy of the model by carefully selecting different influential parameters. Pedro and Coimbra [12] implemented an ANN method optimized by a genetic algorithm (GAs/ANN) and compared it with the persistent forecast, ARIMA method, and k-nearest neighbors (kNN) method. They verified that GAs/ANN is superior to the other forecast methods. Zhou et al. [13] used two long short-term memory models to forecast the temperature and PV output and combined them to enhance the forecast accuracy. Behera et al. [14] implemented an optimal design methodology for an extreme learning machine-based forecasting model of a PV system and compared its results with those of existing models, such as the backpropagation model.
In addition to these approaches, forecasting methods based on image processing techniques have been proposed. Studies have been conducted to estimate solar radiation from satellites and grand-based sensors. Hammer et al. [15] proposed a method for estimating solar radiation from satellite images and a statistical method for estimating the movement of clouds. Perez et al. [16] reported that satellite-derived cloud motion-based forecasts are better models for short-term forecasts than NWP-based solar radiation forecasts. Bright et al. [17] proposed a method for selecting reference PVs to support PV output nowcasts from satellite images. Feng et al. [18] developed a deep convolutional neural network for global horizontal irradiance one hour ahead of the sky image. Chu et al. [19] proposed a reforecasting model to improve the forecasting by reforecasting with an ANN and optimizing with a GA from the sky racking technique, ARIMA, and kNN forecasting. Chow et al. [20] estimated cloud motion from the optical flow method using a sky-imaging system, and the best forecasting may be obtained depending on the smoothness parameter. Table 1 presents a brief description and summary of these forecasts. The effective timescales in Table 1 are summarized in [21]. Table 1. Characteristics of each forecast approach.

Approach Description Effective time scale References
Physical approach NWP methods are popular long-term forecast methods for solar irradiance. Long-term [8,9] Statistical approach This includes forecast by autoregressive and artificial intelligence, which are particularly effective in the short-term forecast.
Very short-term, short-term [10][11][12][13][14] Image processing approach This involves forecasts based on images captured using sensors installed on the ground and artificial satellites. Very short-term, short-term [15][16][17][18][19][20] Persistence approach This approach is effective in very short-term forecasting and is used as a benchmark for the forecast. Very short-term [6,12,16] [ 6,12,16] In contrast to these studies, our previous proposal focused on the temporal transition of output power from geographically distributed PV systems [22]. The output power distribution of irregularly located PV systems was converted into a regularly arranged mesh distribution with the same latitude and longitude intervals. Forecasting was conducted by estimating the motion of mesh distribution. In the future, the smart grid will be upgraded to capture the PV output power; and therefore, this method is economical for short-term forecasting, as it allows for predictions without additional equipment.
The existing method uses the optical flow for forecasting, wherein the motion field between two consecutive images is estimated. This motion field is obtained by minimizing the function, comprising the data term, and regularization term using variational methods [23,24]. The parameter of the regularization term is a model of the smoothness of the motion field obtained using this function. This parameter is important because the obtained motion field is different for each value of the parameter. Previously [22,25], the mesh size and interpolation method for blank values when converting to a mesh distribution were determined. However, discussion regarding the aforementioned important parameter that affects the motion estimation was insufficient.
The objective of this study is to optimize the parameter of motion estimation, which is essential for estimating the normalized value (NV) distribution. The PV distribution in Japan (latitude 31.20°6'-39°80' and longitude 129°60'-141°60') was normalized every 30 min for one year from 2013 to 2014, converted to geographically distributed mesh data, the motion of the NV distributions was estimated, and then the forecast was evaluated. The PV output volatility is important when evaluating a forecast. Certain periods exist, such as clear days, when it is easy to forecast the PV generation. These times reduce the error difference when evaluating the motion estimations of each parameter, making it difficult to evaluate the parameters. Pedro and Coimbra [12] reported that developing seasonal models for each period of variability should improve the evaluation of the forecasting method. The objective of this method is to forecast the PV output during periods when the output of multiple PVs changes. As the changes in the PV output over a wide range differ from place to place, capturing the changes in multiple PV outputs is difficult. Therefore, in this study, the period when the output of PVs in a certain area within a 15-kmradius in Japan changed drastically within 30 min was extracted and evaluated. This paper evaluates the proposed method with the multiple PV output changed periods in a certain area where the output changes are correlated to each other. The contributions of this study can be summarized as follows. 1. A method is proposed for evaluating multiple PV generation forecasts when the PV output changes drastically. 2. The parameter that makes the best forecast for motion estimation reduces the error by 56.6% compared to that of the persistence forecasting. It is an effective forecasting method for periods when PV output changes drastically. 3. The characteristics of forecasting a wide range of PV outputs using a motion estimation method are presented. The remainder of this paper is organized as follows. Section 2 describes the data preprocessing, estimation, and evaluation methods of the proposed method. Section 3 explains the characteristics of the motion estimation, forecasting results at drastic change periods, and reasons for the errors. Finally, Section 4 concludes the study.

Materials and Methods
This section describes the input data, forecasting methods, and evaluation methods. Figure 1 depicts the flowchart of the proposed method. The forecast method uses the power generation output of the PV and position data as input and performs the forecast using an image processing approach. In this method, prior to motion estimation, the geographically distributed power output of PVs is preprocessed to enable image processing. The power output of each PV system is forecasted based on the estimation results. Additionally, this method changes the smoothness parameter of the motion estimation and evaluates it during periods when the output changes drastically. In this section, first the conditions under which the period was extracted and then the evaluation methods are described.

Dataset
The dataset used in this study is the actual data of the output power from 5097 PV systems measured at 30 min intervals from August 1, 2013, to July 31, 2014, in Japan. Therefore, the forecast timescale Δ was set to 30 min. Table 2 presents an outline of the dataset.

Pre-processing
The proposed method uses motion estimation, which is an image processing method. To apply the motion estimation to forecast the output of geographically distributed PV systems with various power ratings and forms, the output power data were normalized and converted to mesh data. The following process was implemented to achieve this.

Normalization
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 10 February 2022 doi:10.20944/preprints202202.0143.v1 The output power from the PV systems is affected by the panel size, material, azimuth angle, tilt angle, and installation location. Therefore, normalization is necessary to compare numerous geographically distributed PV systems. In this study, NV of the output power is defined as indicated in Equation (1).
where ( ) denotes the actual PV power output at a specific time , and 2 ( ) represents the highest power output at the same time in the previous two weeks. Figure  2 depicts the concept of normalization implemented on July 14, 2014. Figure 2 (a) shows the output power ( ) of a PV system on July 14, 2014 (orange line) and that observed on the previous 14 days from June 30 to July 13, 2014 (gray lines); their maximum values 2 ( ) at the corresponding times are also indicated (red line). The two-week period is selected under the assumption that at least one sunny day exists, and the seasonal change in sun elevation is negligible during this period. Figure 2

Conversion to Mesh Data
The proposed method converted the NVs of geographically distributed PV systems into a mesh distribution. The mesh data of each parcel in the regularly arranged distribution with equally spaced widths of latitude and longitude were obtained from the average of the NVs of PV systems located in the parcel. In this study, the width of the mesh was set to 0.02 ° in latitude and longitude, based on the reports of a previous study [22]. Error! Reference source not found. depicts the conversion diagram to mesh data; Figure 3 (a) shows the actual locations of the PV systems and their NVs at 11:00 on July 14, 2014, are classified by color, and Figure 3 (b) shows the mesh distribution at that time. In Error! Reference source not found. (b), the parcels with no corresponding PV are considered blank values and are indicated in black. If several blank values exist in the mesh data, the motion estimation cannot be performed. Therefore, we performed interpolation (Error! Reference source not found. (c)) based on Delaunay triangulation using "grid data," which is a function of MATLAB 2020a [26].

Motion Estimation
The geographical distribution of NV is assumed to move over time based on the movement of elements affecting weather conditions, such as clouds. The optical flow method, which involves image processing, was used to analyze the motion [25,26]. Assuming that NV is invariant and moves linearly in the time interval Δ , the NVs expressed as ( , , ) during the period from time − Δ to can be expressed as indicated in Equation (2).
where ( , ) denotes the position of NV in the east-west and north-south directions, and ( , , ) and ( , , ) represent the velocities of NV at position ( , ) and at time in the east-west and north-south directions, respectively. A regulation term is defined considering that adjacent motion vectors tend to be aligned. The NV data term and regularization term can be determined using Equations (3) and (4), respectively, in each mesh.
where ∇= � , � represents the partial differential operator. The energy function is expressed as shown in Equation (5) while introducing a weight coefficient , which determines the extent to which and are spatially aligned. The value of is adjusted to minimize the estimation error. The vector distribution can be obtained by solving the energy function minimization problem, and the solution is expressed in Equations (6) and (7), which are Euler-Lagrange equations.
The detailed solution and algorithm are reported in [23]. As mentioned previously, the width of the mesh is 0.02, which corresponds to 1.806 km and 2.219 km in the east-west and north-south directions, respectively, at latitude 35°78 ′ and longitude 140°04′. However, the motion vectors for 30 min may be longer than the mesh width. Sawada and Takahashi [27] analyzed the migration of precipitation areas and reported that the speed was distributed in the range of 2-8 m/s. Bosch and Kleissl [28] proposed a method to estimate the cloud velocity using inverter output and reported that the velocity of clouds is distributed in the range of 3-35 m/s based on the analysis.
A multiscale approach was used to avoid the local minima of the energy function indicated in Equation (5). This approach generates a rougher distribution with multiple size scales, seeks the global minimum, and propagates the solution gradually to finer scales. In this study, the Gaussian filter was used to generate a four-fold rougher distribution. Figure 4 illustrates the motion estimation concept. Using the optical flow method, vector AB representing the NV of location A at time − Δ and then moved to location B at time is estimated. Assuming that the movement is a constant velocity linear motion and persists until + Δ , vector BC is obtained by moving vector AB in parallel. The forecasting distribution of all mesh at time + is obtained by interpolating the location of C from each NV obtained from vector BC.

Error Evaluation Method
In this study, the periods when NV values of more than 80% of the PV systems changed to higher than 0.2 in 30 min are referred to as drastic change periods. Generally, PV power forecasting is difficult when a drastic change occurs. Periods with drastic changes were extracted to evaluate forecast errors under harsh conditions. The forecast error is evaluated in a certain area in the Kanto Region with a sufficient number of PVs while varying , which is the parameter in the motion estimation, to minimize the error caused by the interpolation process. Table 3 summarizes the dataset of the  evaluated  area.  The  central  point  of  the  area  is  located at latitude 35°78 ′ and longitude140°04′, and the radius of the area is 15 km. Figure 4 depicts the area, wherein the green circle denotes the evaluated area, and the blue circle with a radius of 65 km indicates the area that has probably been used to estimate NV distribution in the green area.
In this study, persistence forecasting was implemented for comparison. As the influence of the sun elevation in a day is assumed to be represented by multiplying 2 ( ), the PV power output in persistence forecasting can be calculated using Equation (8).  The absolute percent error (APE) is used as an evaluation method for examining the forecast error in drastic change periods, as indicated in (9). Additionally, the definitions of the mean absolute percent error (MAPE) and root mean square error (RMSE) are indicated in Equations (10) and (11), respectively.
where N denotes the number of time steps evaluated, represents the forecasted PV output, denotes the observed PV output, and indicates the maximum observed PV output for a year.

Results and Discussion
This section presents the forecast results of the motion estimation that were evaluated using 101 PVs in the evaluation area. Initially, the importance of extracting and evaluating the period in which the outputs change drastically was confirmed by comparing the forecasting results of days exhibiting a drastic change in the period with those obtained on a clear day. Subsequently, the error distribution of the persistent forecasting and the nine parameters at the extracted times were evaluated to determine the optimized parameter. Finally, the forecast results for a certain period were analyzed to determine the cause of the forecast error at the period when the error was reduced by changing the parameters and at periods when the forecast error could not be reduced by any of the parameters. This analysis improves the understanding of the characteristics of the motion estimation method.

Characteristics of Motion Estimation Based on the Weather
Error! Reference source not found. (a) depicts the sum of the power generation of 101 PV systems. Error! Reference source not found. (b) shows the error of the total output in the evaluation area.  Figure 7 shows the time variation of the observed NV distribution at the period used to estimate NV motion and the forecasting period respectively, at 12:00 on March 26, 2014. Figure 8 illustrates the distribution of motion vectors estimated for forecasting the NV distribution at 12:00 on March 16, 2014, under clear weather conditions. Figure 7 indicates that the change of NVs is small in the evaluation area; the forecast error of the proposed method is similar to that of the persistence forecasting. On a clear day, motion estimation estimates small movements, as depicted in Figure 8.
Error! Reference source not found. (a) andError! Reference source not found. (b) depict the sum of the power generation of 101 PV systems and the corresponding error of the total output in the evaluation area, respectively. Table 5 summarizes the daily average MAPE and RMSE of the total output calculated on September 5, 2013. Figure 10 shows the time variation of the observed NV distribution at the period used to estimate NV motion and the forecasting period respectively, at 12:30 on September 5, 2013. Figure 8 Figure 10 indicates that there was a visible change in the distribution of NVs in the evaluation area; thus, a drastic change occurred in the output power, and the large motion vectors of the NV distribution were calculated, as depicted in Figure 11. In such a period, the forecast error of the proposed method is expected to reduce in comparison with that of the persistence forecasting. Figure 11 indicates that the estimation results differ according to the value of parameter .
This method is classified as a short-term forecast, and PV has a higher correlation with the previous output as the forecast time-horizon scale is shorter. Additionally, the result of motion estimation is close to that of the persistence forecast when the period NV distribution remains unchanged.
In this study, the proposed method is compared with persistence forecasting to verify its effectiveness, and parameter is optimized for the drastic change periods.

Evaluation of the Estimation Error
The estimation results of the proposed forecasting were evaluated to optimize the motion estimation parameter and compared with the persistence forecasting for periods when the output power varied drastically. Figure 12 depicts a box plot of the estimation errors for the periods in the evaluated area. In the case of persistence forecasting, the mean was 23.7%, and the median was 22.2%. In the case of proposed forecasting with the best motion estimation parameter = 0.019, the median was 7.20%, and the mean was 10.3%. In other words, the means of the proposed forecasting with the best parameter reduced by 56.6% in comparison with the persistence forecast. Therefore, it is confirmed that the proposed method is effective for forecasting periods with a drastic change in output power.

Results of Estimation Errors
The estimation errors observed in the proposed method can be attributed to two reasons, namely the pre-processing and motion estimation. During pre-processing, the influence of the differences in tilt angle and installation orientation of the PV arrays is not completely canceled out by the normalization process. Consequently, the representative value of each mesh used in the estimation is the average of all NV values of the corresponding PV systems in the mesh. Furthermore, certain errors are caused by the interpolation of NVs in areas with no data; errors on motion estimation occur when assumptions of NV invariance and NV having a constant velocity linear motion were inappropriate. Figure 13, Figure 14, and Figure 15 depict the box plot of the estimation errors, NV distribution for forecasting, and the estimated flow visualization, respectively, at 12:00 on March 21, 2014. The figures indicate that the result of the proposed forecasting is substantially better than that of the persistence forecasting. Figure 13 indicates that the proposed forecasting with a large parameter does not reduce the error compared to persistence forecasting. Parameter λ is associated with the unity of the vector distribution. Figure 14 (a)-(c) depict a lump of low NVs in the center that expands, and the spread of NV is non-uniform depending on the direction. In this case, the forecast can be improved with a low estimation that considers local motion. Figure 15 shows that lower yields larger vectors. By contrast, certain motion vectors cross each other when λ is extremely small in Figure 15, which is considered unnatural because weather changes are not considered to intersect, λ = 0.019-0.036 is optimal for appropriate motion estimation. These errors are caused when a generated vector distribution does not follow the pattern of weather changes. Figure 16, Figure 17, Figure 18 and Figure 19 depict the box plot of the estimation errors, NV distribution for forecasting, estimated flow visualization at 12:30, and estimation flow visualization at 13:00, all on September 6, 2013, respectively. As indicated in the figures, the result of the proposed forecasting is not better than that of the persistence forecasting. Figure 16 indicates that motion estimation does not reduce the forecasting error compared to that of persistence forecasting. Figure 17 indicates that the movement of the lumps of the NVs may differ at different times of the day, such as between 11:30 and 12:00, and 12:00 and 12:30. Figure 18 shows the distribution of motion vectors with the optimized parameter λ = 0.036 and 0.019 obtained using the NV distribution from − Δ to , and Figure 19 depicts that obtained from to + Δ . Despite the time difference being only Δ = 30 min, the distributions are different from each other. Although it is assumed that the movement obtained from − Δ and persists in the period that is obtained from and + Δ using the proposed method, this assumption is not suitable for this period.

Conclusions
In this study, a parameter for motion estimation, which was insufficiently discussed in a previous study [22], was investigated for a certain geographic area during drastic change periods. The method proposed in this paper is achieved by normalizing the geographically distributed PV output data, converting them to mesh data, analyzing those motions, generating estimated mesh data from the motions, and converting them into the output of each PV for forecasting. The forecasting was performed using the output data of 5097 PVs located in Japan (latitude 31°20′-39°80′, longitude 129°6'-141°60') for every 30 min from August 2013 to September 2014. The period when output changed drastically in a certain area within a radius of 15 km in the Kanto region (latitude 35°78 ′ and longitude140°04 ′ ) was extracted to evaluate the forecasting method. The parameters were evaluated from an error distribution of 101 PV units. In comparison with the persistence forecasting, the parameter for motion estimation, which exhibited the smallest error, was able to significantly reduce the error from 23.7% to 10.3%. Motion estimation indicates that the proposed forecasting with a large does not reduce the error compared to persistence forecasting. This indicates that low λ estimation improves the prediction by considering local motion when estimating the motion of a wide range of non-uniform NVs.
This result indicates that the parameter presents the best forecast for motion estimation and reduces the error by 56.6% than that of persistence forecasting. The proposed method can effectively forecast for periods when PV output changes drastically.
The following studies should be conducted in the future: 1. Correlation between the number of PV data affecting the evaluation area and forecasting accuracy: The proposed forecasting was optimized using data obtainable in this case. However, the positional relationship and density of PV output in the evaluation area that affect the forecasting accuracy have yet to be investigated. Therefore, it should be considered in the future. 2. Consideration of ensemble learning: It has been reported that ensemble learning, which combines different forecasting models, improves the forecasting accuracy. Therefore, we believe that further improvement of accuracy can be achieved by combining ensemble learning with statistical methodology.