Application of Parallel ANN-PSO to Hourly Solar PV Estimation

Based on collected power generation data associated with solar irradiance, PV system conversion efficiency and cell temperature, an hourly solar PV power estimation model using a parallel artificial neural network (ANN) and particle swarm optimization (PSO) algorithm is proposed. Weight matrices for estimating power generation in different seasons and geographic areas are trained on measured operation data. The parallel PSO algorithm, a heuristic global optimization technique, assists the ANN training process in reaching near-optimal solutions. The accuracy and reliability of the estimation are verified against actual data from photovoltaic power stations of various regions and scales. The results of the estimation model can not only help the electricity dispatcher monitor the trend of solar power generation in different areas, but also support coordination with traditional power plants to meet load demand more accurately. The presented model will benefit power dispatching as intermittent and unstable solar power generation grows in scale.


Introduction
Solar photovoltaic (PV) systems are fast-growing renewable energy sources; nevertheless, varying solar irradiation makes PV power output intermittent and non-dispatchable. Accurate estimation of solar irradiance and generated electricity can reduce the possibility of power imbalance and enhance system reliability, so forecasting or estimating the electricity produced by solar PV panels in the system network is important for power system coordinators and planners. Solar irradiance, solar PV receiving area, system conversion efficiency and PV cell temperature are the major factors in solar power generation. Solar irradiance, which includes direct and diffuse radiation, is uncertain and difficult to predict. System conversion efficiency, which depends on solar PV conversion, the inverter and converter set, electronic circuit loss, dust content and temperature characteristics, has improved with progress in science and technology.
Much of the literature has focused on the forecasting or estimation of solar power generation, and the techniques can be classified into several types. Distinct kinds of artificial neural network (ANN) approaches have been used to solve problems such as prediction, classification and pattern recognition [1]-[4]. With suitably chosen training data, an ANN model can predict future trends even when some immediate data are unavailable or missing. The key factors affecting solar PV electricity are solar irradiance, seasonal weather, temperature, humidity and system efficiency. A hybrid forecasting algorithm combining gradient descent and ANN was developed for short-term solar power prediction [1]. Because meteorological conditions affect PV system output, an ANN was trained on power output data based on fuzzy theory and weather report data [2]. A forecasting model was designed by combining particle swarm optimization (PSO) with a back propagation neural network (BPNN); regardless of day-type transitions and other relevant factors, its prediction accuracy satisfies the management requirements of power grid companies [3]. A deep convolutional neural network (CNN) was applied to short-term PV power forecasting [4]. In that method, a two-dimensional data form with correlations on both daily and hourly timescales reduces the data size and makes model training with meteorological elements easier. Conventional models are not sufficient for precise day-ahead PV power forecasting under extreme weather conditions. Taking aerosol index data as an additional input parameter, a BPNN approach was used to forecast the next 24-h PV power outputs [5]. An enhanced technique for short-term electric load forecasting lowers computational time and improves forecasting performance relative to other existing techniques [6]. A hybrid quantized Elman neural network (HQENN) uses a minimal number of quantized inputs, the hourly historical load and the predicted target temperature, to achieve acceptably high precision in short-term load forecasting [7].
Much research has been devoted to finding optimum solutions with the PSO algorithm in solar PV estimation. PSO was used to improve the accuracy of a simple physical dynamic model [8]. The effectiveness of the PSO-based parameter estimation for the RC circuit model was verified using an experimental dataset; compared with three other models, it provides a significantly better temperature estimate for the thermal inertia of a PV module. The application of PSO to extracting solar cell parameters using a single-diode model is presented in [9]. The algorithm was implemented and tested using measurement data from a commercial silicon solar cell, and the reported outcomes are promising and surpass the other methods considered for solar cell optimization problems. Furthermore, equivalent circuit parameters of solar PV cells extracted by PSO retain good precision under variations of solar insolation and environmental temperature [10]. Quantifying the uncertainties associated with solar PV power generation forecasts is essential for managing solar PV farms in the grid. A hybrid PSO model was applied to solar PV forecasting, and test results demonstrate a high degree of efficiency in multiple seasons, including sunny, cloudy and rainy days [11]. A parallel PSO method was used to extract and estimate the parameters of a PV model in relation to the traditional I-V characteristic; the simulation results show that it accelerates computation while keeping the same accuracy as PSO [12].
Much research has also been devoted to the forecasting and estimation of solar PV generation or solar irradiance [13]-[25]. Case study results show that ANN algorithms handle prediction and uncertainty well. In addition, solar PV prediction is usually a critical part of supply-demand planning in an electrical grid. Besides ANN methods, other algorithms and models have been applied to solar PV or irradiance prediction. With classified historical PV power output data, support vector regression (SVR) was trained on input/output data sets of temperature, probability of precipitation and solar irradiance to improve prediction accuracy [26]. A daily prediction model embedded in a solar PV monitoring system, based on cloud and temperature forecast data, was presented in [27]. Additionally, a daily prediction model of solar PV was proposed in [28]. More sophisticated prediction models use various weather indices such as dew point, wind speed and direction, and humidity; nevertheless, atmospheric conditions are usually uncertain, and obtaining an acceptable prediction from many undetermined data is difficult and impractical. The Elman neural network (ENN) has been applied to discrete-time sequence prediction, where a relatively small network can yield better convergence and generalization performance [29]-[30].
There is limited research in the literature on PSO algorithms for solar PV power estimation based on real solar irradiance measurements and system conversion efficiency. First, this paper presents a parallel ANN-PSO algorithm (PANN-PSO) to estimate solar PV electricity generation. Second, it accounts for the main characteristics of solar PV systems, namely intermittency, randomness and volatility, which are closely connected to meteorological conditions. Mean absolute percentage error (MAPE) and root mean square error (RMSE) are used to verify the accuracy of the models. The remainder of this paper is organized as follows. Section 2.1 briefly reviews the PANN-PSO algorithm. Section 2.2 proposes an estimation model of solar electricity. Section 3 presents practical numerical results. Section 4 discusses future studies. Finally, conclusions are given in Section 5.

Parallel ANN-PSO algorithm
A neural network in engineering is composed of many non-linear arithmetic units (i.e., neurons) and the connections between them. The units operate in a parallel and decentralized manner, similar to human neural structures. The network handles large amounts of data and learns from input information to produce output results. An artificial neural network is suited to prediction because it analyzes the input parameters and finds the best connections between input and output parameters. At present, neural networks are effective solution seekers in many fields, such as machine learning, image processing, signal processing and computer science. Fig. 1 illustrates a basic neural network architecture. The network consists of three layers: input, hidden and output. A matrix W refers to the interconnection strengths between two layers. The main purpose of a neural network is to obtain an acceptable output by updating the weight matrices based on the input information. Mathematically, Eq. (1) represents the relationship between the input and hidden layers, and Eq. (3) defines the link between the hidden layer and the output. A sigmoid activation function, applied in (2) and (4) respectively, makes the response to the signal more natural. Gradient descent, as shown in (5) to (7), moves the outputs closer to the targets. The goal of the optimization is to repeat these steps until the deviation between outputs and targets reaches a minimum. Gradient descent is a good way of finding the minimum of a function that is too complex to solve using simple algebra. To avoid ending up in the wrong valley of the solution space, it is necessary to train the neural network several times, starting from different initial points. Choosing different points means selecting different initial parameters, i.e., link weights, for the neural network. Adopting different initial link weights may still yield inappropriate answers, or may be time-consuming trial and error. Therefore, updating the weights by other artificial intelligence methods has become an important part of optimization studies in neural networks.
Particle swarm optimization (PSO), which draws on artificial life and social psychology as well as engineering and computer science, differs from evolutionary computation in its treatment of population members, called particles. When the particles are initialized, they are given random positions and stochastically assigned velocities. At every iteration, each particle's velocity is accelerated toward its previous best position and a neighborhood best position. The best position corresponds to the highest or lowest value of the fitness (cost) function.
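The forward pass described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `sigmoid`, `forward` and `total_error` are hypothetical, bias terms are omitted, and the half sum of squared errors is assumed as the training objective because Eqs. (1) to (7) are not reproduced here.

```python
import math

def sigmoid(x):
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_output):
    """Propagate inputs through one hidden layer and one output layer.

    w_hidden[h][i] links input i to hidden neuron h;
    w_output[o][h] links hidden neuron h to output neuron o.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]

def total_error(outputs, targets):
    """Assumed objective: half the sum of squared output-target deviations."""
    return 0.5 * sum((t - o) ** 2 for o, t in zip(outputs, targets))
```

A training method, whether gradient descent or PSO, would then adjust `w_hidden` and `w_output` to reduce `total_error` over the training set.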
Optimization is everywhere, from engineering to finance and decision making. In almost all applications in engineering and industry, practitioners seek to optimize cost, benefit or efficiency. Optimization algorithms can be classified in different ways. One common way, focusing on the nature of the algorithm, is to divide algorithms into deterministic and stochastic ones. Deterministic algorithms follow a fixed procedure, so their design variables and function values are repeatable. Hill climbing is an example: from the same starting point, it always follows the same route.
Stochastic algorithms involve some randomness, so their final solutions differ slightly from run to run. Unlike gradient descent, which requires a differentiable activation function to calculate derivatives, they do not demand a differentiable or continuous function, and they can converge better without becoming stuck at local minima. Algorithms can also be categorized by their source of inspiration as bio-inspired, nature-inspired or, more generally, meta-heuristic. PSO is inspired by the social and cooperative behavior that various species display to meet their needs in the search space. The algorithm is guided by personal experience (Lbest), overall experience (Gbest) and the present movement of the particles to decide their next positions in the search area.
Assume that the initial position of a swarm population of size N and dimension D is denoted as X = [X1, X2, ..., XN]^T, where 'T' denotes the transpose operator; X also represents the weight matrices of the neural network. Each individual particle Xj (j = 1, 2, ..., N) is given as Xj = [Xj,1, Xj,2, ..., Xj,D]. Each randomly generated particle must lie within the lower and upper bounds of each of its components, i.e., for the jth particle LBi ≤ Xj,i ≤ UBi (i = 1, 2, ..., D), where LBi and UBi are the lower and upper bounds on the ith component of the solution vector. Similarly, the initial velocity of the population is denoted as XV = [XV1, XV2, ..., XVN]^T, and the velocity of each particle Xj (j = 1, 2, ..., N) is given as XVj = [XVj,1, XVj,2, ..., XVj,D]. The index j ranges from 1 to N, whereas the index i varies from 1 to D. The experiences are accelerated by two factors c1 and c2 and by two random numbers r1 and r2, generated uniformly between 0 and 1, whereas the present movement is multiplied by an inertia factor Ir varying between Ir,max and Ir,min. In the PSO algorithm, the updated velocity and position of each particle in the search space are given by (8) and (9):

$$XV_{j,i}^{k+1} = I_r\,XV_{j,i}^{k} + c_1 r_1\left(\mathrm{Lbest}_{j,i}^{k} - X_{j,i}^{k}\right) + c_2 r_2\left(\mathrm{Gbest}_{i}^{k} - X_{j,i}^{k}\right) \tag{8}$$

$$X_{j,i}^{k+1} = X_{j,i}^{k} + XV_{j,i}^{k+1} \tag{9}$$

In (8), Lbest_{j,i}^k represents the personal (local) best ith component of the jth individual, whereas Gbest_i^k represents the ith component of the best individual b of the population up to iteration k. Furthermore, Ir in (8) is the inertia weight of the velocity, defined in (10). The inertia factor Ir decreases linearly from Ir,max to Ir,min as the iteration count advances from its start towards its maximum value:

$$I_r = I_{r,\max} - \left(I_{r,\max} - I_{r,\min}\right)\frac{k}{ite_{\max}} \tag{10}$$

In (10), Ir,max and Ir,min are the upper and lower limits of the inertia weight, k is the current iteration count and itemax is the maximum iteration count.
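As a small worked illustration of the linearly decreasing inertia factor and of a single velocity and position update, consider the following Python sketch. The function names and the default limits 0.9 and 0.4 are assumptions chosen for illustration; the paper does not state its parameter values.

```python
def inertia_weight(k, ite_max, ir_max=0.9, ir_min=0.4):
    """Inertia factor at iteration k, falling linearly from ir_max to ir_min.

    The default limits are illustrative assumptions, not values from the paper.
    """
    return ir_max - (ir_max - ir_min) * k / ite_max

def update_component(x, v, lbest, gbest, ir, c1, c2, r1, r2):
    """One velocity-then-position update for a single particle component."""
    v_new = ir * v + c1 * r1 * (lbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new
```

For example, with x = 1.0, v = 0.5, Lbest = 2.0, Gbest = 3.0, Ir = 0.5, c1 = c2 = 2.0, r1 = 0.5 and r2 = 0.25, the new velocity is 0.25 + 1.0 + 1.0 = 2.25 and the new position is 3.25.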
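The new position of each particle must lie within the specified boundaries. If any component of a particle violates the boundary condition, that component is set to the violated bound, as shown in (11), where LBi and UBi are the lower and upper bounds on the ith component:

$$X_{j,i}^{k+1} = \begin{cases} LB_i, & X_{j,i}^{k+1} < LB_i \\ UB_i, & X_{j,i}^{k+1} > UB_i \end{cases} \tag{11}$$

In the PSO iteration, each particle is the concatenation of the neural network weight matrices of the hidden layer and the output layer (12).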
The initial Lbest of each particle is its initial weights, whereas the initial Gbest is the initial weights of the best particle in the randomly initialized population. The updating rule of Lbest and Gbest is as follows. At iteration k, if Etotal(W^{k+1}) < Etotal(Lbest^k) then Lbest^{k+1} = W^{k+1}; otherwise Lbest^{k+1} = Lbest^k, where Etotal(·) is the objective (fitness) function, i.e., the error to be minimized. Gbest is updated in the same way from the best Lbest of the population.
The overall PSO algorithm can be expressed using the following procedure:
Step 1: set the PSO parameters Ir, c1 and c2.
Step 2: initialize the weights X and velocities XV of each particle of the population.
Step 3: evaluate the fitness Etotal of each particle in the population and find the best particle index b.
Step 4: set the Lbest of each particle to its initial weights.
Step 5: set the Gbest to the weights of the best particle b and set the iteration count k = 1.
Step 6: update the velocity and position of each particle.
Step 7: evaluate the updated fitness Etotal of each particle for all j, i and find the best particle index b.
Step 8: update the Lbest of each particle: if Etotal(Xj^{k+1}) < Etotal(Lbestj^k) then Lbestj^{k+1} = Xj^{k+1}, else Lbestj^{k+1} = Lbestj^k.
Step 9: update the Gbest of the population: if Etotal(Lbestb^{k+1}) < Etotal(Gbest^k) then Gbest^{k+1} = Lbestb^{k+1}, else Gbest^{k+1} = Gbest^k.
Step 10: if k < itemax then k = k+1 and go to Step 6, else go to Step 11.
Step 11: the optimum solution is obtained.
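The procedure above can be sketched as a short serial Python routine. This is a minimal sketch under stated assumptions, not the authors' parallel implementation: the function `pso_minimize`, the scalar bounds shared by all components, the parameter defaults (c1 = c2 = 1.5, inertia from 0.9 to 0.4) and the `sphere` test objective are all hypothetical. In the paper's setting, the fitness would be the network error Etotal evaluated on the weights encoded in each particle.

```python
import random

def pso_minimize(fitness, dim, lower, upper, n_particles=30, ite_max=200,
                 c1=1.5, c2=1.5, ir_max=0.9, ir_min=0.4, seed=0):
    """Minimize `fitness` over [lower, upper]^dim; returns (gbest, gbest_fitness)."""
    rng = random.Random(seed)
    span = upper - lower
    # Step 2: random positions inside the bounds and small random velocities.
    x = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-0.1, 0.1) * span for _ in range(dim)] for _ in range(n_particles)]
    # Steps 3-5: initial fitness, Lbest of every particle, Gbest from particle b.
    lbest = [p[:] for p in x]
    lbest_fit = [fitness(p) for p in x]
    b = min(range(n_particles), key=lambda j: lbest_fit[j])
    gbest, gbest_fit = lbest[b][:], lbest_fit[b]
    for k in range(1, ite_max + 1):
        # Inertia factor decreases linearly with the iteration count.
        ir = ir_max - (ir_max - ir_min) * k / ite_max
        # Step 6: update velocity and position, clamping positions to the bounds.
        for j in range(n_particles):
            for i in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[j][i] = (ir * v[j][i]
                           + c1 * r1 * (lbest[j][i] - x[j][i])
                           + c2 * r2 * (gbest[i] - x[j][i]))
                x[j][i] = min(max(x[j][i] + v[j][i], lower), upper)
        # Steps 7-8: evaluate the new positions and update each Lbest.
        for j in range(n_particles):
            fit = fitness(x[j])
            if fit < lbest_fit[j]:
                lbest[j], lbest_fit[j] = x[j][:], fit
        # Step 9: update Gbest from the best Lbest of the population.
        b = min(range(n_particles), key=lambda j: lbest_fit[j])
        if lbest_fit[b] < gbest_fit:
            gbest, gbest_fit = lbest[b][:], lbest_fit[b]
    # Steps 10-11: the loop ends at ite_max and Gbest is returned.
    return gbest, gbest_fit

def sphere(p):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(c * c for c in p)
```

Calling `pso_minimize(sphere, dim=2, lower=-5.0, upper=5.0)` returns the best position found and its fitness. A fixed seed makes the stochastic run repeatable, which is convenient when comparing parameter settings.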

Proposed Estimation Model of Solar Electricity
As mentioned in the previous section, solar irradiance and system efficiency are crucial for solar PV generation. Another key factor is cell temperature, which is often discussed in the literature: PV cells suffer an efficiency drop as their operating temperature increases. Solar irradiance data for PANN-PSO training come from real measurements during normal plant operation. System conversion efficiency is more complicated and difficult to quantify because many factors influence it, for example the elevation or orientation angle of the solar PV panel, the maintenance frequency and the solar cell temperature. Some of this information is unknown and laborious or impossible to obtain, while historical operation records are readily available to the estimation model. Actual electricity generation can, to some extent, represent system efficiency: actual electricity divided by power capacity, as in (13), serves as a physical substitute for system efficiency. The power generation data are normalized as in (14), and the estimated power generation is returned to its original scale when the calculation sequence is finished. Solar irradiance is expressed uniformly in units of W/m2. Furthermore, system conversion efficiency and PV cell temperature may be normalized for weight matrix training. In (15), solar irradiance is divided by 1000 W/m2, and in (16), 50 ℃ is selected as the base value of solar cell temperature; another reasonable base temperature may be chosen instead.
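The normalization described above can be illustrated with a short Python sketch. The function and parameter names are hypothetical; only the base values of 1000 W/m2 and 50 ℃ and the division of generation by plant capacity come from the text.

```python
def normalize_inputs(irradiance_w_m2, cell_temp_c, generation_kw, capacity_kw,
                     base_irradiance=1000.0, base_temp=50.0):
    """Return (normalized irradiance, normalized temperature, generation per kW).

    Generation per kW of capacity doubles as the system-efficiency proxy.
    """
    return (irradiance_w_m2 / base_irradiance,
            cell_temp_c / base_temp,
            generation_kw / capacity_kw)

def restore_generation(generation_per_kw, capacity_kw):
    """Map a per-kW estimate back to the plant's original scale in kW."""
    return generation_per_kw * capacity_kw
```

For a 157 kW plant producing 78.5 kW at 800 W/m2 and a cell temperature of 40 ℃, this gives a normalized irradiance of 0.8, a normalized temperature of 0.8 and an efficiency proxy of 0.5.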
For high precision, the estimation model has to accumulate as much useful data as possible. The main sunshine hours are usually from 6:00 to 18:00, although they differ somewhat across the four seasons. Fig. 2 and Fig. 3 show typical curves of solar irradiance and electricity versus hour in winter and summer at the selected site. Accordingly, the prediction timescale of the model is from 6:00 to 18:00. To improve accuracy, this timescale is divided into three sections per day: the growth period (6:00 to 9:00), the peak period (10:00 to 14:00) and the recession period (15:00 to 18:00).
Step 2: normalize the data to keep the data units consistent and to avoid singular values in the training process. For example, the power generation is converted into power generation per kW in every case, and solar irradiance is transformed into the normalized solar irradiance of site i at time t.
Step 3: categorize the data into three time segments, namely the growth period (6:00 to 9:00), the peak period (10:00 to 14:00) and the recession period (15:00 to 18:00), to improve the accuracy of training. Low irradiance usually occurs in the morning and evening and causes very low solar PV output. Moreover, in those periods the system conversion efficiency deviates from the linear part of the solar PV generation curve. Data outside 6:00 to 18:00 are ignored.
Step 4: create PANN-PSO models corresponding to the three intervals.
Step 5: evaluate the accuracy by the mean absolute percentage error (MAPE) and root mean square error (RMSE).
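The categorization in Step 3 can be sketched as follows. The function names and the (hour, value) record layout are assumptions made for illustration; the period boundaries are those stated in the text.

```python
def period_of(hour):
    """Return the training period for an hour of day, or None if outside 6-18."""
    if 6 <= hour <= 9:
        return "growth"
    if 10 <= hour <= 14:
        return "peak"
    if 15 <= hour <= 18:
        return "recession"
    return None

def split_by_period(records):
    """Group (hour, value) records by period, dropping hours outside 6-18."""
    groups = {"growth": [], "peak": [], "recession": []}
    for hour, value in records:
        period = period_of(hour)
        if period is not None:
            groups[period].append((hour, value))
    return groups
```

Each of the three groups would then feed its own PANN-PSO model, as in Step 4.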
To verify the performance of the proposed approach, two criteria, MAPE and RMSE, are used. MAPE measures the relative error between the estimated value and the actual value; smaller is better. The evaluation equation is given in (17).
Table 3 shows the performance of the above-mentioned model, verified with one week of operational data as the analysis data set. When the data are grouped by the same hour of each day in the week, as shown in Fig. 4, considerable weather variation makes solar PV generation unstable throughout the week; nevertheless, an acceptable estimate can still be obtained. In general, the results show that the model achieves considerable accuracy.
Fig. 6. Schematic procedure of model training.
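The two criteria can be computed as in the following Python sketch, which uses the standard definitions of MAPE and RMSE. The function names are hypothetical, and skipping hours with zero actual generation in MAPE is an assumption made here to avoid division by zero; the paper does not state how such hours are handled.

```python
import math

def mape(actual, estimated):
    """Mean absolute percentage error in percent, over nonzero actual values."""
    pairs = [(a, e) for a, e in zip(actual, estimated) if a != 0]
    return 100.0 * sum(abs((a - e) / a) for a, e in pairs) / len(pairs)

def rmse(actual, estimated):
    """Root mean square error, in the same unit as the inputs."""
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))
```

For actual values of 100 and 200 kW with estimates of 110 and 180 kW, MAPE is 10% and RMSE is the square root of 250, about 15.8 kW.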

Model Verification Flow
The verification sequence is as follows:
Step 1: the weight matrices, trained by the PANN-PSO model on the power generation data of the selected sites from Mar. 2017 to Mar. 2018, are prepared for verification.
Step 2: operation data of solar irradiance, system conversion efficiency and PV cell temperature are classified according to timescale and area as input data of the model.
Step 3: the input data are substituted into the trained weight matrices to obtain the solar power generation estimate.
Step 4: the actual solar power generation is compared with the estimated value, and the error ranges are checked to verify that the model can be applied to the estimation of solar power generation in various cases.

Examination of estimation performance
The examination results are presented in this section to verify the estimation performance of the proposed model. For an existing solar power plant with a capacity of 157 kW, a comparison curve of estimated generation (Est. Gen) and actual generation (Act. Gen), a table of MAPE and RMSE for different combinations of inputs, and a chart of estimation error are presented and discussed.
The factors that affect solar system generation are complicated. This paper concentrates on three key factors. The first is a variable system conversion efficiency, which stands in for all plant-specific factors of a solar plant in the field. The second is the measured and recorded solar irradiance. The third is the cell temperature of the solar PV. Table 4 and Table 5 present the model applied with different inputs, namely cell temperature, solar irradiance and system conversion efficiency, over one day and one week. In Table 4, in terms of MAPE, system conversion efficiency as an input performs better than cell temperature, and integrating all three factors yields a better RMSE in the test case. Table 5 shows different results: Case3, with PV cell temperature and solar irradiance, outperforms the other two cases in both RMSE and MAPE. Detailed values of the one-week estimation and deviation error are given in Fig. 7 and Fig. 8. In short, the results show that all three cases can be used as inputs of the PANN-PSO model and yield acceptable estimates of solar PV power generation. However, cell temperature as an input in the learning process has some problems. First, it is not easy to choose a base temperature for normalizing the input data. Moreover, at very low irradiance, when almost no solar power can be generated, the cell still has a temperature. Because all three factors may affect the learning procedure, the following test results are verified with different combinations of input1, input2 and input3, and the best-performing combination is illustrated. Fundamentally, sites at lower latitudes have better power generation efficiency than those at higher latitudes. However, compared with other countries in the world, the selected region is quite narrow and small from north to south.
Because climatic characteristics differ considerably across regions and seasons, the whole year of generation data used as training sets is classified into four seasons and three areas. The four seasons are spring, summer, fall and winter, and the three areas are the test sites in the north, center and south. Furthermore, there are three input combinations, so the number of trained weight matrices in the test model is 36. Every plant is verified by three test cases. Case1 uses solar irradiance and system conversion efficiency as inputs. Case2 has three inputs, adding PV cell temperature. Finally, Case3 uses solar irradiance and PV cell temperature only. Table 6 presents the basic data of the test cases. The proposed approach is used to estimate hourly solar PV power output from predicted solar irradiance and other available information. The experiments are implemented in MATLAB R2015b on a PC with the Windows 7 64-bit operating system. The trial results for the north, central and south sites are presented separately as follows. First, the results for the north site plant in the four individual seasons are shown in Fig. 9 to Fig. 16. The estimation errors of the model in hourly power generation are less than 5% during the summer season. Larger deviation errors occur in fall and winter due to marked changes in atmospheric conditions; according to observations of the Central Weather Bureau, the sunshine duration is low in these seasons. Second, Fig. 17 to Fig. 24 show a similar phenomenon to the first case. In winter, the estimation errors reach nearly 13% on some days, as shown in Fig. 24. Finally, for the south site plant, as illustrated in Fig. 25 to Fig. 32, the average deviation errors are almost all below 5% across the four seasons. In our test cases, however, the two inputs of solar irradiance and system conversion efficiency are usually available and easy to estimate from historical trends for training the proposed model, even if some periods of measurements are missing or distorted.
Last but not least, solar energy is intermittent and unstable compared to traditional fossil fuel-based power generation. Weather conditions that change significantly within the same day, or over consecutive days in a month, as shown in Fig. 9 to Fig. 32, make PV power generation difficult to estimate. The test results show that the PANN-PSO model, trained on a full year of generation data, can bring the estimation accuracy into an acceptable range.
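The count of 36 trained weight matrices stated in this section follows from 4 seasons × 3 areas × 3 input cases. A short Python sketch makes the bookkeeping explicit; the key layout and input labels are illustrative assumptions, not the authors' data structure.

```python
from itertools import product

SEASONS = ("spring", "summer", "fall", "winter")
AREAS = ("north", "central", "south")
# Input combinations of the three test cases, with hypothetical labels.
CASES = {
    "case1": ("irradiance", "efficiency"),
    "case2": ("irradiance", "efficiency", "cell_temperature"),
    "case3": ("irradiance", "cell_temperature"),
}

# One (season, area, case) key per trained weight matrix.
weight_matrix_keys = list(product(SEASONS, AREAS, CASES))
```

Here `len(weight_matrix_keys)` is 36, so a dictionary keyed this way could hold one trained weight matrix per season, area and input case.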

Discussion
Renewable energy policies and subsidies related to solar energy are widely used in various countries and cities, leading to the continuous increase/expansion of solar installations in various regions. In this paper, first, a PSO-ANN model for hourly forecasting was established. Using existing training datasets, three models for various timescales could forecast solar PV electricity of chosen sites in the northern, central, and southern regions. The results showed that the model achieved high accuracy, and could be applied to all cases in the northern, central, and southern areas. Moreover, the proposed model has a practical use in all four seasons. We showed that the solar irradiance and system conversion efficiency of solar PV systems are important parameters to effectively estimate solar power. The model was used to estimate the total power generation in a region with unknown solar PV plants. The results illustrated that the accuracy and feasibility of the area estimation model were reasonable and acceptable. In terms of the solar irradiance, system conversion efficiency, geographical location, and unavailable physical measurement data, the proposed method can provide a superior model for the forecasting of solar PV generation. It could be beneficial to electricity coordinators to organize generation and distribution schedules in the absence of accurate real-time feedback information of all the solar PV plants in their dispatching grids.
Future studies will focus on implementing some improvements. To preprocess the input data, a statistical method such as structural equation modeling (SEM) can be applied to search for individual and mutual relationships of the input and output variables. Data without strong connections can possibly be removed or modified. Recurrent neural networks have limitations in learning long-term dependencies. Long short-term memory (LSTM) addresses the exploding or vanishing gradient problem. Stochastic or random algorithms, such as particle swarm optimization (PSO), can be utilized to determine optimal solutions.

Conclusions
To conclude, renewable energy policies and subsidies related to solar energy are widely applied in various countries and cities, which has led to continuous growth of solar plant investment in various regions. Because solar power is intermittent and non-dispatchable, accurately estimating the electricity of solar photovoltaic panels in the system network is important for power system coordinators and planners. In this paper, a real 157.3 kW solar plant is first used as a sample to evaluate the RMSE and MAPE of different ANN-related algorithms under the same arbitrary initial weight matrix and conditions. The proposed PANN-PSO algorithm performs better than BP-NN and Elman-NN on both verification criteria, RMSE and MAPE. Furthermore, an hourly PANN-PSO estimation model based on solar irradiance, system conversion efficiency and solar cell temperature is proposed. With weight matrices trained on existing field operation data sets, the power generation of three test sites in different locations is estimated for all four seasons, so the estimates represent the whole year. The test results show that the model has acceptable accuracy even when weather conditions change considerably within the same day or over consecutive days in a month.