Developing invasive weed, social spider, shuffled frog leaping, biogeography-based, and harmony search optimization algorithms for the early prediction of residential building’s cooling load simulation

Given the high efficiency of metaheuristic techniques in energy performance analysis, this paper scrutinizes and compares five novel optimizers, namely biogeography-based optimization (BBO), invasive weed optimization (IWO), social spider algorithm (SOSA), shuffled frog leaping algorithm (SFLA), and harmony search algorithm (HSA), for the early prediction of cooling load (CL) in residential buildings. The algorithms are coupled with a multi-layer perceptron (MLP) to adjust the neural parameters that connect the CL with the influential factors. The complexity of the models is optimized through a trial-and-error effort, which showed that the BBO and IWO need larger populations to complete the optimization. The results revealed that the internal parameters (i.e., biases and weights) suggested by the BBO generate the most reliable MLP for both analyzing and generalizing the CL pattern (with nearly 93 and 92% correlations, respectively). Following this, the IWO emerged as the second most powerful optimizer, with mean absolute errors of 1.8632 and 1.9110 in the training and testing phases, respectively. Therefore, the BBO-MLP and IWO-MLP can be reliably used for accurate analysis of the CL in future projects.

These algorithms have also shown high capability in optimizing the performance of well-known predictors. More specifically, most typical predictors such as ANNs, SVM, and ANFIS struggle with high-dimensional problems or may fall into traps like local minima in complex modeling [119,120]. Utilizing optimizers helps adjust the hyperparameters of these models properly. In studies by Moayedi, et al. [121] and Le, et al. [122], for example, the efficiency of the GWO and PSO metaheuristic algorithms for optimizing the ANN is investigated. The optimization robustness of the imperialist competition algorithm (ICA) and genetic algorithm (GA) was compared by Tien Bui, et al. [123]. They concluded that both algorithms can effectively reduce the prediction error for both the heating load (HL) (by around 18 and 23% for the GA and ICA, respectively) and the CL (by around 21 and 25%). The ICA-based ensemble was also introduced as the superior model. The social behavior of the elephant was applied to the same problem by Moayedi, et al. [124], and it was found that the proposed algorithm outperforms those based on the lifestyle of the ant and the Harris hawk. Similarly, Zhou, et al. [125] established a comparison between the artificial bee colony and PSO. Considering the high applicability of metaheuristic algorithms in the field of CL or HL simulation, this study conducts a comparison between five novel metaheuristic techniques, namely biogeography-based optimization, invasive weed optimization, social spider algorithm, shuffled frog leaping algorithm, and harmony search algorithm, for optimal prediction of cooling load in residential buildings. The results are evaluated by several accuracy criteria to demonstrate the capability of the models in analyzing and predicting the CL pattern. Also, a score-based ranking system is applied to detect the most capable optimizer.

Methodology
As explained above, this paper studies the optimization of the ANN (for cooling load simulation) by using various optimization techniques. To this end, an MLP neural network is trained by five metaheuristic techniques: BBO, IWO, SOSA, SFLA, and HSA. During this process, the main role of these algorithms is to find the most suitable computational parameters of the MLP (i.e., the weights and biases) so as to overcome the drawbacks noted above. A brief description of the named algorithms is presented in this section; to keep the paper short, related references are cited for further details and mathematical relationships.
Biogeography-based optimization (BBO) is a capable search method developed by [126] in 2008 that works based on geographical distributions. This algorithm has previously been used by scholars like Moayedi, et al. [127] to train ANNs for spatial analysis of geotechnical hazards. The solutions of the BBO are represented by habitats, and the goodness of each is measured by a habitat suitability index. The algorithm comprises two steps, migration and mutation, during which the candidate solution is modified to achieve a better-fitted response. The mutation step makes the algorithm resistant against local minima.
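The migration and mutation steps described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the rank-based linear migration rates, the elitism step, and all names and default values (`bbo_minimize`, `p_mutate`, `bounds`) are assumptions chosen for illustration, and the cost function stands in for the habitat suitability index (lower cost meaning a more suitable habitat).

```python
import random

def bbo_minimize(cost, dim, pop_size=20, generations=100, p_mutate=0.05,
                 bounds=(-1.0, 1.0), seed=0):
    """Toy BBO loop: migration, mutation, and elitism (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    # Rank-based rates (assumed linear): the best habitat emigrates most
    # and immigrates least.
    mu = [(pop_size - k) / (pop_size + 1) for k in range(pop_size)]
    lam = [1.0 - m for m in mu]
    for _ in range(generations):
        pop.sort(key=cost)              # best (lowest cost) habitat first
        elite = list(pop[0])
        new_pop = []
        for k in range(pop_size):
            habitat = list(pop[k])
            for d in range(dim):
                if rng.random() < lam[k]:
                    # Migration: copy a feature from a donor chosen in
                    # proportion to its emigration rate.
                    donor = rng.choices(range(pop_size), weights=mu)[0]
                    habitat[d] = pop[donor][d]
                if rng.random() < p_mutate:
                    # Mutation: random reset to help escape local minima.
                    habitat[d] = rng.uniform(lo, hi)
            new_pop.append(habitat)
        new_pop.sort(key=cost)
        new_pop[-1] = elite             # elitism: keep the previous best
        pop = new_pop
    return min(pop, key=cost)

def sphere(v):
    return sum(x * x for x in v)

best = bbo_minimize(sphere, dim=3)
```

When training the MLP, `cost` would be the prediction error of the network whose weights and biases are encoded in the habitat.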
Mehrabian and Lucas [128] introduced the invasive weed optimization (IWO) as a nature-inspired optimizer. As the name implies, the aim of the algorithm is to find the most suitable place for plants (weeds) to grow and reproduce. The IWO is based on five steps, namely (a) initialization, (b) reproduction, (c) spatial dispersal, (d) competitive exclusion, and (e) stopping criteria evaluation. After the reproduction, some seeds are settled near the parent plants, and they are combined with the weeds to create the next generation.
James and Li [129] designed the social spider algorithm (SOSA) by imitating the way that social spiders seek food. The agents are social spiders that move within a multidimensional space, which is considered their web in this algorithm. The candidate solutions are represented by the positions of the spiders, which are in close contact with one another. Therefore, the spiders' positions and the corresponding goodness values are two important ingredients of the SOSA. The optimization information (e.g., fitness values) is recorded in the individuals' memory. In this technique, the intensity of vibrations is indicative of the solution fitness.
The shuffled frog leaping algorithm (SFLA) is another popular search technique, suggested by Eusuff and Lansey [130] in 2003. The SFLA presents a combination of the PSO and the memetic algorithm. Its two advantages, simplicity and high convergence speed, have made it a broadly used method. The individuals in this algorithm are frogs, which are grouped into units called "memeplexes". The frogs are classified on the basis of their fitness. After the frogs are sorted in descending order of fitness, the best-fitted ones become the first members of the memeplexes, the next-fitted frogs become the second members, and so on. The positions of the frogs are then updated to implement the optimization.
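The memeplex grouping described for the SFLA can be sketched as follows. This is a minimal illustration, not the code used in this study; the function name `partition_into_memeplexes` and the toy frogs are hypothetical, and a higher fitness value is assumed to mean a better frog (for error minimization, the negative error could serve as fitness).

```python
def partition_into_memeplexes(frogs, fitness, n_memeplexes):
    """Sort frogs by descending fitness and deal them out in turn."""
    ranked = sorted(frogs, key=fitness, reverse=True)  # best frog first
    memeplexes = [[] for _ in range(n_memeplexes)]
    for rank, frog in enumerate(ranked):
        # Rank 0 goes to memeplex 0, rank 1 to memeplex 1, and so on,
        # wrapping around so each memeplex gets a mix of good and poor frogs.
        memeplexes[rank % n_memeplexes].append(frog)
    return memeplexes

# Toy example: each frog is just a number that is its own fitness.
groups = partition_into_memeplexes([3, 9, 1, 7, 5, 8], lambda f: f, 3)
# groups == [[9, 5], [8, 3], [7, 1]]
```

Dealing the sorted frogs out in turn gives every memeplex a comparable spread of fitness values, so no group starts with only poor candidates.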
The harmony search algorithm (HSA) was introduced by Geem, et al. [131] in 2001. Scholars have previously used the HSA for optimizing the ANN [132,133]. This algorithm draws on the action of a musician who improvises the pitches of an instrument to achieve a better harmony. A memory called the harmony memory is first initialized, and then new harmonies are improvised and the memory is updated to find the best responses. This process is governed by two parameters, the pitch adjusting rate and the harmony memory considering rate. As a merit, the HSA (similar to the GA) possesses a genetic pool to store the solutions. More details, especially mathematical descriptions of the mentioned algorithms, can be found in earlier studies (BBO [16,134,135], IWO [136][137][138], SOSA [139], SFLA [140,141], and HSA [142][143][144]).
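The roles of the harmony memory, the harmony memory considering rate, and the pitch adjusting rate can be illustrated with a minimal sketch. It is not the implementation used in this study: the names `hms`, `hmcr`, `par`, and `bandwidth`, their default values, and the rule of replacing the worst stored harmony are assumptions made for illustration.

```python
import random

def harmony_search(cost, dim, hms=10, hmcr=0.9, par=0.3, bandwidth=0.05,
                   iterations=500, bounds=(-1.0, 1.0), seed=0):
    """Toy harmony search minimizing `cost` (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize the harmony memory with random harmonies.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    for _ in range(iterations):
        new = []
        for d in range(dim):
            if rng.random() < hmcr:
                # Memory consideration: reuse a pitch from a stored harmony.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation, kept in bounds.
                    value += rng.uniform(-bandwidth, bandwidth)
                    value = min(hi, max(lo, value))
            else:
                # Random improvisation of a fresh pitch.
                value = rng.uniform(lo, hi)
            new.append(value)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(hms), key=lambda i: cost(memory[i]))
        if cost(new) < cost(memory[worst]):
            memory[worst] = new
    return min(memory, key=cost)

def sphere(v):
    return sum(x * x for x in v)

best_harmony = harmony_search(sphere, dim=3)
```

Because only the worst harmony is ever replaced, the best stored solution cannot get worse from one iteration to the next.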

Data and statistical analysis
Tsanas and Xifara [145] provided the dataset used in this study by means of an extensive computer simulation. They employed Ecotect software [146] to acquire information on the cooling load as well as the heating load of a residential building. In this process, the effect of eight factors (glazing area (GA), relative compactness (RC), wall area (WA), surface area (SA), glazing area distribution (GAD), overall height (OH), roof area (RA), and orientation (OR)) of the proposed buildings is taken into consideration. Twelve buildings are considered, with four orientations, four GAs (0, 10, 25, and 40% of the floor area), and five glazing distribution scenarios. Altogether, 768 cases are analyzed. Figure 1 depicts the distribution of the obtained CLs versus each influential factor. Table 1 details the statistical characteristics of the variables. Of the provided data, 80% and 20% (i.e., 614 and 154 samples) are randomly assigned to the training and testing operations, respectively. As is standard, the first group is used to discover the relationship between the target (i.e., CL) and the corresponding independent factors, while the generalizability of the detected CL pattern is assessed by means of the second group.
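The 80/20 partition can be reproduced in outline as follows. This is a minimal sketch assuming a simple shuffled split; the function name `split_indices` and the random seed are hypothetical, and the actual random assignment used in this study is not known.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly assign sample indices to training and testing groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # 768 * 0.8 = 614.4, truncated to 614 training samples.
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(768)
# len(train_idx) == 614 and len(test_idx) == 154
```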
In the accuracy criteria used in this study (RMSE, MAE, and R²), Q is the number of involved data, and the measured and forecasted CLs are denoted by CL_i,observed and CL_i,predicted, respectively. The average of the observed CLs is denoted by an overbar, i.e., the mean of CL_observed.
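For reference, the three criteria can be computed as in the sketch below. It assumes the common textbook definitions of RMSE, MAE, and R² (the latter as one minus the ratio of residual to total sum of squares), which are consistent with the symbols described above but may differ in detail from the exact expressions used in this study. The function names and sample values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error over Q samples."""
    q = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / q)

def mae(observed, predicted):
    """Mean absolute error over Q samples."""
    q = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / q

def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up cooling loads, purely for illustration.
cl_observed = [10.0, 20.0, 30.0]
cl_predicted = [12.0, 18.0, 30.0]
```

With these made-up values the errors are 2, 2, and 0, so the MAE is 4/3, the RMSE is the square root of 8/3, and R² is 1 - 8/200 = 0.96.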

Coupling the MLP with optimization schemes
Before developing the hybrid models, the MLP needs to be optimized with respect to the number of neurons in the middle layer (NHN). Notably, although this network can possess more than three layers (two or more hidden layers), it has been widely shown that a three-layered MLP is adequate for handling complex problems. Based on a trial-and-error process, the MLP with 6 hidden neurons was found to be the most suitable among 10 tested structures (NHN varied from 1 to 10). Following this, the general equation of the MLP was given to the BBO, IWO, SOSA, SFLA, and HSA so that the computational weights and biases could be adjusted.
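To make the search space concrete, the sketch below counts the adjustable parameters of an MLP with eight inputs, six hidden neurons, and one output, and evaluates the network from a flat parameter vector of the kind a metaheuristic would manipulate. The layout of the vector, the linear output neuron, the use of the mean squared error as the objective, and all names (`n_parameters`, `mlp_predict`, `objective`) are assumptions made for illustration, not details confirmed by this study.

```python
import math

N_INPUTS, N_HIDDEN = 8, 6  # eight input factors, six hidden neurons

def n_parameters(n_inputs=N_INPUTS, n_hidden=N_HIDDEN):
    # Hidden weights + hidden biases + output weights + output bias.
    return n_inputs * n_hidden + n_hidden + n_hidden + 1

def mlp_predict(theta, x, n_inputs=N_INPUTS, n_hidden=N_HIDDEN):
    """Evaluate the MLP from a flat vector (assumed layout, see comments)."""
    w_end = n_inputs * n_hidden
    hidden_w = theta[:w_end]                          # row-major, per neuron
    hidden_b = theta[w_end:w_end + n_hidden]
    out_w = theta[w_end + n_hidden:w_end + 2 * n_hidden]
    out_b = theta[w_end + 2 * n_hidden]
    hidden = []
    for j in range(n_hidden):
        s = sum(hidden_w[j * n_inputs + i] * x[i] for i in range(n_inputs))
        hidden.append(math.tanh(s + hidden_b[j]))     # tanh equals Tansig
    return sum(w * z for w, z in zip(out_w, hidden)) + out_b

def objective(theta, samples):
    """Mean squared error over (inputs, target) pairs (assumed objective)."""
    errors = [(mlp_predict(theta, x) - target) ** 2 for x, target in samples]
    return sum(errors) / len(errors)

# Sanity check with all-zero weights and an output bias of 5.
theta0 = [0.0] * 60 + [5.0]
x0 = [1.0] * 8
```

Under these assumptions each candidate solution is a vector of 8 x 6 + 6 + 6 + 1 = 61 numbers, and the optimizer searches for the vector with the lowest objective value.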
Also, to ensure that the most appropriate swarm size is used for the problem (e.g., the number of spiders in the SOSA), nine different complexities of each algorithm (with population sizes of 10, 25, 50, 75, 100, 200, 300, 400, and 500) are implemented, and the records are shown in Table 2. A color intensity system is applied to this table to better illustrate the best responses: the lower the obtained objective function (OF), the more intense the assigned color. As is seen, the lowest OFs are obtained with population sizes of 400, 400, 100, 10, and 200 for the BBO, IWO, SOSA, SFLA, and HSA, respectively.
Moreover, Figure 3 shows the computation times required for implementing the models with the tested population sizes. With the best population sizes noted above, the BBO, IWO, SOSA, SFLA, and HSA took around 7378, 5408, 1311, 330, and 2685 seconds, respectively, to optimize the MLP.

Predictive models' reliability assessment
The results of the training phase are evaluated by comparing the target CLs with the predicted values. The regression charts are presented in Figure 4. As mentioned earlier, the R² reports the correlation, where 1 is the ideal value. As is seen, the calculated R² values indicate around 93, 92, 88, 88, and 91% agreement between the expected and predicted CLs for the BBO-MLP, IWO-MLP, SOSA-MLP, SFLA-MLP, and HSA-MLP, respectively.
The models are then fed with the data set aside for testing. Each model applies the established CL pattern to the testing data to predict the CL for unseen conditions. The results are shown in Figure 5, which compares the target and predicted testing CLs. According to this figure, all five models can successfully predict the CL pattern.

Efficiency evaluation and comparison
In this section, the employed predictive models are compared to identify the most efficient one. The resulting values of all three accuracy indices (RMSE, MAE, and R²) are presented in Table 4. In Table 5, these values are ranked based on a score-based system. For each index, every model receives a score relative to the other four models, and the overall score (OS) for each phase is calculated as the sum of the three partial scores. Referring to the calculated OSs, the BBO trains the MLP more powerfully than the other metaheuristic algorithms. After that, the IWO and HSA emerge as the second and third most capable trainers, respectively. The same rankings are also observed for the testing phase. To sum up, the BBO-MLP is selected as the most competent predictor in this research, followed by the IWO-MLP, HSA-MLP, SFLA-MLP, and SOSA-MLP.
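A score-based ranking of this kind can be sketched as follows. The exact scoring rule is not spelled out here, so the sketch assumes that, for each index, the best of n models receives n points and the worst receives 1, with ties not handled. The model names and metric values are made-up placeholders, not results from this study.

```python
def overall_scores(metrics, higher_is_better=("R2",)):
    """Sum per-index rank scores into an overall score (OS) per model."""
    models = list(metrics)
    n = len(models)
    scores = {m: 0 for m in models}
    index_names = list(next(iter(metrics.values())))
    for name in index_names:
        # Sort so that the best model for this index comes first.
        best_first = sorted(models, key=lambda m: metrics[m][name],
                            reverse=name in higher_is_better)
        for position, model in enumerate(best_first):
            scores[model] += n - position   # best gets n, worst gets 1
    return scores

# Placeholder values for three hypothetical models.
toy_metrics = {
    "A": {"RMSE": 2.0, "MAE": 1.50, "R2": 0.95},
    "B": {"RMSE": 3.0, "MAE": 1.60, "R2": 0.90},
    "C": {"RMSE": 2.5, "MAE": 1.45, "R2": 0.92},
}
toy_scores = overall_scores(toy_metrics)
# toy_scores == {"A": 8, "B": 3, "C": 7}
```

In this toy case model A ranks first on RMSE and R² and second on MAE, giving an OS of 3 + 2 + 3 = 8.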

The BBO-based CL predictive formula
Given that the BBO emerged as the most efficient optimizer among the five tested ones, the BBO-based formula for predicting the CL is presented in this section. Equation 5 gives this formula. It is, however, first required to use Equation 6 to produce the middle parameters Z1, Z2, …, Z6. Notably, the term Tansig (Equation 7) represents the activation function of the hidden neurons. The numbers that appear in Equations 5 and 6 are the BBO-optimized internal biases and weights of the MLP. According to Equation 6, the input factors (i.e., GA, RC, WA, SA, GAD, OH, RA, and OR) are multiplied by the corresponding weights, and after adding the bias terms, the resulting value is activated by Tansig. The outputs of this process are then given to the output neuron to calculate the overall response.
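In general symbolic form, the computation described above can be written as follows. This is a sketch of the structure only: the symbols $w_j$, $W_{ji}$, $b_j$, and $b$ are placeholders introduced here for the BBO-optimized weights and biases, $x_1, \dots, x_8$ stand for the eight input factors, and a linear output neuron is assumed. The Tansig expression is the standard hyperbolic tangent sigmoid.

```latex
\begin{align}
CL &= \sum_{j=1}^{6} w_j Z_j + b \\
Z_j &= \mathrm{Tansig}\!\left(\sum_{i=1}^{8} W_{ji}\, x_i + b_j\right), \qquad j = 1, \dots, 6 \\
\mathrm{Tansig}(x) &= \frac{2}{1 + e^{-2x}} - 1
\end{align}
```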

Conclusions
This paper investigated five capable metaheuristic techniques, namely biogeography-based optimization, invasive weed optimization, social spider algorithm, shuffled frog leaping algorithm, and harmony search algorithm, for the early prediction of cooling load in residential buildings. The named algorithms were applied to an MLP neural network to adjust its weights and biases and thereby develop the corresponding hybrid models. Optimizing the complexity of the models showed that the best population size for both the BBO-MLP and IWO-MLP is 400, while this value was 100, 10, and 200 for the SOSA-MLP, SFLA-MLP, and HSA-MLP, respectively. Comparing the prediction results showed that the BBO creates the most accurate MLP in both analyzing and predicting the CL pattern. After that, the IWO outperformed the SOSA, SFLA, and HSA in adjusting the MLP parameters. Given the outstanding performance of the BBO-MLP, the CL predictive formula of this method was extracted and presented.