A Combination Approach of Two Metaheuristic Algorithm for Optimal Feature Selection: Case Study Email Spam Detection

Feature selection is one of the most challenging and important tasks in data mining and pattern recognition. Its goal is to find the most informative subset of the original features in a given domain, removing redundant or irrelevant features and thereby improving the accuracy of classification algorithms. Feature selection can therefore be treated as an optimization problem and solved with metaheuristic algorithms. In this paper, a new hybrid model that combines the whale optimization algorithm (WOA) and the flower pollination algorithm (FPA) is presented for feature selection, based on the concept of opposition-based learning (OBL). The proposed method uses the nature-inspired search processes of WOA and FPA to solve the feature selection problem, and uses OBL to improve the convergence speed and accuracy of the algorithm. Specifically, WOA creates solutions in its search space through encircling prey, the bubble-net attack, and the search for prey, and tries to improve them; alongside it, FPA improves the opposite solutions of WOA through its global and local search processes. In this way, both the search space and the opposite search space are explored for candidate solutions. To evaluate the performance of the proposed algorithm, experiments are performed in two stages. In the first stage, experiments were performed on 10 feature selection data sets from the UCI data repository. In the second stage, the algorithm was tested on spam email detection.
The results of the first stage show that, on the 10 UCI data sets, the proposed algorithm outperforms other basic metaheuristic algorithms in terms of the average number of selected features and classification accuracy. The results of the second stage show that the proposed algorithm detects spam emails more accurately than other similar algorithms.


Introduction
Feature selection can be defined as the process of identifying relevant features and eliminating irrelevant and redundant ones, with the aim of obtaining a subset of features that describes the problem well with minimal loss of performance. This has many benefits, some of which are described below. Feature relevance is commonly defined as follows: "A feature is relevant if it contains information about the target." More formally, John and Kohavi divide features into three distinct categories: strongly relevant, weakly relevant, and irrelevant. In their approach, the relevance of a feature X is defined in terms of an ideal Bayes classifier. Feature X is strongly relevant when its removal degrades the prediction accuracy of the ideal Bayes classifier. It is weakly relevant if it is not strongly relevant and there exists a subset of features S such that the performance of the ideal Bayes classifier on S is worse than its performance on S ∪ {X}. A feature is irrelevant if it is neither strongly nor weakly relevant. Feature redundancy is usually defined in terms of correlation between features. Many researchers accept that two features are redundant if their values are completely correlated, but it may not be easy to recognize redundancy when a feature is correlated with a set of features. According to the definition given by John and Kohavi, a feature is redundant, and should therefore be removed, if it is weakly relevant and has a Markov blanket within the current set of features. Because irrelevant features should be removed in any case, they are also eliminated under this definition. The proliferation of information system applications has complicated the task of extracting useful information from the collected data [1].
Feature selection (FS) is one of the most important pre-processing steps because it eliminates redundant and irrelevant variables from a data set. Approaches proposed for FS can be broadly classified into three categories: wrapper, filter, and hybrid [2,3]. In the wrapper approach, a predetermined learning model is assumed, and features are selected that improve the performance of that particular model, whereas the filter approach relies on statistical analysis of the feature set and requires no learning model. Figure 1 shows schematic diagrams of how the wrapper and filter approaches find salient features. The hybrid approach tries to take advantage of the complementary strengths of the wrapper and filter approaches [6,7].

Fig. 1. Schematic diagrams of (a) wrapper approach and (b) filter approach
Classical optimization techniques have limitations in solving feature selection problems [8]; evolutionary computation (EC) algorithms are therefore an alternative for overcoming these limitations and finding the best solution [9]. EC algorithms are inspired by nature, group dynamics, social behavior, and the biological interaction of species in a group. Binary versions of these algorithms make it possible to tackle problems such as feature selection and achieve superior results. Feature selection brings benefits such as better model interpretability, shorter training times, and improved generalization through reduced overfitting when building classification models. FS can be viewed as a search over the space of feature subsets. An exhaustive search covers the whole search space, but this is not feasible for a large number of features. A heuristic search instead evaluates, in each iteration, the features that have not yet been selected. A random search generates random subsets in the search space, which are then evaluated to determine their classification performance [10]. Nature-inspired metaheuristic algorithms are now among the most widely used algorithms for solving optimization problems. Optimization is the process of finding optimal solutions to a particular problem. Many nature-inspired metaheuristic algorithms have been applied to the feature selection problem, some as hybrids of several algorithms and some alone. Because of their stochastic nature, metaheuristics such as particle swarm optimization (PSO) [11], evolutionary algorithms (EA) [12], ant lion optimization (ALO) [13], the bat algorithm (BA) [14], the firefly algorithm (FA) [15], the whale optimization algorithm (WOA) [16], the genetic algorithm (GA) [17], the flower pollination algorithm (FPA) [18], and other metaheuristic algorithms are used for feature selection.
Several previous studies have used hybrid algorithms for the feature selection problem. In Section 2, we describe these hybrid algorithms.
The rest of this paper is organized as follows: Section 2 reviews related work. The basic concepts of the WOA and FPA algorithms are presented in Section 3. Section 4 details the proposed method. In Section 5, the experimental results are presented and analyzed. Finally, Section 6 gives conclusions and future work.

Related Works
In recent years, hybrid metaheuristics have been used by many researchers in the field of optimization. In the feature selection domain, many successful hybrid metaheuristic algorithms have been proposed [2,7,10,19,20,21]. The first hybrid method for feature selection is due to Oh et al. [22], who incorporated local search operations, namely Sequential Forward Search (SFS), Sequential Forward Floating Search (SFFS), and Polynomial Time Approach (PTA), into GA to fine-tune the search process. The hybrid GA shows better convergence than the standard GA in experiments on various UCI repository data sets, including Glass, Vowel, Wine, Letter, Vehicle, Segmentation, WDBC, Ionosphere, Satellite, and Sonar. Khushaba et al. [23] combined ACO and DE for feature selection, where DE was used to search for the optimal feature subset based on the solutions obtained by ACO. Olabiyisi et al. [24] developed another hybrid algorithm that combines the GA and SA metaheuristics to extract features for timetabling problems. In their algorithm, the SA selection process is used instead of the GA selection process to avoid local optima. Experimental results show that SA performs better than GA and the GA-SA hybrid in terms of optimization and runtime; however, GA and SA execution times are higher than those of the hybrid method. Because of its runtime performance, the hybrid algorithm is concluded to be more applicable than GA and SA. In [25], a filter measure (Pearson correlation) and a wrapper measure (classification accuracy) were combined into a single fitness function in a GA for feature selection, to take advantage of each measure. Akila [26] constructed a hybrid wrapper and filter feature selection algorithm for classification problems using a combination of GA and a local search (LS) technique.
In this hybrid method, LS is first applied using correlation-based filter methods that include discretization, ranking, and redundancy elimination with a symmetric uncertainty measure for feature subsets; standard GA operators are then applied to these subsets. Experimental analysis on the DNA gene analysis data set obtained from the UCI repository shows that the hybrid method has the best performance. Babatunde et al. [27] proposed another hybrid algorithm based on ACO and GA for feature selection. In this algorithm, the selected feature subsets are evaluated using the SVM classifier. The algorithm was tested on a face detection data set and compared with standard ACO and GA; the results show that the hybrid technique outperforms both. Hasani et al. [28] developed a combination of Linear Genetic Programming (LGP) and the Bee Algorithm (BA) for feature selection in Intrusion Detection Systems (IDS). In their algorithm, LGP is used to generate feature subset solutions, the BA neighborhood search process is then applied to these solutions, and finally SVM is used to evaluate the feature subsets. Experimental results show that the LGP-BA method increases classification accuracy and is more efficient than the basic LGP and BA. In [29], Liao et al. propose a new feature selection algorithm based on Ant Colony Optimization (ACO), called Advanced Binary ACO (ABACO). The simulation results verify that the algorithm provides a suitable feature subset with good classification accuracy using a smaller feature set than competing feature selection methods. Nekkaa and Boughaci [30] proposed a hybrid search method that combines the harmony search algorithm (HSA) and stochastic local search (SLS) for feature selection in classification tasks. A new probabilistic selection strategy is used to apply stochastic exploitation.
This algorithm is used as a wrapper around the SVM classifier. Experimental results show that the HSA-SLS method is better than HSA and GA for feature selection. Sayed et al. [18] investigated the effectiveness of combining the clonal selection algorithm (CSA) with FPA to form a new hybrid algorithm, called the Binary Clonal Flower Pollination Algorithm (BCFA), for the feature selection problem. The accuracy of the OPF classifier was used as the objective function to be maximized. The experiments were performed on three public data sets from the UCI machine learning repository. The results showed that BCFA can achieve the best classification quality using the minimum number of selected features in a short period of time. Khoshaba et al. [5] proposed a hybrid approach that combines the artificial bee colony optimization method with a differential evolution algorithm for feature selection in classification tasks. The hybrid method was evaluated on fifteen data sets from the UCI repository that are widely used in classification problems. The experimental results of that study show that the combined method is able to select good features for classification tasks and to improve the running time and accuracy of the classifier.

Material and Methods
The whale optimization and flower pollination algorithms are two metaheuristic algorithms that are combined in this paper, together with opposition-based learning, to select features, with spam email detection as a case study. The following subsections describe each of these algorithms.

Whale Optimization Algorithm
The Whale Optimization Algorithm (WOA) is a search and optimization algorithm recently developed in [31]. It is a mathematical model of the movements and behavior of humpback whales searching for food. WOA is inspired by the bubble-net attack strategy, in which the whales dive about 12 meters below the surface, begin to encircle a school of fish, create a spiral of bubbles around the prey, and then swim up toward the surface to catch it, as shown in Figure 2. The exploration phase of the algorithm is modeled by a random search based on the positions of the whales relative to one another: instead of moving toward the best solution, a whale updates its position with respect to another, randomly chosen whale. Besides these behaviors, WOA differs from other optimization algorithms in that it requires only two parameters to be set. These parameters allow a smooth transition between exploration and exploitation.
The mathematical model of encircling prey, spiral bubble-net foraging maneuver and search for prey is described in the following section:

Encircling prey
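Humpback whales encircle the prey and update their position towards the best search agent as the number of iterations increases from the start to the maximum number of iterations. This behavior is mathematically formulated as:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right| \tag{1}$$
$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} \tag{2}$$
where $\vec{A}$ and $\vec{C}$ are the coefficient vectors, $t$ indicates the current iteration, $\vec{X}^{*}$ is the position vector of the best solution obtained so far, $\vec{X}$ is the position vector, $|\,\cdot\,|$ is the absolute value, and $\cdot$ is an element-by-element multiplication. The vectors $\vec{A}$ and $\vec{C}$ are calculated as follows:
$$\vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a} \tag{3}$$
$$\vec{C} = 2\vec{r} \tag{4}$$
where $\vec{a}$ is decreased linearly from 2 to 0 over the course of the iterations and $\vec{r}$ is a random vector in [0, 1].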
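As an illustration, the encircling update can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `encircle_step` and `linear_a` are hypothetical, and drawing a fresh random number for every dimension is an assumption.

```python
import random

def linear_a(t, max_iter):
    """Control parameter a, decreased linearly from 2 to 0 (assumed schedule)."""
    return 2.0 * (1.0 - t / max_iter)

def encircle_step(x, x_best, a, rng=random):
    """Move whale x toward the best solution x_best.

    Per dimension: A = 2*a*r - a, C = 2*r, D = |C*x_best - x|,
    new position = x_best - A*D.
    """
    new_x = []
    for xj, bj in zip(x, x_best):
        A = 2.0 * a * rng.random() - a
        C = 2.0 * rng.random()
        D = abs(C * bj - xj)
        new_x.append(bj - A * D)
    return new_x
```

With `a = 0` the coefficient A vanishes, so the whale lands exactly on the best solution; larger values of `a` allow larger moves around it.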

Bubble-net attacking method
The following two mechanisms are used to model the bubble-net behavior of humpback whales mathematically:
1. Shrinking encircling mechanism: This behavior is achieved by decreasing the value of $\vec{a}$ in Eq. (3) from 2 to 0 over the course of the iterations. By setting random values for $\vec{A}$ in [-1, 1], the new position of a search agent can be defined anywhere between the original position of the agent and the position of the current best agent.
2. Spiral updating position: To mimic the helix-shaped movement of the whales, a spiral equation is created between the position of the whale and the position of the prey:
$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t) \tag{5}$$
Note that humpback whales swim around the prey within a shrinking circle and along a spiral-shaped path simultaneously. To model this behavior, we assume a 50% probability of choosing either the shrinking encircling mechanism or the spiral model to update the position of the whales. The mathematical model is as follows:
$$\vec{X}(t+1) = \begin{cases} \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} & \text{if } p < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t) & \text{if } p \geq 0.5 \end{cases} \tag{6}$$
where $\vec{D}' = |\vec{X}^{*}(t) - \vec{X}(t)|$ indicates the distance of the ith whale to the prey (the best solution obtained so far), $b$ is a constant defining the shape of the logarithmic spiral, $l$ is a random number in [-1, 1], and $p$ represents a random number in [0, 1].
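The choice between the two bubble-net mechanisms can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `bubble_net_step` is hypothetical, the spiral constant defaults to `b = 1.0` (the text does not give a value), and one `p` and one `l` are drawn per whale.

```python
import math
import random

def bubble_net_step(x, x_best, a, b=1.0, rng=random):
    """Update one whale with either shrinking encircling or the spiral move."""
    p = rng.random()              # chooses the mechanism (50% each)
    l = rng.uniform(-1.0, 1.0)    # position along the logarithmic spiral
    new_x = []
    for xj, bj in zip(x, x_best):
        if p < 0.5:
            # shrinking encircling: x_best - A * |C * x_best - x|
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            new_x.append(bj - A * abs(C * bj - xj))
        else:
            # spiral: D' * exp(b*l) * cos(2*pi*l) + x_best, with D' = |x_best - x|
            d_prime = abs(bj - xj)
            new_x.append(d_prime * math.exp(b * l) * math.cos(2.0 * math.pi * l) + bj)
    return new_x
```

A whale that already sits on the best solution stays there when `a = 0`, whichever branch is taken, which is a convenient sanity check for an implementation.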

Search for prey (exploration phase)
The variation of the vector $\vec{A}$ can also be used to search for prey, i.e., in the exploration phase. Here, $\vec{A}$ takes random values greater than 1 or less than -1 to force search agents to move away from a reference whale. The mathematical model for this phase is as follows:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}_{rand} - \vec{X} \right| \tag{7}$$
$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D} \tag{8}$$
where $\vec{X}_{rand}$ is a random position vector (a random whale) chosen from the current population.
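The exploration move differs from encircling only in its reference point: a randomly chosen whale replaces the best solution. A minimal sketch, with the hypothetical name `explore_step` and per-dimension random coefficients as assumptions:

```python
import random

def explore_step(x, population, a, rng=random):
    """Move whale x relative to a randomly chosen whale from the population."""
    x_rand = rng.choice(population)
    new_x = []
    for xj, rj in zip(x, x_rand):
        A = 2.0 * a * rng.random() - a   # |A| can exceed 1 only while a > 1
        C = 2.0 * rng.random()
        D = abs(C * rj - xj)
        new_x.append(rj - A * D)
    return new_x
```

Because `a` starts at 2 and decays to 0, large values of |A|, and hence exploratory moves, are possible mainly in the early iterations.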

Flower Pollination Algorithm
The Flower Pollination Algorithm (FPA) was introduced in 2012 by Xin-She Yang, who has also developed other algorithms such as the firefly algorithm, the bat algorithm, and cuckoo search. Flower pollination is the process of transferring pollen between flowers. The main agents of this transfer are birds, bats, insects, and other animals. Some flowers and insects have formed a specialized flower-pollinator partnership: these flowers attract only the birds or insects involved in the partnership, and those insects are the main pollinators of the flowers [32]. There are two types of pollination: biotic and abiotic. About 90% of flowering plants rely on biotic pollination, while about 10% rely on abiotic pollination, which does not require pollinators. Some insects tend to visit certain flower species while bypassing others, a phenomenon called flower constancy [33]. Flowers that benefit from flower constancy are assured of maximum reproduction. This behavior is advantageous for both the flowers and the pollinators: the pollinator obtains enough nectar, while the flower sustains pollination and thus increases the population of its species. Depending on the availability of pollinators, pollination can also be classified as self-pollination or cross-pollination. Self-pollination involves no reliable pollinator, whereas in cross-pollination, pollinators such as bees, birds, and bats fly long distances and thus bring about global pollination [34]. In [35], Yang models the biological pollination process using the following four idealized rules:
1) Biotic pollination and cross-pollination are regarded as global pollination, carried out by pollinators that perform Lévy flights.
2) Abiotic pollination and self-pollination are regarded as local pollination.
3) Flower constancy can be regarded as a reproduction probability that is proportional to the similarity of the two flowers involved.
4) Local and global pollination are controlled by a switch probability p ∈ [0, 1]. Because of factors such as physical proximity and wind, local pollination can account for a significant fraction of overall pollination activity.
To formulate the update equations, the above rules must be translated into mathematical form. In the global pollination step, pollen is carried by pollinators such as insects and can travel long distances because insects can often fly and move over a much longer range. Rule 1 and flower constancy can therefore be expressed mathematically as follows:
$$x_i^{t+1} = x_i^{t} + \gamma L(\lambda)\left(B - x_i^{t}\right) \tag{9}$$
where $x_i^{t}$ is pollen i, or solution vector $x_i$, at iteration t, and B is the current best solution found among all solutions at the current generation/iteration. Here γ is a scaling factor to control the step size. In addition, L(λ) is the parameter that corresponds to the strength of the pollination, which essentially is also the step size. Since insects may move over a long distance with various distance steps, we can use a Lévy flight to imitate this characteristic efficiently. That is, we draw L > 0 from a Lévy distribution:
$$L \sim \frac{\lambda \Gamma(\lambda) \sin(\pi\lambda/2)}{\pi} \cdot \frac{1}{s^{1+\lambda}}, \quad s \gg s_0 > 0 \tag{10}$$
Here, Γ(λ) is the standard gamma function, and this distribution is valid for large steps s > 0. Then, to model the local pollination, both Rule 2 and Rule 3 can be represented as:
$$x_i^{t+1} = x_i^{t} + \epsilon\left(x_j^{t} - x_k^{t}\right) \tag{11}$$
Here $x_j^{t}$ and $x_k^{t}$ are pollen from different flowers of the same plant species. This essentially imitates flower constancy in a limited neighborhood. Mathematically, if $x_j^{t}$ and $x_k^{t}$ come from the same species or are selected from the same population, and ε is drawn from a uniform distribution on [0, 1], this becomes a local random walk. Pollination can occur at all scales, both local and global, but adjacent flower patches or flowers in a nearby neighborhood are more likely to be pollinated by local pollen than by pollen from far away. To mimic this, we can use the switch probability p of Rule 4 to switch between global pollination and intensive local pollination. As a starting point, we can use the simple value p = 0.5. A preliminary parametric study showed that p = 0.8 may work better for most applications [35].
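The two pollination moves can be sketched in Python as follows. This is a sketch under several assumptions rather than the reference implementation: the Lévy step is generated with Mantegna's algorithm (a common approximation that the text does not prescribe) and is signed rather than restricted to L > 0, the defaults `lam = 1.5` and `gamma = 0.1` are illustrative, a draw below `p` is taken to trigger global pollination, and the names `levy_step` and `pollinate` are hypothetical.

```python
import math
import random

def levy_step(lam=1.5, rng=random):
    """Approximate Lévy-distributed step via Mantegna's algorithm (assumed)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)

def pollinate(x, best, population, p=0.8, gamma=0.1, rng=random):
    """One FPA update of solution x: global with probability p, else local."""
    if rng.random() < p:
        # global pollination, Eq. (9): x + gamma * L * (best - x)
        return [xj + gamma * levy_step(rng=rng) * (bj - xj)
                for xj, bj in zip(x, best)]
    # local pollination, Eq. (11): x + eps * (x_j - x_k)
    xa, xb = rng.sample(population, 2)
    eps = rng.random()
    return [xj + eps * (aj - bj) for xj, aj, bj in zip(x, xa, xb)]
```

The heavy tail of the Lévy step occasionally produces very long jumps, which is what gives global pollination its exploratory character.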
Figure 4 shows the pseudocode of the flower pollination algorithm.

Fig. 4. Pseudo-Code Flower Pollination Algorithm.
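The binary flower pollination algorithm was proposed in [36]. In it, the search space is modeled as an n-dimensional Boolean lattice in which solutions are updated across the corners of a hypercube. In feature selection, the question is whether or not a particular feature is selected, so a solution is represented as a binary vector in which 1 indicates that a feature is selected to build the new data set and 0 indicates that it is not. The sigmoid function is used to construct this binary vector with the following equation:
$$S\left(x_i^{j}(t)\right) = \frac{1}{1 + e^{-x_i^{j}(t)}} \tag{12}$$
Thus, Eq. (9) and Eq. (11) are replaced by the following equation:
$$x_i^{j}(t) = \begin{cases} 1 & \text{if } S\left(x_i^{j}(t)\right) > \sigma \\ 0 & \text{otherwise} \end{cases} \tag{13}$$
where $x_i^{j}(t)$ is the jth component of solution i at iteration t and σ is drawn from a uniform distribution on [0, 1].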
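The sigmoid-based binarization can be sketched as follows. The names `sigmoid` and `binarize` are illustrative, and the clamping of the input is an assumption added only to avoid floating-point overflow.

```python
import math
import random

def sigmoid(v):
    """Logistic function; the input is clamped to avoid overflow in exp."""
    v = max(-500.0, min(500.0, v))
    return 1.0 / (1.0 + math.exp(-v))

def binarize(x, rng=random):
    """Map a continuous solution to a 0/1 feature mask.

    Feature j is selected (1) when sigmoid(x[j]) exceeds a fresh
    uniform random threshold in [0, 1).
    """
    return [1 if sigmoid(v) > rng.random() else 0 for v in x]
```

Large positive components are almost always mapped to 1 and large negative components to 0, while components near zero are selected with probability close to one half.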

Opposition Based Learning
The concept of opposition was first expressed in ancient Chinese philosophy in the Yin-Yang symbol (Figure 6). This symbol expresses duality as a pair of black and white: yin (the receptive, feminine, dark, passive force) and yang (the creative, masculine, bright, active force). The classical Greek elements used to explain patterns in nature (Figure 7) were likewise described through opposing concepts, such as fire (hot and dry) versus water (cold and wet) and earth (cold and dry) versus air (hot and wet). Cold, hot, wet, and dry express the natures of these elements and their opposites [37].

Fig. 6. An early concept of opposition expressed in the Yin-Yang symbol [37]
Opposition-based learning (OBL) [38] is an effective concept for strengthening various optimization methods. The main idea of OBL is to consider a candidate solution and its corresponding opposite estimate simultaneously, in order to obtain a better approximation of the current candidate solution. It has been proven that an opposite candidate solution has a higher chance of being closer to the global optimum than a random candidate solution. The opposite number can be defined as follows: for a real number $x \in [a, b]$, the opposite number $\tilde{x}$ is
$$\tilde{x} = a + b - x \tag{14}$$
and for a point in a multidimensional search space the same definition is applied to each coordinate with its own bounds.
Fig. 7. The Greek classical elements to explain patterns in nature [37]
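The opposite of a candidate solution, in the usual OBL sense of reflecting each coordinate within its bounds (lower + upper - value), takes only a few lines of Python. The function names are illustrative; the binary variant is the special case with bounds 0 and 1, in which each bit is flipped.

```python
def opposite(x, lower, upper):
    """Opposite point: each coordinate is reflected to lower + upper - value."""
    return [a + b - v for v, a, b in zip(x, lower, upper)]

def opposite_binary(bits):
    """Opposite of a 0/1 feature mask (bounds 0 and 1): flip every bit."""
    return [1 - b for b in bits]
```

For example, `opposite([1.0, -2.0], [0.0, -5.0], [4.0, 5.0])` gives `[3.0, 2.0]`, and the opposite of the mask `[1, 0, 1]` is `[0, 1, 0]`.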

Proposed Approach
In this section, the proposed method, which combines the whale optimization and flower pollination algorithms, is described. We use the binary versions of the whale optimization algorithm [2] and the flower pollination algorithm [36] for feature selection. We also use the concept of opposition-based learning to explore the opposite search space and create better, new solutions that improve the hybrid algorithm. We first describe the objective function used for the proposed algorithm and for the other metaheuristic algorithms in this paper. Feature selection can be considered a multi-objective optimization problem with two conflicting goals: minimizing the number of selected features and maximizing classification accuracy. A classification algorithm is therefore needed to define the objective function of the feature selection problem. Since most researchers [1,2,10,16,19] use the simplest classification method, the KNN classifier [40], we also use this classifier to define the objective function.
Accordingly, in the proposed method we use the KNN classifier to evaluate the features selected by the proposed algorithm and by the other algorithms. Each solution is evaluated with the proposed multi-objective function, which depends on the KNN classifier. To balance the number of selected features in each solution (to be minimized) against the classification accuracy (to be maximized), the fitness function in Equation (15) is used to evaluate a solution in each metaheuristic algorithm:
$$\text{Fitness} = \alpha\,\gamma_R(D) + \beta\,\frac{|R|}{|N|} \tag{15}$$
where $\gamma_R(D)$ is the classification error rate of the given classifier, |R| is the cardinality of the selected feature subset, |N| is the total number of features in the data set, and the parameters α and β weight the importance of classification quality and subset length, respectively. The values of these two parameters, α ∈ [0, 1] and β = (1 - α), are adopted from [13]. After defining the objective function, we first combine the whale optimization and flower pollination algorithms and then add the concept of opposition-based learning to improve the hybrid algorithm; this combination is the method proposed in this paper. The main purpose of combining metaheuristic algorithms is to exploit the natural processes of two different algorithms to better solve difficult optimization problems of all kinds. A hybrid algorithm preserves its performance by maintaining a balance between exploration and exploitation. The whale optimization algorithm is a recently introduced optimization algorithm that shows very good results on many optimization problems. However, exploration in the standard WOA (Equation (8)) depends on changing the position of each search agent based on a randomly selected solution. This motivated us to use the exploitation capability of the flower pollination algorithm to improve WOA on the feature selection problem.
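A minimal sketch of this wrapper fitness is given below. Everything in it is illustrative: the plain-Python KNN stands in for whatever classifier implementation is actually used, the function names are hypothetical, and the default `alpha = 0.99` is an assumed value, since the text only states that α lies in [0, 1] and β = 1 - α.

```python
from collections import Counter

def knn_error(train_X, train_y, test_X, test_y, mask, k=5):
    """Error rate of a k-nearest-neighbour classifier on the selected features."""
    cols = [j for j, m in enumerate(mask) if m]

    def dist(u, v):
        # squared Euclidean distance restricted to the selected columns
        return sum((u[j] - v[j]) ** 2 for j in cols)

    errors = 0
    for x, y in zip(test_X, test_y):
        nearest = sorted(range(len(train_X)), key=lambda i: dist(x, train_X[i]))[:k]
        vote = Counter(train_y[i] for i in nearest).most_common(1)[0][0]
        errors += vote != y
    return errors / len(test_X)

def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted sum of error rate and selected-feature ratio (lower is better)."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)
```

With α close to 1, the error rate dominates, and the subset size mainly breaks ties between solutions of similar accuracy.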
Because all metaheuristic algorithms, including the whale optimization and flower pollination algorithms, create the initial population by random or uniform sampling, an unsuitable initial population can sometimes slow convergence toward the optimum; many measured quantities, such as computation time, memory, and complexity, are related to the distance of the initial guess from the global optimum. If a solution and its opposite can be examined at the same time, that is, if both a population and its opposite population are used in a metaheuristic algorithm, there is great potential to accelerate convergence and improve accuracy. We therefore improved the convergence speed and accuracy of the proposed algorithm by adding the concept of opposition-based learning to our hybrid algorithm. Figure 8 shows how the proposed algorithm combines the whale optimization and flower pollination algorithms through opposition-based learning.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 5 May 2020 doi:10.20944/preprints202001.0309.v2
In summary, the proposed method solves the feature selection optimization problem using the nature-inspired processes of the whale optimization and flower pollination algorithms together with opposition-based learning. The whale optimization algorithm uses encircling prey, the bubble-net attack, and the search for prey to create solutions in its search space and tries to improve them. Alongside it, the flower pollination algorithm operates on the opposites of the whale optimization algorithm's solutions, following the principle of opposition-based learning, and improves these solutions in the opposite space with its global and local search processes. At the end of each generation, the population of the whale optimization algorithm is merged with its opposite population, that is, the population of the flower pollination algorithm, and only the solutions with the best objective function values are selected for the next generation. By combining two optimization algorithms and using the opposite search space, we obtain an algorithm with fast convergence and high accuracy. The simulation results in Section 5 confirm the convergence speed and accuracy of the proposed algorithm in various experiments.
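One generation of the hybrid scheme described above might be organized as in the following sketch. It is only one possible reading of the description: the function `hybrid_generation` and its callable arguments are hypothetical, solutions are assumed to be 0/1 feature masks, the opposite population is assumed to be built from the whales after their update, and lower fitness is assumed to be better.

```python
def hybrid_generation(population, woa_update, fpa_update, fitness_fn):
    """Run one generation: WOA on the population, FPA on its opposite, then merge.

    woa_update and fpa_update map one binary solution to a new binary solution;
    fitness_fn maps a solution to a value to be minimized.
    """
    whales = [woa_update(x) for x in population]
    # opposite of a 0/1 mask: flip every bit, then refine it with FPA
    flowers = [fpa_update([1 - b for b in x]) for x in whales]
    merged = sorted(whales + flowers, key=fitness_fn)
    return merged[:len(population)]   # keep the best, same population size
```

Passing the two update rules in as functions keeps the sketch independent of any particular binary WOA or FPA implementation.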

Experimental Results
The experiments were carried out using a PC Intel(R) Core(TM) i5-2430M CPU 2.40 GHz with 6GB RAM, Windows 10 operating system. The implementation of the proposed algorithm is done using Matlab.
To evaluate the performance of the proposed algorithm, experiments are performed in two stages. In both stages, binary versions of the compared algorithms are used. In the first stage, experiments were performed on 10 feature selection data sets from the UCI data repository. Section 5.1 gives a complete overview of the first-stage experiments, including the names and characteristics of the UCI data sets, the initial settings, and the results of the proposed algorithm on these data sets.
In the second stage, the performance of the proposed algorithm was tested on spam email detection. Section 5.2 describes the second-stage experiments, including the name and characteristics of the spam email data set, the initial settings, and the results of the proposed algorithm on this data set. The results of the first stage show that, on the 10 UCI data sets, the proposed algorithm performs better than the other algorithms in terms of the average number of selected features and classification accuracy. The results of the second stage show that the proposed algorithm detects spam emails more accurately than other similar algorithms.

UCI Datasets
To evaluate the effectiveness of the proposed approaches, experiments were performed on 10 FS standard databases in the UCI database [42]. To analyze the performance of the algorithm presented in terms of features and samples, a low-dimensional data set with low and high dimensions and subsequent and small and small in this document has been approved. The packaging method is based on the KNN classification (where K = 5 [13]) is used to produce the best reduction. In the proposed method, each data set is divided into a cross-comparison method for evaluation [43]. In the inspection of K-folding stocks, K-1 is used equally for drills and verification, the rest is used for folding testing. This M load process is repeated. The dimensions of the training set and the test sample are the same. Table 1 shows the details of the data set used, for example, the number of features and samples in each set. These include the chest, wine, bottle, contraceptive selection (CMC), system entry (heart), ionosphere, lymphography, spectrum, blood, zoo.   Table 2, it is clear that the proposed algorithm performs better than the whale optimization and pollen algorithms of the flowers to achieve the two main objectives, which include the accuracy of the classification and the number of selected features. The proposed algorithm in this experiment is better among the 10 UCI data sets in terms of accuracy in 7 data sets than the whale optimization and flower pollination algorithms, and also the proposed algorithm is better in terms of the number of features selected in the 6 data sets. Optimize whale optimization algorithms and flower pollination. From the results reported in this experiment, according to Table (2), it can be concluded that the hybrid model, using the opposition-based learning method, has a significantly higher performance than the whale optimization algorithms and flower pollination. 
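The K-fold splitting used for evaluation in this section can be sketched as follows. This is an illustrative sketch: the generator name `k_fold_indices` is hypothetical, and the round-robin assignment of shuffled samples to folds is an assumption, since the text does not specify how folds are formed.

```python
import random

def k_fold_indices(n_samples, k, rng=None):
    """Yield (train, test) index lists for K-fold cross-validation.

    Each fold serves as the test set once; the other K-1 folds form the
    training set. Pass a random.Random instance to shuffle before splitting.
    """
    idx = list(range(n_samples))
    if rng is not None:
        rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # round-robin assignment to folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

Repeating the whole procedure M times with different shuffles, and averaging the resulting error rates, gives the repeated cross-validation estimate mentioned in the text.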
Because opposition-based learning is applied between the iterations of the whale optimization and flower pollination algorithms in the proposed method, better results can be expected as the number of iterations increases. This motivated us to examine the proposed approach with a maximum of 50 iterations and a population size of 10. The results with this higher number of iterations are shown in Table (3). From Table 3, it is clear that, as the number of iterations increases, the proposed algorithm performs much better than the whale optimization and flower pollination algorithms on the two main objectives, classification accuracy and number of selected features. In particular, the proposed algorithm is more accurate than the whale optimization and flower pollination algorithms on all 10 UCI datasets and is better in terms of the number of selected features on 7 datasets. From the results in Table (3), it can be concluded that the hybrid model, by using opposition-based learning, greatly improves on the whale optimization and flower pollination algorithms in terms of convergence and accuracy.
From the results of the two experiments in this section, it can be concluded that the proposed algorithm solves the feature selection optimization problem well. Moreover, given the success of the algorithm on all datasets in the last experiment, the use of opposition-based learning supports both the convergence speed and the accuracy of the proposed algorithm. In the next section, we compare the proposed algorithm with other basic metaheuristic algorithms for further evaluation.
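The role of opposition-based learning in the experiments above can be illustrated with a small sketch. For a binary feature mask, the opposite of a solution x under the usual definition lb + ub - x (with lb = 0 and ub = 1) is the mask with every bit flipped. The selection rule below, which keeps the best half of the population and its opposites, is a common opposition-based scheme and is an assumption here; the exact placement of the step between the whale optimization and flower pollination iterations is described elsewhere in the paper. The target mask and the Hamming-distance fitness are hypothetical stand-ins for the real objective.

```python
def opposite(mask):
    """Opposite point in {0, 1}^n: lb + ub - x with lb = 0 and ub = 1."""
    return [1 - bit for bit in mask]

def obl_step(population, fitness):
    """Keep the len(population) best solutions among the population and its opposites.

    Fitness is minimised. sorted() is stable, so ties keep the original ordering.
    """
    candidates = population + [opposite(m) for m in population]
    return sorted(candidates, key=fitness)[:len(population)]

# Hypothetical fitness: Hamming distance to a hidden "ideal" mask.
TARGET = [1, 0, 1, 1, 0]

def toy_fitness(mask):
    return sum(a != b for a, b in zip(mask, TARGET))

population = [[0, 1, 0, 0, 1], [1, 1, 1, 0, 0]]
improved = obl_step(population, toy_fitness)
print(improved)
```

In this toy run the first individual is as far from the target as possible, so its opposite is the target itself; this is the situation in which checking the opposite solution pays off most.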

Comparison with the State-of-the-Art Approaches
In the previous section (5-1-1), the results on the 10 UCI datasets showed that the proposed method, which applies opposition-based learning to the whale optimization and flower pollination algorithms, achieves better classification accuracy than those two algorithms on all datasets and is competitive in terms of the number of selected features. In this sub-section, in addition to the whale optimization and flower pollination algorithms, we compare the proposed method with powerful metaheuristic methods, namely the genetic algorithm (GA), particle swarm optimization, and the bat algorithm, to show how the proposed method performs against other strong methods. In this experiment, all compared algorithms are run with a maximum of 80 iterations and a population size of 10. Table (4) shows the results of comparing the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of classification accuracy. According to Table 4, the proposed algorithm is highly effective in terms of accuracy. It performs better than all other methods on all datasets except three, on which GA and other methods perform slightly better and the proposed method ranks second. Table 5 shows the results of comparing the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the average number of selected features. As Table (5) shows, the proposed algorithm performs well in terms of the average number of selected features compared to the other advanced metaheuristic algorithms.
In addition, the proposed algorithm performs better than all other methods on most datasets; on the few datasets where another method is better, the difference is small and the proposed method ranks second. In this section, we also compare the convergence of the objective function defined in Equation (15) for the whale optimization, flower pollination, genetic, particle swarm optimization, and bat algorithms and for the proposed method on the UCI datasets, to show how the proposed method compares with the other methods in terms of convergence of the objective function. In this experiment, all compared algorithms are run with a maximum of 20 iterations and a population size of 10. The results are given in Figures 9 to 14, each of which shows the convergence of the algorithms on one of the datasets. Figure 9 shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the Blood dataset.

Fig. 9. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset Blood
As Figure 9 shows, the proposed algorithm and the other compared algorithms produce almost the same results. The results are close because the Blood dataset has only 4 features, so each algorithm reaches the optimal answer over multiple runs. The runs on the Blood dataset therefore show that, because the dataset is small, all algorithms compared in this article give similar results. Figure (10) shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the BreastEW dataset.

Fig. 10. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset BreastEW
As Figure 10 shows, the proposed algorithm converges on the objective function better than the other compared algorithms. This is because the BreastEW dataset has more features, and each time the proposed algorithm draws on the opposite solution space it gains a considerable advantage over the other algorithms. The results on the BreastEW dataset therefore show that, because the dataset is large, the compared algorithms lose efficiency, whereas the proposed algorithm obtains much better results by using the opposite solution space. Figure 11 shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the CMC dataset.

Fig. 11. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset CMC
As Figure 11 shows, the proposed algorithm converges on the objective function better than the other compared algorithms. In this experiment, the proposed algorithm gives better results than the genetic, particle swarm optimization, and bat algorithms in the early iterations and, by making further use of solutions and their opposite solution space, gives better results than all compared algorithms in the final iterations. Figure (12) shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the Glass dataset.

Fig. 12. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset Glass
As Figure (12) shows, the proposed algorithm converges on the objective function somewhat better than the other compared algorithms. In this experiment, the proposed algorithm gives acceptable results in the early iterations compared with the whale optimization and flower pollination algorithms and, by making further use of solutions and their opposite solution space, gives better results than all compared algorithms in the final iterations. Figure 13 shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the Heart dataset.

Fig. 13. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset Heart
As Figure 13 shows, the proposed algorithm performs similarly to the whale optimization and particle swarm optimization algorithms. Because the Heart dataset has a medium number of features, some algorithms, such as whale optimization and particle swarm optimization, also converge well on the objective function. Figure (14) shows the comparison of the proposed method with the whale optimization, genetic, particle swarm optimization, bat, and flower pollination algorithms in terms of the convergence of the objective function on the Ionosphere dataset.

Fig. 14. Comparison between the proposed approaches and the state-of-the-art approaches in terms of convergence-Dataset Ionosphere
As Figure 14 shows, the proposed algorithm converges on the objective function better than the other compared algorithms. This is because the Ionosphere dataset has more features, and each time the proposed algorithm draws on the opposite solution space it gains a considerable advantage over the other algorithms. The results on the Ionosphere dataset therefore show that, because the dataset is large, the compared algorithms lose efficiency, whereas the proposed algorithm obtains much better results by using the opposite solution space.
According to the results reported in sub-sections (5.1.1) and (5.1.2), the proposed algorithm has a significant advantage over the other compared metaheuristic algorithms in selecting fewer features, in classification accuracy, and in convergence of the objective function. By combining the two algorithms and using opposition-based learning, we obtained a powerful algorithm with high convergence speed and accuracy for the feature selection optimization problem.
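The contrast drawn above between the small Blood dataset and the larger datasets follows from the size of the search space, which a short calculation makes concrete. With n features there are 2^n - 1 non-empty feature subsets. The sketch below compares that count with the evaluation budget of the convergence experiment (population 10, 20 iterations). It assumes one fitness evaluation per individual per iteration, which is a simplification, and the value of 30 features is only an illustrative figure for a larger dataset; Table 1 is the authority for the actual feature counts.

```python
def subset_count(n_features: int) -> int:
    """Number of non-empty feature subsets of n_features features."""
    return 2 ** n_features - 1

def max_evaluations(population: int, iterations: int) -> int:
    """Evaluation budget, assuming one fitness call per individual per iteration."""
    return population * iterations

# Settings of the convergence experiment: population 10, 20 iterations.
budget = max_evaluations(population=10, iterations=20)

for n in (4, 30):  # 4 = Blood (from the text); 30 = illustrative larger dataset
    total = subset_count(n)
    coverage = min(1.0, budget / total)
    print(f"{n} features: {total} subsets, budget covers at most {coverage:.2e} of them")
```

With 4 features the budget exceeds the 15 possible subsets, so any reasonable search can reach the optimum; with tens of features only a vanishing fraction can be visited, which is where differences between search strategies become visible.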

Email Spam Detection
In the previous section (5-1), we ran the proposed algorithm on 10 UCI datasets, and the results showed that it had a significant advantage over the other metaheuristic algorithms in selecting fewer features and in classification accuracy. These strong results motivated us to also run the proposed algorithm on a spam email dataset, to gain more confidence in its performance. For this experiment we used the Spambase email dataset, described in more detail in section (5-2-1). We split the data into training and test sets using stratified sampling, so that both sets contain spam and non-spam emails, to better evaluate the performance and efficiency of the proposed algorithm and of the other algorithms compared in this section. The split between training and test data was as follows:
• 70% of the dataset was used for training and building the proposed model.
• The remaining 30% of the dataset was used for testing and validating the model.
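A stratified 70/30 split of the kind described above can be sketched in a few lines. The function name, the random seed, and the rounding rule are assumptions of this example; the class counts (1,813 spam and 2,788 non-spam messages) are those reported for the dataset in this paper.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx) so that each class is split at about train_frac."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)                      # random assignment within the class
        cut = round(train_frac * len(members))    # per-class training size
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return train_idx, test_idx

# Labels only: 1 = spam, 0 = non-spam, with the class counts reported for the dataset.
labels = [1] * 1813 + [0] * 2788
train_idx, test_idx = stratified_split(labels)

spam_in_train = sum(labels[i] for i in train_idx)
spam_in_test = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), spam_in_train, spam_in_test)
```

Splitting within each class, rather than over the whole dataset at once, keeps the spam share of both sets close to the 39% of the full corpus.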

Spam base dataset analysis
The benchmark corpus is the Spambase email dataset. The corpus contains 4,601 messages, of which 1,813 (39%) are labeled as spam and 2,788 (61%) as non-spam [44][45][46][47][48][49]. The non-spam messages were taken from a single mailbox. Unlike most corpora, which are distributed in raw form, this corpus has already been pre-processed. Instances are represented as 58-dimensional vectors. Of the 57 attributes, 48 are words drawn from the original messages, with no stemming or stop list applied, selected as the most unbalanced words for the spam class. A further 6 attributes are the percentages of occurrence of the special characters ";", "(", "[", "!", "$" and "#", and the remaining 3 attributes measure runs of capital letters. The last element of each vector is the class label, with spam and non-spam encoded as 1 and 0. Overall, the corpus is a good test bed.
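The layout of one 58-dimensional instance can be made explicit with a small parsing sketch. It assumes the comma-separated format in which the dataset is usually distributed by the UCI repository; the three capital-run-length attributes are taken from the UCI documentation of the dataset, and the type name, function name, and example row are hypothetical.

```python
from typing import List, NamedTuple

class SpambaseRow(NamedTuple):
    word_freq: List[float]     # 48 word-frequency attributes
    char_freq: List[float]     # 6 special-character frequency attributes
    capital_runs: List[float]  # 3 capital-letter run-length attributes
    is_spam: int               # class label: 1 = spam, 0 = non-spam

def parse_row(line: str) -> SpambaseRow:
    """Split one comma-separated record into its attribute groups."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != 58:
        raise ValueError(f"expected 58 values, got {len(values)}")
    return SpambaseRow(values[:48], values[48:54], values[54:57], int(values[57]))

# Synthetic record for illustration only (not taken from the dataset).
example = ",".join(["0.1"] * 48 + ["0.5"] * 6 + ["3", "10", "50"] + ["1"])
row = parse_row(example)
print(len(row.word_freq), len(row.char_freq), len(row.capital_runs), row.is_spam)

# Class shares reported for the corpus.
spam, ham = 1813, 2788
print(round(100 * spam / (spam + ham)), round(100 * ham / (spam + ham)))
```

Only the 57 attributes are candidates for feature selection; the label is kept aside and used to score each candidate subset.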

Experimental results and discussion
In this section, we present the results of running the proposed algorithm and the other compared algorithms on the spam email dataset and then discuss them. We split the spam email dataset into a test set (30%) and a training set (70%), and both sets contain spam and non-spam emails so that the performance and efficiency of the proposed algorithm and the other compared algorithms can be evaluated properly. For all algorithms, the initial population size is 10 and the number of iterations ranges from 1 to 100. We first compare the proposed algorithm with WOA and FPA in terms of classification accuracy and average classification accuracy over different numbers of iterations, and then compare the proposed algorithm with the other algorithms on the spam email dataset in terms of classification accuracy over different numbers of iterations.
In the first experiment of this sub-section, WOA, FPA, and the proposed HWOAFPA are compared in terms of the accuracy of detecting spam emails. The number of iterations ranges from 1 to 100 and the population size is 10.
The results of the first experiment show that, as the number of iterations increases, the proposed algorithm achieves much better accuracy in spam email detection than the whale optimization and flower pollination algorithms. In the first iterations the proposed algorithm performs like the other algorithms, but as the number of iterations increases it pulls ahead of the whale optimization and flower pollination algorithms, because it makes greater use of opposition-based learning to create solutions in the opposite search space. As a further test, in addition to classification accuracy, we examined the average accuracy of spam email detection, computed over the entire population of the proposed algorithm, WOA, and FPA. In the second experiment, WOA, FPA, and HWOAFPA are therefore compared in terms of the average accuracy of spam email detection. The number of iterations ranges from 1 to 100 and the population size is 10.
The results of the second experiment show that, as the number of iterations increases, the proposed algorithm achieves a much better average spam detection accuracy than the whale optimization and flower pollination algorithms. This experiment indicates that, over the iterations, the proposed algorithm updates its entire population and improves all the solutions in the search space. Having compared the proposed algorithm with the whale optimization and flower pollination algorithms and shown its superiority in classification accuracy and average classification accuracy, we further evaluate it against other metaheuristic algorithms, namely the genetic algorithm, particle swarm optimization, and the bat algorithm, in terms of classification accuracy over different numbers of iterations.
The results of the third experiment show that the proposed algorithm detects spam emails more accurately than the compared algorithms, namely the genetic, particle swarm optimization, bat, whale optimization, and flower pollination algorithms. The results of these three experiments show that the combination in the proposed method works well for spam email detection. By using opposition-based learning within the hybrid, we obtained a powerful algorithm for detecting spam emails.
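The first two experiments above report two different quantities: the accuracy of the best solution in the population and the accuracy averaged over the entire population. The sketch below shows how both curves could be recorded per iteration. The accuracy values are hypothetical and are not results from the paper.

```python
def summarize(population_accuracies):
    """Return (best, mean) accuracy of one population snapshot."""
    best = max(population_accuracies)
    mean = sum(population_accuracies) / len(population_accuracies)
    return best, mean

# Hypothetical accuracies of a 3-member population at two successive iterations.
history = [
    [0.80, 0.70, 0.60],
    [0.90, 0.85, 0.80],
]

curves = [summarize(accs) for accs in history]
best_curve = [best for best, _ in curves]
mean_curve = [mean for _, mean in curves]
print(best_curve)
print(mean_curve)
```

A rising best-accuracy curve shows only that one solution improved, whereas a rising mean-accuracy curve shows that the population as a whole moved toward better feature subsets, which is the point made by the second experiment.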

Conclusion
A feature selection method should search for the best subset of features among all possible subsets in the dataset and select these features based on a specific evaluation criterion. Applying feature selection to the dataset before the learning process plays an essential role in improving the efficiency of classification. Finding this optimal subset can be treated as an optimization problem, and random, heuristic, and metaheuristic search methods can be used to find optimal or near-optimal subsets. Importantly, when the search space is large, selecting a subset of features with traditional and heuristic optimization methods is neither effective nor efficient, and many researchers have so far proposed metaheuristic algorithms as the most suitable option for this problem. Metaheuristic algorithms may also be combined to increase the convergence rate and the accuracy of the resulting solution.
In this paper, we addressed the feature selection optimization problem using the natural processes of the whale optimization and flower pollination algorithms, and we used opposition-based learning to ensure the convergence speed and accuracy of the proposed algorithm. In the proposed method, the whale optimization algorithm uses its encircling-prey, bubble-net attack, and search-for-prey processes to create solutions in its search space and improve the solutions to the feature selection problem. Alongside it, the flower pollination algorithm, with its global and local search processes, improves solutions to the feature selection problem in the space of solutions opposite to those of the whale optimization algorithm. In this way, we used solutions from both the search space and the opposite search space of the feature selection problem. To evaluate the performance of the proposed algorithm, we performed experiments in two stages. In the first stage, experiments were performed on 10 feature selection datasets from the UCI data repository. In the second stage, we tested the performance of the proposed algorithm on spam email detection. The results of the first stage show that, on the 10 UCI datasets, the proposed algorithm was more successful than other basic metaheuristic algorithms in terms of the average number of selected features and classification accuracy. The results of the second stage show that the proposed algorithm detected spam emails more accurately than other modern metaheuristic algorithms.