Prediction of Combine Harvester Performance Using Hybrid Machine Learning Modeling and Response Surface Methodology

Automated control of harvesting systems can significantly increase the efficiency of agricultural practices and prevent food waste. Modeling and improving the combine harvester can increase its overall performance. Machine learning methods offer advanced modeling for accurately predicting the machine's best performance. In this study, combine harvesting is modeled using the radial basis function (RBF) network and the hybrid machine learning method of the adaptive neuro-fuzzy inference system (ANFIS) to predict several performance variables of the combine harvester. Response surface methodology (RSM) is also used to optimize the models. The comparative study shows that the ANFIS method outperforms the RBF method.


Introduction
A large share of agricultural products is lost for various reasons during production, harvesting, and consumption, even though considerable cost, energy, and labor go into producing them and their production puts pressure on the environment. Losses of agricultural products, both in quantity and in quality, are substantial in Iran and cause great damage to the agricultural sector. Efforts to reduce losses of agricultural products are more important and less costly than efforts to increase production [1]. Identifying the parameters that affect harvesting is the first important step toward controlling and reducing these losses. These parameters include the time of harvesting, the type of harvesting (mechanized or manual), the correct settings of the harvesting machines, and transport to the target market. Given the unique importance of machines as a power source in agricultural production systems, evaluating the mechanisms and performance of equipment is an unavoidable management priority in agricultural units.
Combine harvesters hold a special position in the harvest of agricultural crops [2] because of their sensitive mechanisms and the processes they apply to strategic grain products [3]. Adjusting and optimizing the internal components of this machine is very important [4]. The factors behind improper functioning of combines, or of any other machine, can be identified through kinematic and dynamic analysis. These factors fall into three groups: geometric parameters, working conditions, and product properties. An inappropriate level of any of them reduces the performance of the combine [1]. Each component of a combine affects the flow and movement of the product according to its geometry and the properties of the product. Geometric changes affect machine performance, so predicting and modeling these changes is an effective step in machine design and in the production of new components [5]. The present research studies the effect of three parameters, namely the clearance of the threshing unit, the fan speed, and the openness of the sieves, on product damage, product loss, and the amount of non-grain material for conventional combines. According to Spengler et al. [6], optimization of the threshing unit reduced total machine loss in Germany by 4 to 6 % in 1986. Harvesting with a combine harvester is a complex nonlinear process [7] that is affected by a wide range of data [8].
Many mathematical models are used to relate the inputs and outputs of a process, but classical logic requires precisely defined mathematical relationships to describe phenomena [9]. Artificial intelligence methods such as artificial neural networks and fuzzy logic make it possible to compare and analyze very complex data when developing predictive models [10]. One of these methods is the fuzzy method. Among its many benefits is the ability to process imprecise data. The fuzzy method provides linguistic labels for modeling complex systems [11]. The artificial neural network is another processing method, used in a wide range of fields such as mathematics, engineering, medicine, economics, the environment, and agriculture [12]. ANFIS combines the advantages of both approaches. It is widely used for modeling complex systems, for control, and for parameter estimation. ANFIS is built on a hybrid learning algorithm that identifies the parameters of a Sugeno fuzzy inference system. The algorithm combines least squares and error backpropagation to train the membership functions on the full training data set so as to achieve the best output [11].
Combine harvesters harvest a wide range of products under various environmental conditions [4]. Their variable-rate feature enables them to harvest in a wide range of conditions [7]. A combine harvester has five general functions: harvesting and feeding, threshing, separating, cleaning, and loading the product. The threshing unit separates about 60 to 90 % of the threshed seeds from the clusters [13]. The separating unit follows the threshing unit and separates the grain from the other parts of the product. Cleaning the seeds of impurities is the final stage of the separation process and is performed mechanically and aerodynamically at the same time. Finally, the cleaned seeds are poured onto the clean-grain conveyors and transferred to the tank by grain elevators. Simulating and modeling the system of interest (the combine in this study) allows a better judgment of the performance of its various sections [14,15]. Threshing is one of the essential processes in the harvesting stage and is evaluated by factors including thresher efficiency, cleaner efficiency, damaged seeds, and chopped stalks [1]. Conventional mathematical tools such as differential equations are not suitable for modeling systems whose behavior is unclear or not well defined [16]. Unlike classical systems, intelligent control systems do not require a mathematical model of the system's behavior. Nowadays, intelligent and soft-computing-based systems are used in all scientific fields [17]. Prediction methods based on soft computing and intelligent techniques, whose use has grown recently, therefore help to evaluate such systems [18]. The most popular of these methods are fuzzy methods and artificial neural networks [19]. Several studies have addressed modeling, studying, or optimizing the performance of combine harvesters.
Reviewing some of them helps to define the aim and novelty of the present study. Craessaerts et al. [4,20] studied a genetic-based methodology for input selection to identify the cleaning process on a combine harvester in two parts: the first selected the input variables for identifying the sieve losses, and the second for identifying the MOG content in the grain bin. Maertens and De Baerdemaeker [21] prepared a dynamic separation model to avoid the use of nonlinear, complex, and uncertain relations. Zhao et al. [22] presented an indirect method for monitoring grain separation loss in the separation unit, based on an analysis of the relationship between grain separation loss and grain separation flux in the area under the concave. Mirza Zadeh et al. [23] employed a multi-layer perceptron ANN to predict the grain separation of the combine harvester. They used feed rate, stem height, rotational speed of the thresher, and clearance ratio as the independent variables, and their results showed a correlation coefficient of 0.9. Maertens et al. [24,25] developed an analytical grain flow model for a combine harvester in two parts, the first covering model design and the second covering analysis and application of the model. Miu and Kutzbach [26,27] carried out a two-part study on modeling the threshing and separation units of the combine harvester. These studies were based on mathematical modeling and found a good correlation between predicted and experimental data. Miu [28] studied the design of an optimized threshing process using a genetic algorithm. A multi-objective genetic algorithm was formulated to optimize the functional parameters of threshing units, and the method was found to be adaptable to other threshing units and various crops.
Miu and Kutzbach [29] simulated the threshing and separation process in the threshing unit of a combine harvester. Two models were developed: one describes the percentage of unthreshed grains, and the other quantifies the cumulative percentages of separated MOG, separable MOG, and unfragmented MOG. Ryszard and Jachimczyk [30] developed a mathematical model of grain separation in a straw walker. According to their results, the kinematic parameters of the walkers significantly affected the quality of separation. Bulgakov et al. developed a mathematical model, based on integral equations, for the renewal of the combine harvester fleet. In a two-part study, Craessaerts et al. addressed the identification of the cleaning process on combine harvesters: the first part developed a fuzzy model for predicting the material other than grain (MOG) content in the grain bin, and the second part predicted the sieve losses.
As the previous studies show, only a limited number of studies have applied soft computing methods to modeling and studying combine harvester performance, even though such methods can reduce the complexity of the system and process and increase precision. The present study applies ANFIS and RBF models, which have not previously been used for modeling the combine harvester, and evaluates them for this purpose. Moreover, the RSM method is used to optimize product loss, MOG content, and broken seeds simultaneously. So far, no optimization study with these dependent parameters has been performed with the RSM method, and this is the main novelty of the present work.

Materials and Methods
The required data were measured and collected from a John Deere 1055i combine at the Agricultural Research Station of Ardabil Province, Iran. The cleaning and threshing units had the following specifications: a threshing drum (TD) with a diameter of 610 mm, a length of 1080 mm, 8 blades, and a rotational speed of 410 to 1160 rpm.
In the present study, an RBF artificial neural network was employed to model the relationship between the dependent and independent variables. An RBF network is a three-layer feed-forward network whose structure is presented in Fig. 1. The input layer consists of neurons that distribute the input signals to the hidden-layer neurons. The hidden-layer neurons perform a nonlinear mapping of the input space into the hidden-layer space, which have equal dimensions, according to Eq. 1 [19,65,66]. The hidden-layer outputs are then multiplied by their weights and passed to a summation unit that generates the network outputs. In the present study, the Gaussian function was employed to define the neurons as local receivers.
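For reference, a commonly used form of the Gaussian hidden neuron and of the linear output layer of an RBF network is sketched below. The symbols c_j (centre), sigma_j (width), w_kj (output weight), b_k (bias), and m (number of hidden neurons) are notation assumed here for illustration and may differ from the notation of Eq. 1.

```latex
\varphi_j(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{c}_j\rVert^{2}}{2\sigma_j^{2}}\right),
\qquad
y_k(\mathbf{x}) = \sum_{j=1}^{m} w_{kj}\,\varphi_j(\mathbf{x}) + b_k
```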

Network training
Experimental data from the target combine harvester were used to train the network. The trained network must relate the inputs to the outputs so that it can predict and model the behavior of the system. The factors A, B, and C were therefore taken as the input (independent) variables, and BS, PL, and MOG as the output (dependent) variables of the network. These factors were selected because they are adjustments available to the operator and because of their effect on system performance. 70 % of the data were used for training and 30 % for testing. The aim of training is to reduce the error between the target values and the network outputs. The mean squared error (MSE) was used to compare the target values with the network outputs. Training started with five neurons in the hidden layer, and five more neurons were added at each subsequent run. Neurons were added to the hidden layer for as long as the error kept decreasing, and the process stopped once the error levelled off.
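The training procedure described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in the study: the data are synthetic stand-ins for the measurements, the centres are simply taken from the training samples, and the Gaussian width, the 1 % stopping tolerance, the upper limit of 40 neurons, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_layer(X, centers, width):
    """Hidden layer: Gaussian response of each sample to each centre."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def fit_output_weights(X, Y, centers, width):
    """Output layer: linear weights plus bias, solved by least squares."""
    H = np.hstack([gaussian_layer(X, centers, width), np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W

def predict(X, centers, W, width):
    H = np.hstack([gaussian_layer(X, centers, width), np.ones((len(X), 1))])
    return H @ W

def mse(target, output):
    return float(np.mean((target - output) ** 2))

rng = np.random.default_rng(0)
# Synthetic stand-in data (NOT the measured data): three inputs playing the
# role of A, B, C scaled to [0, 1], three outputs playing BS, PL, MOG.
X = rng.uniform(0.0, 1.0, size=(120, 3))
Y = np.column_stack([
    np.sin(np.pi * X[:, 0]) + 0.5 * X[:, 1],
    X[:, 1] ** 2 + 0.3 * X[:, 2],
    np.cos(np.pi * X[:, 2]) * X[:, 0],
])

# 70 % of the data for training, 30 % for testing
order = rng.permutation(len(X))
n_train = round(0.7 * len(X))
X_train, Y_train = X[order[:n_train]], Y[order[:n_train]]
X_test, Y_test = X[order[n_train:]], Y[order[n_train:]]

WIDTH = 0.5        # assumed Gaussian width
TOLERANCE = 0.01   # assumed: stop when the relative MSE gain drops below 1 %
MAX_HIDDEN = 40    # assumed upper limit

history = {}
n_hidden = 5
while n_hidden <= MAX_HIDDEN:
    centers = X_train[:n_hidden]   # keep the earlier centres, add five more
    W = fit_output_weights(X_train, Y_train, centers, WIDTH)
    history[n_hidden] = mse(Y_train, predict(X_train, centers, W, WIDTH))
    previous = history.get(n_hidden - 5)
    if previous is not None and previous - history[n_hidden] < TOLERANCE * previous:
        break                      # error has levelled off
    n_hidden += 5
n_hidden = min(n_hidden, MAX_HIDDEN)

test_mse = mse(Y_test, predict(X_test, centers, W, WIDTH))
print(f"selected {n_hidden} hidden neurons, test MSE = {test_mse:.3e}")
```

Because each run reuses the earlier centres and adds five new ones, the training error cannot grow from one run to the next, which makes the levelling-off criterion easy to apply.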

ANFIS modeling
The adaptive neuro-fuzzy inference system (ANFIS) is an approach for modeling nonlinear, complex problems that uses the Sugeno model with fuzzy inputs and rules to provide a strong prediction tool [67]. ANFIS is a class of adaptive feed-forward networks with five layers (Fig. 2). The system generates fuzzy rules from input and output data (i.e., the training data). A simple rule in the Sugeno fuzzy model is as follows:

If x is Ai and y is Bi, then z = f(x, y)
where Ai and Bi are fuzzy sets and z = f(x, y) is usually a polynomial function [68]. In this study, the training and testing data used to develop the ANFIS model were the same as those used for the RBF model. ANFIS was developed in MATLAB 2012a. A hybrid learning algorithm was used to adjust the initial membership functions. To determine the best network, ANFIS models were developed with different types of membership functions. The triangular membership function (trimf) with a linear output type and three membership functions gave the best response according to the comparison criteria.
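The forward pass of such a first-order Sugeno system can be sketched in a few lines of NumPy. This is an illustrative sketch, not the MATLAB model of the study: it assumes three inputs scaled to [0, 1], three fixed triangular membership functions per input (27 rules), and synthetic data. Only the least-squares half of the hybrid algorithm is shown, so the membership parameters are kept fixed rather than tuned by backpropagation. All names are hypothetical.

```python
import itertools
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Assumed premise parameters: three triangles covering inputs scaled to [0, 1].
MF_PARAMS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]

def normalized_firing(X):
    """Layers 1 to 3: fuzzify, multiply memberships per rule, normalize."""
    # mu[i, j, m]: membership of sample i, input j in fuzzy set m
    mu = np.stack([trimf(X, *p) for p in MF_PARAMS], axis=2)
    rules = itertools.product(range(len(MF_PARAMS)), repeat=X.shape[1])
    w = np.stack(
        [np.prod([mu[:, j, m] for j, m in enumerate(rule)], axis=0) for rule in rules],
        axis=1,
    )
    return w / w.sum(axis=1, keepdims=True)

def design_matrix(X):
    """Layer 4 inputs: normalized firing strength times [x1, x2, x3, 1] per rule."""
    wbar = normalized_firing(X)
    X1 = np.hstack([X, np.ones((len(X), 1))])
    return (wbar[:, :, None] * X1[:, None, :]).reshape(len(X), -1)

def fit_consequents(X, y):
    """Least-squares estimate of the linear consequent parameters."""
    theta, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
    return theta

def predict(X, theta):
    """Layer 5: weighted sum of the rule outputs."""
    return design_matrix(X) @ theta

# Synthetic demonstration data (not the measured data)
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.5 * X_demo[:, 2] + 1.0
theta = fit_consequents(X_demo, y_demo)
rmse_demo = float(np.sqrt(np.mean((predict(X_demo, theta) - y_demo) ** 2)))
print(f"training RMSE on the synthetic target: {rmse_demo:.2e}")
```

With three membership functions on each of three inputs the rule base has 27 rules and 108 consequent parameters, so the number of training samples should comfortably exceed that figure for the least-squares step to be well posed.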
The comparison criteria were the root mean square error (RMSE), the correlation coefficient (r), and the mean absolute error (MAE), computed between the target values and the network outputs.
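The standard definitions of these three criteria are given below. The index i and the sample means (written with an overbar) are notation introduced here for illustration.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(A_i - P_i\right)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|A_i - P_i\right|,
\qquad
r = \frac{\sum_{i=1}^{N}\left(A_i-\bar{A}\right)\left(P_i-\bar{P}\right)}
         {\sqrt{\sum_{i=1}^{N}\left(A_i-\bar{A}\right)^{2}\,\sum_{i=1}^{N}\left(P_i-\bar{P}\right)^{2}}}
```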
Here, A denotes the actual values, P the predicted values, and N the number of data points. These criteria help to select the best network structure and indicate how closely the model follows the measured data.

Results
The present study was performed on a combine harvester, and the data were recorded experimentally. According to Figure 4 (a), with the other parameters held fixed, opening the hot water tap increases the growing hall temperature. Figure 4 (b) shows the variation of the growing hall temperature when the air dampers are opened and closed with the other parameters fixed. Accordingly, if the circulation and fresh-air dampers are opened and closed at equal rates, the hall temperature remains almost constant. In Figure 4 (c), as the water temperature falls and the ambient temperature rises during the day, the growing hall temperature follows a constant trend.

Training stage
The main aim was to develop and present a model based on the RBF and ANFIS methods. The training stage is one of the essential steps in preparing a precise model.
The RBF and ANFIS models were trained and the results were extracted. These results help to choose the best model to carry forward to the testing stage. Prediction was performed with the ANFIS and RBF networks. For modeling, BS, PL, and MOG were taken as the dependent variables (network outputs) and A, B, and C as the independent variables (network inputs). The training portion of the experimental data was used to develop the networks. This stage was performed to obtain a precise network for the testing stage. The training results for the RBF and ANFIS methods are presented in Tables 1 and 2, respectively.

Testing stage
The test data were fed to the network selected in the training stage. The network outputs were compared with the target data, and the comparison results are presented in Table 3. .96 e-4 and r values of 0.999, 0.999, and 0.999 for BS, PL, and MOG, respectively, gave the best result, with a lower RMSE and a higher r than the RBF network. ANFIS was therefore selected as the best prediction model in the present study.

Optimization
Optimization was performed with response surface methodology (RSM). RSM is a statistical method for finding the relationships between several explanatory (input) variables and one or more response (output) variables. Its main idea is to use a sequence of designed experiments to obtain an optimal response [69].
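The core of an RSM analysis, fitting a second-order polynomial to designed experiments and searching the fitted surface for its optimum, can be sketched as follows. The sketch rests on stated assumptions: a full 3^3 factorial design in coded levels (-1, 0, 1), a single synthetic response with a known minimum, and a plain grid search. The design, the response, and the function names are hypothetical and are not taken from the study, which optimizes three responses simultaneously.

```python
import itertools
import numpy as np

def quadratic_terms(X):
    """Model columns: 1, x_i, x_i*x_j (i < j), x_i^2."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in itertools.combinations(range(k), 2)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def fit_surface(X, y):
    """Least-squares coefficients of the second-order response surface."""
    beta, *_ = np.linalg.lstsq(quadratic_terms(X), y, rcond=None)
    return beta

# Assumed design: full 3^3 factorial in coded factor levels
levels = [-1.0, 0.0, 1.0]
design = np.array(list(itertools.product(levels, repeat=3)))

# Synthetic response with a known minimum (illustration only)
true_optimum = np.array([0.2, -0.3, 0.5])
response = 1.0 + ((design - true_optimum) ** 2).sum(axis=1)

beta = fit_surface(design, response)

# Search the fitted surface on a grid inside the experimental region
axis = np.linspace(-1.0, 1.0, 21)
grid = np.array(list(itertools.product(axis, repeat=3)))
predicted = quadratic_terms(grid) @ beta
best = grid[np.argmin(predicted)]
print("coded factor levels at the fitted minimum:", np.round(best, 2))
```

For several responses, one common option is to combine the fitted surfaces into a single desirability score before searching. That step is omitted here.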

Conclusions
In the present study, the performance factors of the combine harvester, namely BS, PL, and MOG, were modeled as functions of three factors, A, B, and C, using the ANFIS and RBF methods. Statistical analysis of the functional parameters with the correlation test showed that the relationship between these parameters was significant at the 5 % probability level. The modeling results improved steadily as the number of neurons in the hidden layer increased. The best result and the highest correlation value were obtained with 20 neurons in the hidden layer. Given its high adaptability and low error, the RBF network is of great value for system modeling.