Machine Learning Approaches for Accurate Prediction of Relative Humidity Based on Temperature and Wet-Bulb Depression

The main parameters for calculating relative humidity are the wet-bulb depression and the dry-bulb temperature. In this work, easy-to-use predictive tools based on statistical learning concepts, i.e., the Adaptive Network-Based Fuzzy Inference System (ANFIS) and the Least Square Support Vector Machine (LSSVM), are developed for calculating relative humidity in terms of wet-bulb depression and dry-bulb temperature. To evaluate these models, statistical analyses were carried out between the actual and estimated data points. The results showed that the models can calculate relative humidity for diverse values of dry-bulb temperature and wet-bulb depression. The obtained values of MSE and MRE were 0.132 and 0.931 for the LSSVM approach and 0.193 and 1.291 for the ANFIS approach, respectively. The developed tools are user-friendly and can be of great value to scientists, especially those who deal with air conditioning and wet cooling tower systems, for checking relative humidity in terms of wet-bulb depression and dry-bulb temperature.


Introduction
The main factors that control the amount of moisture in air are temperature and pressure. As the air temperature increases, the amount of water vapor the air can hold also increases [1,2]. The parameter most widely used in practice to characterize air is the dry-bulb temperature, which is simply the air temperature. Because the temperature read by an ordinary thermometer does not depend on the air humidity, it is called the dry-bulb temperature [3,4]. Conversely, the wet-bulb temperature is obtained with a wetted thermometer exposed to the airflow. It reflects the rate of evaporative cooling, i.e., cooling through the removal of moisture from a surface, and has practical applications for engineers who deal with air conditioning and wet cooling tower systems. In more detail, as indicated in equation (1), the relative humidity is the ratio of the partial pressure of water vapor in the air-water mixture to the equilibrium vapor pressure of water at the same temperature [8].

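$$\mathrm{RH} = \frac{p_{w}}{p_{w}^{*}} \qquad (1)$$

where $p_{w}$ is the partial pressure of water vapor in the air-water mixture and $p_{w}^{*}$ is the equilibrium vapor pressure of water at the same temperature.
The wet-bulb temperature is approximated through equation (2), and the pressure difference between the wet-bulb and dew-point conditions is usually estimated by equation (3).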
Equation (4) is obtained by combining equations (2) and (3). Here, the psychrometric constant for air is defined as indicated in equation (6).
According to the proposed correlations, the wet-bulb depression is estimated by equation (9), which holds for Ta between 0 and 110 ºC. The term with subscript 1 in that expression is defined separately, and an alternative definition applies when it is below a given threshold. The associated term can then be approximated by equation (3) after substituting the subscript-1 quantity. These equations must be solved iteratively.
Briefly, to determine Twb, the values of the psychrometric constant and the accompanying term are substituted into equation (5). The procedure then continues by substituting Twb1 for Twb with updated values of both quantities.
Consequently, the new approximations are inserted into equation (5), and the procedure is repeated until Twb converges with good accuracy.
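The iterative idea can be illustrated with a short Python sketch. It does not implement equations (2) to (9), which are not reproduced here. Instead it assumes a Magnus-type saturation pressure formula and the standard psychrometer equation, with an assumed coefficient of 6.62e-4 per K and an assumed pressure of 101.325 kPa. All function names are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(t_c):
    """Magnus-type approximation over liquid water, in kPa (assumed formula)."""
    return 0.61094 * math.exp(17.625 * t_c / (t_c + 243.04))


def relative_humidity(t_dry_c, depression_c, pressure_kpa=101.325, psy_coeff=6.62e-4):
    """RH (as a fraction) from dry-bulb temperature and wet-bulb depression."""
    t_wet_c = t_dry_c - depression_c
    # Psychrometer equation: actual vapor pressure from the wet-bulb reading.
    vapor_pressure = (saturation_vapor_pressure_kpa(t_wet_c)
                      - psy_coeff * pressure_kpa * depression_c)
    return max(0.0, vapor_pressure / saturation_vapor_pressure_kpa(t_dry_c))


def wet_bulb_temperature(t_dry_c, rh, tol=1e-6, max_iter=100):
    """Iteratively find Twb (bisection) so that the computed RH matches rh."""
    low, high = t_dry_c - 60.0, t_dry_c
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        # RH grows as the wet-bulb temperature approaches the dry-bulb value.
        if relative_humidity(t_dry_c, t_dry_c - mid) < rh:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return 0.5 * (low + high)


rh_example = relative_humidity(25.0, 5.0)
twb_example = wet_bulb_temperature(25.0, rh_example)
```

Bisection is used here only because it is simple and always converges on a monotonic function; the procedure described above updates its terms by direct substitution instead.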
Bahadori et al. proposed an Arrhenius-type asymptotic exponential function model for predicting relative humidity in terms of dry-bulb temperature and wet-bulb depression. Their correlation is easy to apply and agrees well with real data [9]. The correlation and its coefficients are given in that reference.
Besides this correlation-based method, statistical learning approaches can be beneficial in the present case. Several branches of statistical learning are widely used, e.g., fuzzy logic, ANN, SVM, and ANFIS [10][11][12][13][14].
In the present study, the potential of ANFIS, LSSVM, and two kinds of artificial neural network, known as the MLP and RBF structures, was investigated for calculating relative humidity from wet-bulb depression and dry-bulb temperature. To this end, a databank of data points was gathered from the reference (see Table 1) [15].

ANFIS
ANFIS is a kind of neural network method that is valuable for function approximation problems [16,17]. In other words, an ANFIS structure combines the knowledge obtained from an artificial neural network with a fuzzy logic system. Each ANFIS structure contains parameters, called membership function (MF) parameters, that must be tuned by optimization algorithms. Owing to this special structure, ANFIS is more systematic and depends less on actual data than other machine learning approaches such as the ANN [18]. A typical ANFIS is illustrated in Fig. 1. A common ANFIS is built from five layers. Each layer contains nodes that are specified by their node functions, and the layers are linked by internal connections. As demonstrated in the figure, the inputs of each layer are supplied by the outputs of the preceding layer. It should be noted that the Sugeno type is used as the fuzzy system of the ANFIS method.
In more detail, if the inputs are two parameters denoted x and y and the output is a single parameter, namely fi, the ANFIS rules can be introduced as [16]:

$$\text{Rule } i:\ \text{if } x \text{ is } M_i \text{ and } y \text{ is } N_i, \text{ then } f_i = f_i(x, y), \quad i = 1, 2$$

Here, the terms M and N stand for the fuzzy sets, and the outputs of the first-order fuzzy inference system are denoted by fi(x, y).
The adaptive nodes in the first layer are defined as:

$$O_{1,i} = \mu_{M_i}(x), \quad i = 1, 2$$
$$O_{1,i} = \mu_{N_{i-2}}(y), \quad i = 3, 4$$

where µ(x) and µ(y) represent membership functions.
In the next layer, each node is fixed and is labeled Π:

$$O_{2,i} = \omega_i = \mu_{M_i}(x)\,\mu_{N_i}(y), \quad i = 1, 2$$

where ωi refers to the firing strength of rule i.
The third layer contains fixed nodes labeled N, whose node functions normalize the firing strengths. To that end, the firing strength of the i-th node is divided by the sum of all firing strengths:

$$O_{3,i} = \bar{\omega}_i = \frac{\omega_i}{\omega_1 + \omega_2}, \quad i = 1, 2$$

The fourth layer has adaptive nodes, which are drawn as squares:

$$O_{4,i} = \bar{\omega}_i f_i, \quad i = 1, 2$$

Here, f1 and f2 stand for the fuzzy if-then rules, which are formulated as follows:

$$f_1 = p_1 x + q_1 y + r_1$$
$$f_2 = p_2 x + q_2 y + r_2$$

where pi, qi and ri are the consequent parameters.
In the last layer, the overall output is calculated through:

$$O_{5} = \sum_i \bar{\omega}_i f_i = \frac{\sum_i \omega_i f_i}{\sum_i \omega_i}$$

The output is therefore a linear combination of the consequent parameters, and the final output is given by:

$$f = (\bar{\omega}_1 x)\,p_1 + (\bar{\omega}_1 y)\,q_1 + \bar{\omega}_1 r_1 + (\bar{\omega}_2 x)\,p_2 + (\bar{\omega}_2 y)\,q_2 + \bar{\omega}_2 r_2$$

A typical ANFIS is trained by the hybrid algorithm, which combines the least squares approach with the gradient technique. The consequent parameters are determined by the least squares approach in the forward pass, and the error signals are propagated in the backward pass [19]. In this study, we used a genetic algorithm to calculate the parameters of the ANFIS method.
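The five layers can be traced in a few lines of Python. This is a minimal sketch of the forward pass of a first-order Sugeno ANFIS, not the trained model of this study: the Gaussian membership function, the parameter layout, and all names and numbers are assumptions made for illustration.

```python
import math


def gaussian_mf(value, center, sigma):
    """Assumed Gaussian membership function (the MF type is not stated above)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a first-order Sugeno ANFIS with one rule per entry.

    premise:    list of ((center_x, sigma_x), (center_y, sigma_y)) per rule
    consequent: list of (p, q, r) per rule
    """
    # Layers 1 and 2: membership grades and their product (firing strength).
    firing = [gaussian_mf(x, cx, sx) * gaussian_mf(y, cy, sy)
              for (cx, sx), (cy, sy) in premise]
    # Layer 3: normalize each firing strength by the sum of all of them.
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: first-order rule outputs f_i = p_i*x + q_i*y + r_i.
    rule_outputs = [p * x + q * y + r for p, q, r in consequent]
    # Layer 5: weighted sum of the rule outputs.
    return sum(w * f for w, f in zip(normalized, rule_outputs))


# Hypothetical two-rule system on inputs normalized to [-1, 1].
premise = [((-0.5, 0.5), (-0.5, 0.5)), ((0.5, 0.5), (0.5, 0.5))]
consequent = [(1.0, 0.0, 0.0), (0.0, 1.0, 2.0)]
output = anfis_forward(0.0, 0.0, premise, consequent)
```

At the origin both rules fire equally, so the output is the average of the two rule outputs.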

Least Square Support Vector Machine (LSSVM)
The supervised LSSVM approach was first introduced by Suykens and Vandewalle in 1999 for function approximation and regression problems. If the inputs are denoted by Xi (herein, wet-bulb depression (Dwb) and temperature (T)) and the output by Yi (relative humidity (RH)), the typical LSSVM nonlinear function can be formulated as follows [20]:

$$f(x) = w^{T}\varphi(x) + b$$

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 2 November 2020 doi:10.20944/preprints202002.0075.v2

Here, f represents the relationship between the target (RH) and the inputs (Dwb and T), w is the m-dimensional weight vector, φ(x) maps the inputs into the m-dimensional feature space, and b is the bias term. According to the minimization principle, the regression problem is usually solved by minimizing the following expression:

$$\min_{w,b,e}\ J(w, e) = \frac{1}{2} w^{T} w + \frac{1}{2}\, c \sum_{k=1}^{N} e_k^{2}$$

subject to the following constraints:

$$y_k = w^{T}\varphi(x_k) + b + e_k, \quad k = 1, \dots, N$$

Here, c refers to the margin (regularization) parameter and ek represents the error variable of xk. According to the straightforward derivation of the LSSVM by Suykens, the following expression is obtained [20,21]:

$$f(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b$$

where αk are the Lagrange multipliers and K is the kernel function. Owing to its good performance, the radial basis function (RBF) is usually employed as the kernel function in regression problems. This kind of function can be written as:

$$K(x, x_k) = \exp\left(-\frac{\lVert x_k - x \rVert^{2}}{\sigma^{2}}\right) \qquad (24)$$

In Eq. (24), σ² is the kernel parameter.
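A minimal Python sketch of LSSVM regression is given here to make the formulation concrete. It assumes the usual dual linear system from Suykens' derivation and an RBF kernel of the form exp(-||x - xk||²/σ²). The small solver, the function names, and the toy data are hypothetical and do not come from this study.

```python
import math


def rbf_kernel(a, b, sigma2):
    """RBF kernel exp(-||a - b||^2 / sigma2) (assumed normalization)."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / sigma2)


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def lssvm_fit(inputs, targets, c, sigma2):
    """Solve [[0, 1^T], [1, K + I/c]] [b, alpha]^T = [0, y]^T."""
    n = len(inputs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(inputs[i], inputs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / c  # regularization term on the diagonal
        system.append(row)
    solution = solve_linear(system, [0.0] + list(targets))
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(x, inputs, alphas, bias, sigma2):
    """f(x) = sum_k alpha_k K(x, x_k) + b."""
    return bias + sum(a * rbf_kernel(x, xk, sigma2) for a, xk in zip(alphas, inputs))


# Hypothetical normalized toy points (not data from the study).
toy_inputs = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]
toy_targets = [0.2, 0.4, 0.6, 0.8, 0.5]
bias, alphas = lssvm_fit(toy_inputs, toy_targets, c=1e6, sigma2=1.0)
```

The two tuning parameters, `c` and `sigma2`, are passed in explicitly because they must be chosen before the linear system is solved.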

Artificial neural network (ANN)
As observed in Fig. 3, the multilayer perceptron (MLP) neural network consists of input, hidden, and output layers. There may be one or more hidden layers. Each layer consists of neurons, whose number is specified by optimization techniques or by trial and error. To train the ANN model, the parameters are adjusted until the minimum mean square error (MSE) is obtained. The MLP structure is usually optimized by the back-propagation technique. An optimized number of iterations, called epochs, must be used when determining the weights and biases in order to avoid undertraining and overtraining, both of which lead to poor results in the testing phase [22,23]. The MLP and RBF structures compute their outputs differently, although they are used for similar purposes. Owing to the simple design of RBF frameworks (see Fig. 4), they have some advantages over the MLP structure. This distinction makes the RBF structure more capable of generalization [24].
The RBF framework is a kind of function estimator that uses localized basis functions. It is a feed-forward neural network that builds its model with a supervised training technique. Thanks to its structure, its computations are much simpler and quicker than those of the MLP structure. The RBF configuration is similar to the conventional regularization framework. Three advantageous features of the regularization framework are as follows:
• It can estimate any multivariate continuous function on a compact domain to a satisfactory accuracy, provided that an adequate number of components is used.
• The best solution is determined by minimizing a functional that measures how much the solution oscillates.
• Because the unknown coefficients enter linearly, it has the best-approximation property.
As shown in Figs. 3 and 4, the scheme of the RBF is nearly similar to that of the MLP. It has three layers: an input layer, a hidden layer, and an output layer. However, the main distinctions between these two configurations are:
• The RBF framework has a simpler construction than the MLP.
• The training process of the RBF framework is much simpler than that of the MLP.
• The MLP structure acts globally, and its outcomes are determined by the relationships among all neurons, while the RBF structure acts locally, and its outcomes are determined by specific hidden units in particular local receptive regions.
• The two frameworks classify differently. In the MLP structure, clusters are separated by hypersurfaces, while in the RBF configuration this task is done by hyperspheres.
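The local behaviour of the RBF structure can be seen in a short Python sketch. It assumes Gaussian hidden units with a common spread and a linear output node; the centers, weights, and names are hypothetical.

```python
import math


def rbf_network(x, centers, spread, weights, bias):
    """Gaussian hidden units followed by a linear output node."""
    hidden = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, center))
                 / (2.0 * spread ** 2))
        for center in centers
    ]
    return bias + sum(w * h for w, h in zip(weights, hidden))


# Hypothetical network with two hidden units.
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -1.0]

# Near a center, essentially only that unit contributes to the output.
at_center = rbf_network((0.0, 0.0), centers, 0.1, weights, 0.5)
# Far from every center, all units are silent and only the bias remains.
far_away = rbf_network((5.0, 5.0), centers, 0.1, weights, 0.5)
```

An MLP hidden neuron, by contrast, responds over a whole half-space of the input domain, which is why its behaviour is described as global.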

Development of Models
Precise experimental relative humidity values are required to develop the aforementioned models. In this study, we used the relative humidity, wet-bulb depression, and temperature values recorded in the reference [15]; their ranges are presented in Table 1.
The next stage is the selection of the input and target variables of the models. To that end, the wet-bulb depression and temperature are given as inputs, and relative humidity is regarded as the target parameter. The database used in this study consists of 330 data points. This data set is divided into two sets, namely training and testing sets, to train and test the capability of the proposed models. The training set contains 248 data points, approximately 75% of the total, and the remaining 82 data points were used for testing. In addition, the data points were normalized to the range from -1 to +1 to achieve better performance with these models.
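The preprocessing described above can be sketched in Python as follows. The min-max formula is the usual way to map values to the range from -1 to +1. The random shuffling and the seed are assumptions, since the way the 248 training points were selected is not stated.

```python
import random


def scale_to_unit_range(values):
    """Min-max scaling of a list of numbers to the interval [-1, +1]."""
    low, high = min(values), max(values)
    return [2.0 * (v - low) / (high - low) - 1.0 for v in values]


def split_indices(n_total, n_train, seed=0):
    """Random split of point indices into training and testing sets (assumed)."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(330, 248)
scaled = scale_to_unit_range([10.0, 20.0, 30.0])
```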
As mentioned, four models based on machine learning concepts, namely ANFIS, LSSVM, and two kinds of ANN (the MLP and RBF structures), were developed in the current work to predict relative humidity from wet-bulb depression and dry-bulb temperature. Two parameters of the LSSVM must be adjusted before the training step: the regularization parameter and the kernel parameter. To evaluate the models, statistical analyses including the MSE, MRE, STD, RMSE, and R-squared (R²) were performed between the real data and the model predictions. In the expressions for these analyses, αexp and αcal stand for the experimental data points and those determined by the algorithms, respectively.
In addition, N stands for the number of data points.
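The statistical parameters can be computed as in the following Python sketch. The MSE, RMSE, and R² follow their standard definitions. The forms used for MRE (mean absolute relative error, in percent) and STD (sample standard deviation of the errors) are common choices but are assumptions here, because the exact expressions are not reproduced above.

```python
import math


def evaluation_metrics(actual, predicted):
    """Return MSE, RMSE, MRE (%), STD and R2 for two equal-length lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    # Assumed MRE: mean absolute relative error, expressed in percent.
    mre = 100.0 / n * sum(abs(e) / abs(a) for e, a in zip(errors, actual))
    # Assumed STD: sample standard deviation of the errors.
    mean_error = sum(errors) / n
    std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / ss_total
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MRE": mre, "STD": std, "R2": r2}


# Hypothetical values used only to exercise the function.
metrics = evaluation_metrics([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 31.0, 39.0])
```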

Results and Discussions
The performance of ANFIS, based on the RMSE between the determined and real values of humidity, is shown in Fig. 5. As depicted, the maximum number of iterations was set to 1000, and the optimum RMSE achieved was 0.40504. Fig. 6 illustrates the trained membership functions for the input variables, namely the wet-bulb depression and dry-bulb temperature, respectively. Their values vary between -1 and +1 because of normalization.
We used a linear transfer function in the output layer and a log-sigmoid transfer function in the hidden layer of the suggested MLP-ANN model. In addition, we applied a trial-and-error procedure to obtain the optimum number of neurons in the hidden layer. We used 7 neurons in the hidden layer and trained the MLP-ANN by the back-propagation technique. Details of the suggested algorithms are reported in Table 3.
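The MLP architecture described above (two inputs, seven log-sigmoid hidden neurons, one linear output) corresponds to the forward pass sketched here in Python. The weights are random placeholders rather than the trained values, and back-propagation training is not shown; all names are hypothetical.

```python
import math
import random


def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer with log-sigmoid units and a linear output neuron."""
    hidden = [logsig(sum(w * v for w, v in zip(row, inputs)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Placeholder parameters for 2 inputs and 7 hidden neurons (not trained values).
rng = random.Random(0)
hidden_weights = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(7)]
hidden_biases = [rng.uniform(-1.0, 1.0) for _ in range(7)]
output_weights = [rng.uniform(-1.0, 1.0) for _ in range(7)]
output_bias = 0.0

# Inputs are the normalized wet-bulb depression and dry-bulb temperature.
prediction = mlp_forward([0.2, -0.4], hidden_weights, hidden_biases,
                         output_weights, output_bias)
```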

The regression analysis is shown in Fig. 10 for all models in the training and testing phases.
The R² value is a well-known statistic that indicates the agreement between the model outputs and the real values. When R² = 1, a perfect linear relationship exists between the predicted and real values. Conversely, the closer R² is to zero, the weaker that linear relationship is. A tight clustering of data points around the 45° line indicates the precision of the predictive tools. In addition to the above illustrations, the capability of the aforementioned models was evaluated using the MSE, MRE, STD, RMSE, and R². These statistical parameters are summarized in Table 5.
Furthermore, the relative deviation percentages between the actual and estimated values of relative humidity were examined. The term hat* refers to the warning leverage value, which was 0.027 for the present models. More information regarding the Williams plot can be found in the literature [10,18,25].
Consequently, the capability of the proposed tools was confirmed by the close agreement between the estimated and real data in both the training and testing stages.
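The warning leverage can be checked with a short Python sketch. It assumes the definition commonly used with Williams plots, hat* = 3(p + 1)/n, where p is the number of inputs and n the number of data points. With p = 2 and n = 330 this gives about 0.027, which is consistent with the reported value, although the text does not state which n was used. The hat-value helper for two inputs and its sample rows are hypothetical.

```python
def warning_leverage(n_inputs, n_points):
    """Assumed Williams-plot definition: hat* = 3 (p + 1) / n."""
    return 3.0 * (n_inputs + 1) / n_points


def hat_values(rows):
    """Diagonal of X (X^T X)^-1 X^T for a two-column input matrix X."""
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    det = s11 * s22 - s12 * s12
    # Closed-form inverse of the 2 x 2 matrix X^T X.
    inv = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))
    return [a * (inv[0][0] * a + inv[0][1] * b) + b * (inv[1][0] * a + inv[1][1] * b)
            for a, b in rows]


hat_star = warning_leverage(2, 330)
leverages = hat_values([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 0.5)])
```

Points whose hat value exceeds `hat_star` fall outside the applicability domain in a Williams plot.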

Conclusion
The aim of this study was to investigate the potential of four models based on statistical learning concepts, namely ANFIS, LSSVM, and the RBF and MLP artificial neural networks, for predicting relative humidity from dry-bulb temperature and wet-bulb depression. The membership function parameters of the ANFIS and the LSSVM parameters (i.e., the regularization parameter and the kernel parameter) were determined through the genetic algorithm, which performed well in adjusting these tuning parameters. The estimations closely matched the actual data points. Based on the statistical analyses, the capability of the proposed methods was confirmed by the good agreement between the estimated and real data in the training and testing stages. In addition, the outcomes of the suggested models were compared with a previously reported correlation, which confirmed their accuracy. Unlike complex mathematical approaches for predicting relative humidity, the suggested approaches are user-friendly and should be of considerable help to scientists, especially those working with cooling towers and air conditioning systems.