Systematic review of deep learning and machine learning models in biofuels research

Biofuels constitute an essential pillar of energy systems. They are a popular resource for electricity production, heating, household and industrial usage, liquid fuels, and mobility around the world. Thus, handling, modeling, decision-making, and demand forecasting for biofuels are of utmost importance. Recently, machine learning (ML) and deep learning (DL) techniques have been increasingly applied in modeling, optimizing, and handling biofuels production, consumption, and environmental impacts. The main aim of this study is to review and evaluate ML and DL techniques and their applications in handling biofuels production, consumption, and environmental impacts, both for modeling and optimization purposes. Hybrid and ensemble ML methods, as well as DL methods, have been found to provide higher performance and accuracy in modeling biofuels.


Introduction
Global energy systems are highly dependent on fossil fuels [1,2]. The importance of energy systems and their role in economics and politics is evident to everyone [3,4]. This issue is important not only for the advanced industrialized countries, which are major energy consumers, but also for oil-rich countries [5], because all countries have to recognize that fossil fuel resources are limited. In addition to the polluting substances these fuels contain, their eventual depletion has aggravated the growing concern. Therefore, owing to depleting non-renewable energy resources, pollution, and environmental damage, the world is turning towards renewable energy resources [6]. Fossil fuels nevertheless remain one of the major energy resources worldwide [7]. Heavy dependence on fossil fuels has caused an energy crisis.
Using fossil fuels for economic activities leads to greenhouse gas (GHG) emissions from almost all regions of the world [8]. Renewable resources like biofuels make an attractive contribution towards meeting the growing demand for energy supply [9][10][11]. Owing to environmental concerns and the rise and fluctuations in fossil fuel resources, worldwide interest has moved towards biodiesel, a clean and renewable alternative to fossil fuels [12,13]. Biofuels can be used in different fields of energy production, such as electricity production, power production, or transportation [14]. The economy of biofuels and related refineries will be shaped by the policies that have shaped the economy of hydrocarbons and their refineries over the last century [15][16][17][18]. Due to the environmental benefits of biofuels, their contribution to the automotive fuel market is increasing sharply. Various scenarios have been written about the estimated share of biofuels from different sources in the future energy system. The availability of biofuels for the electricity market, heating, and liquid fuels is very important. Therefore, the need for handling, modeling, decision making, and forecasting for biofuels can be one of the main challenges for scientists [19][20][21][22]. Figure 1 shows the research trend in the literature on biofuels. Note that research in this realm has stopped progressing since 2015.
... because the production of the desired product needs an effective use of experimental models [23]. These methods provide a modeling approach that is independent of the nature of the process or its mathematical models, and they are able to model the process with high accuracy [9,11,24,25].
The primary purpose of this study is to present a review of a specific field, to identify the strengths and weaknesses of the field, and to provide a complete background. The main aim is to evaluate the ML and DL techniques developed for handling biofuels production, consumption, and environmental impacts, both for modeling and optimization purposes. The study first explains and defines different biofuels. It then provides a general survey of the characteristics and basis of the developed studies. Next, it explains the state of the art of the DL and ML techniques employed in the field. Finally, it summarizes the results and achievements and discusses the strengths and weaknesses of the different DL and ML techniques.

ML and DL methods in biofuels research
The application of ML and DL methods in various scientific and engineering domains has been previously investigated [26][27][28][29][30][31][32][33][34][35][36][37][38][39][40]. Generally, ML methods are reported to be advancing further through ensemble and hybrid techniques. DL methods, on the other hand, are still considered a new phenomenon and are progressing slowly.
In this section, the most popular ML and DL methods in biofuels research are identified and reviewed. During the past decade, the application of these intelligent algorithms in biofuels research has increased dramatically. Figure 2 represents the increasing demand for and popularity of DL and ML in handling biofuels. The use of DL and ML increased from 2010 until 2017 and has declined since then. The reason can be found in the overall decrease in the volume of literature in biofuels research. We classify the methods into three groups: neural network-based methods, single ML methods, and a separate group for deep learning, ensembles, and hybrid models.

Reference; contribution; method; keywords
[59]; To optimize the prediction of liquid-liquid equilibria, which is employed in the simulation of the biofuel process, by the use of a novel non-random two-liquid-ANN method; ANN; NRTL, biofuels
[9]; To develop different types of MLP networks for the estimation of enzyme function; MLP; enzyme function, machine learning
[60]; To develop the ANN method for the prediction of unmeasurable variables during hydrogen and methane production through the anaerobic digestion process; RNN; biofuels, RNN
[61]; To develop a comprehensive survey about the use of

Reynel-Ávila et al. [59] developed an innovative hybrid non-random two-liquid (NRTL)-ANN method to improve the estimation of liquid-liquid equilibria, which is used to simulate the biofuel process. NRTL is a thermodynamic method for multi-component systems. Hybridizing it with an ANN can therefore improve accuracy for regression and fitting purposes. The proposed method was evaluated using the root-mean-square deviation (RMSD), which measures the agreement between target and estimated values. Being flexible, the method coped successfully with the estimation task and increased the estimation accuracy.
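The RMSD criterion mentioned above can be illustrated with a minimal sketch. The function name and the sample values are hypothetical and are not taken from the reviewed study.

```python
import math

def rmsd(targets, estimates):
    """Root-mean-square deviation between paired target and estimated values."""
    if len(targets) != len(estimates) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - e) ** 2 for t, e in zip(targets, estimates)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical phase compositions (mole fractions), for illustration only.
measured = [0.10, 0.25, 0.40, 0.55]
predicted = [0.12, 0.24, 0.43, 0.50]
print(rmsd(measured, predicted))
```

A lower RMSD indicates closer agreement between the estimated and target values; a value of zero means the two series coincide.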
Concu et al. [9] employed different machine learning techniques to estimate protein function in a conversion process, as a type of enzyme to be considered in bioethanol production. The techniques were single methods comprising different MLP architectures, which differed in the number of neurons in the hidden layer. Results were evaluated using accuracy, sensitivity, and specificity. The proposed MLP method achieved acceptable accuracy as well as higher sustainability.
Camberos et al. [60] developed a recurrent neural network (RNN) method to estimate unmeasurable variables during hydrogen and methane production through the anaerobic digestion process. The motivation was the ability of recurrent ANNs to predict the behavior of unknown and sophisticated systems. It was a single method that accounted for external disturbances as well as parameter uncertainties. The results were evaluated using the mean square error. Based on the results, the proposed RNN method provided high performance on the complex system. The method also offered high sustainability, remaining stable in the presence of external disturbances.
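The accuracy, sensitivity, and specificity used to evaluate the MLP classifiers of Concu et al. [9] can be computed from the counts of a binary confusion matrix. The sketch below assumes binary labels; the function name and the example labels are hypothetical.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Accuracy, sensitivity, and specificity for a binary classifier."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive and pred == positive:
            tp += 1  # correctly detected positive
        elif truth != positive and pred != positive:
            tn += 1  # correctly rejected negative
        elif truth != positive and pred == positive:
            fp += 1  # false alarm
        else:
            fn += 1  # missed positive
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

# Hypothetical labels: 1 = enzyme, 0 = non-enzyme.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

Reporting all three values guards against a classifier that looks accurate only because one class dominates the data.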
Sewsynker-Sukai [61] conducted a comprehensive survey of the application of ANN, one of the most popular and widely applied machine learning methods, in the field of biofuels for optimization and estimation purposes. This study also presents a brief explanation

Further single ML methods for biofuels research
This section covers support vector machines (SVM), decision trees (DTs), regression trees (RTs), Bayesian methods, k-means, and k-nearest neighbors, presented in Table 3.
Feng et al. [63] developed non-destructive prediction methods for estimating the quality of biofuel pellets, using partial least-squares regression and a least-squares support vector machine (LSSVM) as non-destructive diagnosis methods combined with the successive projections algorithm (SPA). The performance of the methods was compared using the determination coefficient and root mean square error. Based on the results, the best method was SPA-LSSVM, a hybrid diagnosis method that combines the advantages of both LSSVM and SPA.
For the Sugeno-based fuzzy model of Faizollahzadeh et al. [64], one of the most critical factors was its lower processing time and its user-friendly application. These factors increase the sustainability of the method for use in future research.
Wong et al. [65] developed a novel hybrid sparse Bayesian-based extreme learning machine (ELM) technique for estimating the performance of an engine fuelled by biofuel as well as for the calibration of the ECU. The proposed method was compared with ELM, Bayesian ELM (BELM), and a back-propagation neural network (BPNN) in terms of mean absolute percentage error and standard deviation. The proposed hybrid method had acceptable accuracy in both training and testing steps compared with the ELM, BPNN, and BELM methods. It also performed better in the estimation of engine emissions.
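The mean absolute percentage error (MAPE) and standard deviation used to compare the ELM variants can be sketched as follows. As an assumption, the standard deviation is taken here as the spread of the per-sample percentage errors; the reviewed study may define it differently. Function names and data are hypothetical.

```python
import statistics

def absolute_percentage_errors(actual, predicted):
    """Per-sample absolute percentage errors, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return [abs((a - p) / a) * 100.0 for a, p in zip(actual, predicted)]

def mape_and_spread(actual, predicted):
    """Return (MAPE, sample standard deviation of the percentage errors)."""
    errors = absolute_percentage_errors(actual, predicted)
    mean_error = statistics.mean(errors)
    spread = statistics.stdev(errors) if len(errors) > 1 else 0.0
    return mean_error, spread

# Hypothetical engine torque readings (N m) and model predictions.
measured = [100.0, 120.0, 150.0, 180.0]
estimated = [98.0, 123.0, 147.0, 185.0]
print(mape_and_spread(measured, estimated))
```

Because MAPE is scale-free, it allows models to be compared across engine outputs with different units, at the cost of being undefined for zero-valued targets.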
Faizollahzadeh et al. [66] developed innovative hybrid ELM-RSM and EVM RSM methods for predicting biofuel production yield and for optimizing the production process to reach a higher yield. The developed methods were compared with SVM, ANN, and ANFIS methods in terms of performance factors for the prediction phase. Based on the results, the hybrid ELM-RSM method provided higher performance by increasing the production yield compared with the other methods. This study also indicates the strength of hybrid methods over single ones: the method combines the high prediction capability of ELM with the optimization capability of response surface methodology (RSM). Table 4 presents the comparison results of SVM-based methods for biofuels handling.
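The optimization half of such a hybrid can be sketched with a one-factor response surface. This is a simplified illustration, not the method of [66]: a quadratic is fitted directly to hypothetical yield data in place of the ELM predictor, and the single factor (temperature), its levels, and all function names are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(ata, aty)

def best_setting(coeffs, lower, upper):
    """Maximize the fitted quadratic over [lower, upper]."""
    b0, b1, b2 = coeffs
    predict = lambda x: b0 + b1 * x + b2 * x * x
    candidates = [lower, upper]
    if b2 != 0:
        stationary = -b1 / (2 * b2)
        if lower <= stationary <= upper:
            candidates.append(stationary)
    best = max(candidates, key=predict)
    return best, predict(best)

# Hypothetical reaction temperatures (deg C) and measured yields (%).
temps = [40.0, 50.0, 60.0, 70.0, 80.0]
yields = [70.8, 87.8, 94.8, 91.8, 78.8]

# RSM conventionally works in coded (centered, scaled) units.
center, half_step = 60.0, 10.0
coded = [(t - center) / half_step for t in temps]

coeffs = fit_quadratic(coded, yields)
best_coded, best_yield = best_setting(coeffs, min(coded), max(coded))
best_temp = center + half_step * best_coded
print(best_temp, best_yield)
```

In a hybrid scheme, the measured yields would be replaced or augmented by predictions from the trained ML model, so that the response surface can be explored without running every experiment.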

Deep learning, machine learning, ensembles, and hybrid models for biofuels research
In this section, the more sophisticated ML methods, in addition to DL, are presented. These include neuro-fuzzy models, various DL models, and ensemble ML methods, presented in Table 5.

Conclusion
This paper studies the applications and progress of ML and DL methods in biofuels research. It presents an in-depth survey and analysis of hybrid and ensemble models that integrate two or more techniques. The survey shows that single ML methods, except for ANNs, have not been popular. However, ensemble and hybrid models have emerged and continue to advance towards higher accuracy and better performance. DL techniques will also bring a tremendous amount of intelligence for better prediction models. In general, modeling, forecasting, and decision making about the future of biofuels help in developing sustainable energy resources, which are low-cost resources with low environmental impacts. ML and DL techniques have been successfully employed in many fields of science and have improved the processes involved. The various combinations of hybrid and ensemble methods are found to be the most effective in handling biofuels.

Fig. 1. The research trend in the literature on biofuels research (source: Web of Science)

Fig. 2. Demand and popularity of using DL and ML in biofuels research (source: Web of Science)

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 17 August 2019 | doi:10.20944/preprints201908.0179.v1

of the comparison of the performance of ANN with other methods and discusses the architectures of the developed ANN methods. Comparisons were performed using the coefficient of determination as the performance factor. Based on the results, developing ANN methods in this field provides high production performance while reducing time and cost. Reducing time and cost in biofuels production and consumption processes also increases the sustainability and reliability of the system. Therefore, ANN can be a useful tool for handling biofuels and for managing production and consumption processes for policymakers in future research.
Kessler et al. [25] presented a study to estimate the cetane number of biofuel samples in the presence of furanic additives. Results were evaluated using RMSE values. As a predictive method, ANN was successfully applied to predict cetane number with a low error.
Different applications of ANN tools in different fields of biofuels have been discussed above. However, metrics and criteria are needed to evaluate the performance of each method. Table 2 presents a brief comparison of the accuracy, reliability, and sustainability of methods developed for handling biofuels using different types of ANN methods. These factors have been compiled from different aspects reported in the reviewed studies.
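The coefficient of determination used above as the performance factor can be sketched as follows. The function name and the cetane-number values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and ANN-predicted cetane numbers.
measured_cn = [48.0, 52.0, 55.0, 60.0, 63.0]
predicted_cn = [47.5, 52.8, 54.1, 60.9, 62.2]
print(r_squared(measured_cn, predicted_cn))
```

A value near 1 means the model explains most of the variance in the measurements, while a value of 0 means it does no better than predicting the mean.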

Mancini et al. [62] developed three methods, namely partial least squares discriminant analysis, SVM, and principal component analysis-linear discriminant analysis, for the classification of biofuels. Based on the results, all the methods coped successfully with the classification task, but SVM had the best classification performance.

Faizollahzadeh et al. [64] developed a Sugeno-based fuzzy method for predicting the cetane number of biodiesel fuel from carbon number, number of double bonds, saponification number, and iodine value. The performance of the developed model was calculated using the determination coefficient and root mean square error. The developed model had high accuracy in both training and testing steps.
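How a Sugeno-type fuzzy model turns inputs into a crisp prediction can be shown with a minimal zero-order sketch. It uses only two of the four inputs named above, and the membership functions, rule consequents, and function names are all hypothetical; they do not reproduce the model of [64].

```python
def triangular(x, left, peak, right):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def sugeno_predict(inputs, rules):
    """Zero-order Sugeno inference: firing-strength-weighted mean of rule outputs."""
    weighted_sum = 0.0
    total_strength = 0.0
    for antecedents, consequent in rules:
        # Combine the rule's input memberships with the product t-norm.
        strength = 1.0
        for name, params in antecedents.items():
            strength *= triangular(inputs[name], *params)
        weighted_sum += strength * consequent
        total_strength += strength
    if total_strength == 0.0:
        raise ValueError("no rule fires for these inputs")
    return weighted_sum / total_strength

# Hypothetical rule base: (memberships per input, constant cetane number).
rules = [
    ({"carbon_number": (14.0, 16.0, 18.0), "double_bonds": (-1.0, 0.0, 1.5)}, 65.0),
    ({"carbon_number": (16.0, 18.0, 20.0), "double_bonds": (0.5, 2.0, 3.5)}, 45.0),
]

sample = {"carbon_number": 17.0, "double_bonds": 1.0}
print(sugeno_predict(sample, rules))
```

Because the output is a weighted average of a few rule constants, inference is cheap to evaluate, which is consistent with the low processing time typically associated with this class of models.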

Table 1. Top studies developed by ANN-based methods in biofuels

Table 2. Comparison results of ANN-based methods for biofuels handling

Table 3. Top studies developed by SVM-based methods in biofuels

Table 4. Comparison results of SVM-based methods for biofuels handling

Table 5. Top studies developed by machine and deep learning-based methods in biofuels

...gan et al. [67] developed a novel multi-criteria decision-making system for choosing the best biodiesel fuel for a compression ignition engine in terms of engine performance and combustion characteristics. Based on the results, the hybrid Step-wise Weight Assessment Ratio Analysis (SWARA)-Multi-Objective Optimization by Ratio Analysis (MOORA) method and the hybrid Analytic Network Process (ANP)-MOORA method provided the best performance for choosing the best fuel sample.
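The ratio-analysis step of MOORA can be sketched as below. The weights are supplied directly here, whereas in the hybrid schemes described above they would come from SWARA or ANP. The fuel blends, criteria, and all numbers are hypothetical.

```python
import math

def moora_rank(matrix, weights, is_benefit):
    """Rank alternatives with the MOORA ratio system.

    matrix[i][j] is the score of alternative i on criterion j.
    Benefit criteria add to the score and cost criteria subtract from it.
    """
    n_criteria = len(weights)
    # Vector normalization: divide each column by its Euclidean norm.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)
    ]
    scores = []
    for row in matrix:
        score = 0.0
        for j in range(n_criteria):
            term = weights[j] * row[j] / norms[j]
            score += term if is_benefit[j] else -term
        scores.append(score)
    ranking = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return scores, ranking

# Hypothetical fuel blends scored on thermal efficiency (%, benefit),
# NOx emissions (ppm, cost), and smoke opacity (%, cost).
blends = ["B10", "B20", "B30"]
decision_matrix = [
    [30.0, 520.0, 42.0],
    [31.5, 560.0, 38.0],
    [29.0, 610.0, 35.0],
]
criterion_weights = [0.5, 0.3, 0.2]
benefit_flags = [True, False, False]

scores, ranking = moora_rank(decision_matrix, criterion_weights, benefit_flags)
print([blends[i] for i in ranking])
```

The normalization makes criteria with different units comparable before weighting, so the final ranking depends on the weights rather than on the raw scale of each measurement.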

Table 6. Comparison results of DL and ML-based methods for biofuels handling