Preprint | Article | Version 1 | Preserved in Portico | This version is not peer-reviewed

Ensemble Machine Learning Outperforms Empirical Equations for the Ground Heat Flux Estimation with Remote Sensing Data

Version 1 : Received: 16 February 2022 / Approved: 17 February 2022 / Online: 17 February 2022 (04:33:43 CET)

A peer-reviewed article of this Preprint also exists.

Bonsoms, J.; Boulet, G. Ensemble Machine Learning Outperforms Empirical Equations for the Ground Heat Flux Estimation with Remote Sensing Data. Remote Sens. 2022, 14, 1788.

Abstract

Estimating evapotranspiration at field scale is a major component of sustainable water management. Because some major unknowns of the water cycle, including irrigation amounts, are difficult to assess at that scale, evapotranspiration is often computed as the residual of the instantaneous surface energy budget. One of the surface energy balance components with the largest uncertainty over bare soils and sparse vegetation areas is the ground heat flux (G). Over the last decades, the estimation of G with remote sensing (RS) data has mainly relied on empirical equations based on the ratio of G to net radiation (Rn), G/Rn. G/Rn empirical equations generally require either vegetation data alone (Type I empirical equations) or vegetation data in combination with surface temperature (Ts) and albedo (Type II empirical equations). In this article we aim to evaluate the estimation of G with RS data. For the first time, we compare eight G/Rn empirical equations against two types of machine learning (ML) methods: an ensemble ML method, the Random Forest (RF), and Neural Networks (NN). Each method is evaluated over a dense dataset covering a wide range of climates and land covers, with data from eddy-covariance towers distributed across the mid-latitude area encompassing the European and African continents. Our results show that the main driver of G in bare soils and sparse vegetation areas (Fraction of Vegetation, Fv <= 0.25) is Ts, rather than vegetation greenness indices. In contrast, the accuracy of G estimates based on Rn, Ts or Fv decreases in dense vegetation areas (Fv >= 0.50). There are no significant differences between the most accurate Type I and Type II empirical equations. For bare soils and sparse vegetation areas, the empirical equation that best estimates G is E8, which combines the Leaf Area Index (LAI) and Ts. In dense vegetation areas (Fv >= 0.25), an exponential empirical equation based on Fv (E4) shows the best performance.
However, ML estimates G better than the empirical equations, regardless of the Fv range. An RF model with Rn, LAI and Ts as predictor variables shows the best accuracy and performance metrics, outperforming the NN model.
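
To make the G/Rn approach and the residual energy budget more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a Type I-style ratio that blends linearly between a full-canopy and a bare-soil G/Rn value according to Fv. The default coefficients 0.05 and 0.315 are the values commonly quoted for this kind of linear formulation and are used here only as illustrative assumptions; they are not claimed to match any of the equations E1 to E8. All function and parameter names are hypothetical.

```python
def ratio_type_i(fv, gamma_canopy=0.05, gamma_soil=0.315):
    """Return G/Rn as a linear blend between canopy and bare-soil ratios.

    fv is the fraction of vegetation in [0, 1]. The two gamma defaults are
    illustrative assumptions, not coefficients taken from the article.
    """
    if not 0.0 <= fv <= 1.0:
        raise ValueError("fv must lie in [0, 1]")
    return gamma_canopy + (1.0 - fv) * (gamma_soil - gamma_canopy)


def ground_heat_flux(rn, fv):
    """Ground heat flux G (W m-2) from net radiation Rn (W m-2) and Fv."""
    return rn * ratio_type_i(fv)


def latent_heat_residual(rn, g, h):
    """Latent heat flux LE (W m-2) as the energy-budget residual Rn - G - H."""
    return rn - g - h


if __name__ == "__main__":
    # Hypothetical midday fluxes in W m-2, chosen only for illustration.
    rn, h = 500.0, 150.0
    for fv in (0.0, 0.25, 1.0):
        g = ground_heat_flux(rn, fv)
        le = latent_heat_residual(rn, g, h)
        print(f"Fv={fv:.2f}  G={g:.1f}  LE={le:.1f}")
```

Because LE is obtained as a residual, any error in G propagates directly into the evapotranspiration estimate, which is why the choice of G model matters most over bare soils and sparse vegetation, where G/Rn is largest.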

Keywords

ground heat flux; machine learning; remote sensing; surface energy balance

Subject

Environmental and Earth Sciences, Environmental Science
