Submitted: 04 December 2024
Posted: 05 December 2024
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Materials
2.1.1. Study Area
2.1.2. Data
2.1.2.1. Remote Sensing Datasets
2.1.2.2. Elevation Data
2.1.2.3. In-Situ Measurement
2.2. Methods
2.2.1. Soil Moisture Retrieval
2.2.1.1. Feature Selection
2.2.1.2. Machine Learning Models
2.2.2. SHapley Additive exPlanations (SHAP) Method
2.2.3. Soil Moisture Heterogeneity Analysis
2.2.3.1. Overall Heterogeneity Assessment
2.2.3.2. Local Heterogeneity Assessment
2.2.4. Evaluation Metrics
3. Results
3.1. Feature Selection Analysis
3.2. Evaluation of Ensemble Learning Methods and Mapping of Soil Moisture Results at 30 m Resolution
3.3. SHAP Explanation of Feature Contributions to SM Prediction Results
3.4. Heterogeneity Analysis Results
4. Discussion
5. Conclusion
- (1) After reviewing multiple studies, we collected twenty-three Landsat-based indices that can characterize soil moisture, including indices for vegetation and water bodies. Among these, NDWI, SWCI, and VSWI fitted the in-situ measured soil moisture best, with R² values of 0.35, 0.41, and 0.42, respectively. These R² values are nevertheless considerably lower than those reported by other researchers, which we attribute to the high soil moisture heterogeneity in the study area. To address this issue, we added elevation and its derived features, such as slope and aspect, as additional inputs for the ensemble learning models. Before training the models, we applied the Boruta method for feature selection to confirm that every input feature contributed meaningfully and that no irrelevant features were included.
- (2) Among the four ensemble learning models, CatBoost performed the best. Its soil moisture predictions yielded an R² of 0.88, an RMSE of 0.0463 m³/m³, a bias of 0.0039 m³/m³, and an MSE of 0.0021 (m³/m³)² for the validation dataset, and an R² of 0.83, an RMSE of 0.0516 m³/m³, a bias of 0.0029 m³/m³, and an MSE of 0.0027 (m³/m³)² for the test dataset. The Random Forest (RF) and Extra Trees (ET) models fitted the data less well. Our results also indicate that Gradient Boosting Decision Trees (GBDT) significantly reduce overfitting compared to purely tree-based models. We used these models to generate spatial distribution maps of soil moisture in the study area, capturing detailed spatial variations.
- (3) The SHAP analysis indicates that elevation is the most important of the input features. SWCI, VSWI, slope, and aspect also contribute significantly to the soil moisture model, which is consistent with findings from numerous studies. Elevation has a negative effect on predicted soil moisture in lower-altitude regions and a positive effect in higher-altitude areas. In the central, drier part of the study area, the SHAP values of the main features (elevation, SWCI, VSWI, slope, and aspect) are negative.
- (4) In the overall analysis of soil moisture heterogeneity, QLB-NET exhibited a high coefficient of variation (CV) of 49.98%, indicating significant heterogeneity compared to typical soil moisture networks. In the local analysis, we used terrain complexity to represent local heterogeneity, which revealed a pattern of higher values in the north and lower values in the south, with the greatest local heterogeneity in the northern-central region. Furthermore, sites with higher heterogeneity displayed wider confidence intervals, reflecting greater uncertainty and acting as a source of model prediction error. This was corroborated by predicting the test set with the ensemble learning models after excluding high-heterogeneity data points.
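
The evaluation metrics in conclusion (2) and the coefficient of variation in conclusion (4) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: R² is taken as the coefficient of determination, bias as the mean of predicted minus observed values, and CV as the sample standard deviation divided by the mean. The function names and the sample values are hypothetical and serve only as an illustration.

```python
import numpy as np


def evaluation_metrics(observed, predicted):
    """Return R², RMSE, bias, and MSE for paired observations and predictions.

    Assumptions (not confirmed by the paper): R² is the coefficient of
    determination and bias is mean(predicted - observed).
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    residuals = pred - obs
    mse = float(np.mean(residuals ** 2))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": mse ** 0.5,
        "bias": float(np.mean(residuals)),
        "mse": mse,
    }


def coefficient_of_variation(values):
    """Return the CV in percent (assumed: sample standard deviation / mean)."""
    vals = np.asarray(values, dtype=float)
    return float(np.std(vals, ddof=1) / np.mean(vals) * 100.0)


# Hypothetical volumetric soil moisture values (m³/m³), for illustration only.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

metrics = evaluation_metrics(observed, predicted)
cv_percent = coefficient_of_variation(observed)
print(metrics, cv_percent)
```

Under these definitions the MSE is the square of the RMSE, which matches the reported values (0.0463² ≈ 0.0021 and 0.0516² ≈ 0.0027) and is why the MSE carries units of (m³/m³)² rather than m³/m³.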
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Performance metric | Soil conductivity | Soil moisture (volumetric water content) | Soil temperature |
|---|---|---|---|
| Range | 0–8 dS/m | 0–100% | −50 to 70 °C |
| Precision | ±(5% of the value + 0.05 dS/m) | ±3% | ±0.02 |
| Accuracy | 0.5% | <0.05% | ±0.5 °C |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
