Preprint Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Generalizability of a Random Forest-Based Model of Maize Lodging Built with Satellite Image Data and Its Application to Monitoring and Evaluating Maize Lodging Risks

Version 1 : Received: 29 May 2023 / Approved: 2 June 2023 / Online: 2 June 2023 (04:08:42 CEST)

How to cite: Guo, H.; Ming, B.; Nie, C.; Zhang, G.; Yang, H.; Gao, S.; Xue, B.; Xin, J.; Feng, D.; Jia, B.; Hou, P.; Xue, J.; Xie, R.; Wang, K.; Li, S. Generalizability of a Random Forest-Based Model of Maize Lodging Built with Satellite Image Data and Its Application to Monitoring and Evaluating Maize Lodging Risks. Preprints 2023, 2023060123. https://doi.org/10.20944/preprints202306.0123.v1

Abstract

Lodging is a common problem in maize production that seriously reduces yield, quality, and suitability for mechanical harvesting. Evaluating site-specific lodging risk requires a method for multi-year monitoring. In this study, spectral images collected by the Sentinel-2 satellite were processed to obtain three types of data: gray-level co-occurrence matrix (GLCM) texture, vegetation indices (VIs), and spectral reflectance (SR). Lodging classification models were then built with Random Forest (RF) using each of the three data types separately (the GLCM, VI, and SR models) and in combination (the SR+VI, SR+GLCM, VI+GLCM, and SR+VI+GLCM models). Features with low importance scores were removed step by step from the SR+VI+GLCM model, and the resulting changes in overall accuracy (OA) were used to identify the optimal set of predictive variables, from which the optimal model was constructed. To assess interannual generalizability, a model built using data from a single timepoint in 2021 was tested on data collected at a similar timepoint in 2019, and vice versa. Among the models constructed with a single feature type, the GLCM model had significantly lower accuracy than the VI and SR models. During certain growth stages, models constructed with combined features monitored maize lodging significantly more accurately than models constructed with a single feature type. During selection of the optimal predictive variables, model accuracy did not increase as the number of predictive variables increased. The positive and negative validation models had accuracies of 96.55% and 95.18%, with kappa values of 0.93 and 0.83, respectively. This indicates that the model generalizes well between years for the same reproductive stage.
This study provides a detailed method for large-scale maize lodging monitoring, which allows identification of optimal planting practices to reduce the probability of lodging and ultimately improve regional maize yield and quality.
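The abstract reports model quality as overall accuracy (OA) and the kappa coefficient. The following is a minimal sketch of how these two metrics are commonly computed from a confusion matrix. It assumes the kappa reported here is Cohen's kappa; the function names and the confusion matrix values are hypothetical illustrations and are not taken from the study.

```python
from typing import List

Matrix = List[List[int]]  # rows: reference class, columns: predicted class


def overall_accuracy(matrix: Matrix) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix: Matrix) -> float:
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement expected from the marginal class frequencies.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical two-class example (lodged vs. non-lodged), NOT data from the study.
example = [
    [45, 5],   # reference: lodged
    [5, 45],   # reference: non-lodged
]
oa = overall_accuracy(example)
kappa = cohen_kappa(example)
print(f"OA = {oa:.2f}, kappa = {kappa:.2f}")
```

Because kappa discounts the agreement expected by chance, it depends on the class balance of the test set as well as on OA, so two validations with similar OA can have noticeably different kappa values.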

Keywords

Sentinel-2 multispectral data; Maize lodging; Random Forest classification; Predictive variables; Model generalizability

Subject

Biology and Life Sciences, Agricultural Science and Agronomy
