ARTICLE | doi:10.20944/preprints201901.0287.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: on-farm precision experimentation; normalized difference vegetation index; data filtering; error correction
Online: 29 January 2019 (04:55:05 CET)
The objective of this work was to investigate the use of remotely sensed vegetation indices to improve the quality of yield maps. The method was applied to the yield data of twelve cornfields from the Data Intensive Farm Management project. The results revealed the need to time-shift the yield values by up to three seconds to better match the sensor readings with the geographic coordinates. The residuals of the yield prediction model were used to identify points with unlikely yield values for their location, as an alternative to traditional approaches based on local spatial statistics and without any assumption of spatial dependence or stationarity. The temporal and spatial distribution of the standardized coefficients for each experimental unit highlighted the presence of trends in the data. At least five of the twelve fields presented trends that could have been induced by data collection.
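The two ideas in this abstract, aligning yield readings with positions by a time shift and flagging points through model residuals, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one reading per second, a single NDVI predictor, an ordinary least-squares line as the yield prediction model, and a hypothetical cutoff of three standard deviations. The function names `best_lag` and `flag_outliers` are invented for this example.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_lag(ndvi, yield_vals, max_lag=3):
    """Return (lag, scores): the delay in samples that best aligns yield with NDVI.

    Assumption: both series are logged at the same rate (e.g. 1 Hz), so a lag
    of 2 means the yield sensor reports a location two samples late.
    """
    scores = {}
    for lag in range(max_lag + 1):
        x = ndvi[:len(ndvi) - lag]   # positions at time t
        y = yield_vals[lag:]         # yield logged at time t + lag
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get), scores


def flag_outliers(ndvi, yield_vals, threshold=3.0):
    """Flag points whose standardized residual from a linear fit exceeds threshold.

    Assumption: a one-variable least-squares line stands in for the yield
    prediction model; the threshold is illustrative only.
    """
    mx, my = mean(ndvi), mean(yield_vals)
    sxx = sum((a - mx) ** 2 for a in ndvi)
    slope = sum((a - mx) * (b - my) for a, b in zip(ndvi, yield_vals)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(ndvi, yield_vals)]
    sd = pstdev(residuals)
    if sd == 0:
        return [False] * len(residuals)
    return [abs(r) / sd > threshold for r in residuals]
```

Because the flag depends only on how far a point sits from the model prediction, no neighborhood or spatial weighting is needed, which is the property the abstract contrasts with local spatial statistics.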
ARTICLE | doi:10.20944/preprints202112.0138.v2
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Yield mapping; vegetation index; Stepwise; SR; Random Forest; KNN
Online: 9 December 2021 (15:39:34 CET)
The use of machine learning techniques to predict yield from remote sensing is here to stay, and on-farm studies aim to support rural producers in decision-making. Technology-equipped commercial fields in Mato Grosso, Brazil, were therefore monitored with satellite images to predict cotton yield using supervised learning techniques. The objective of this research was to identify how early in the growing season cotton yield can be predicted at the farm level, and which vegetation indices and machine learning algorithms predict it best. We took the following steps: 1) Yield was observed on 398 ha (3 fields), and eight vegetation indices (VIs) were calculated on five dates during the growing season. 2) Two scenarios were created to facilitate the analysis and interpretation of results: Scenario 1 used all data (8 indices on 5 dates = 40 inputs), and Scenario 2 used the best variable selected by stepwise regression (1 input). 3) To find the best algorithm, hyperparameter tuning, calibration and testing were performed, and prediction performance was evaluated. Scenario 1 had the best metrics in all fields studied, and the Multilayer Perceptron (MLP) and Random Forest (RF) algorithms performed best, with an adjusted R² of 47% and an RMSE of only 0.24 t ha⁻¹. However, this scenario requires all predictive inputs generated throughout the growing season (approx. 180 days). We therefore optimized the prediction by testing only the best VI in each field and found that, among the eight VIs, the Simple Ratio (SR) driven by the K-Nearest Neighbor (KNN) algorithm predicts yield with an RMSE of 0.26 and 0.28 t ha⁻¹ and a MAPE of 5.20%, anticipating the cotton yield with low error by ±143 days. This approach also demands less computation to generate the prediction than MLP or RF, which makes it usable as a technique to help predict cotton yield and saves time for planning, whether in marketing or in crop management strategies.
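The single-input scenario (one vegetation index fed to KNN, scored with RMSE and MAPE) can be sketched as follows. This is an illustrative sketch, not the study's pipeline: the Simple Ratio is taken in its usual form NIR / Red, the KNN regressor is a plain unweighted average of the k nearest training points, and `k=3` and all function names are assumptions made for this example.

```python
def simple_ratio(nir, red):
    """Simple Ratio vegetation index: near-infrared over red reflectance."""
    return nir / red


def knn_predict(train_x, train_y, query, k=3):
    """Predict yield for one SR value as the mean of its k nearest neighbors.

    Assumption: a single input feature and unweighted averaging.
    """
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(observed, predicted):
    """Root mean squared error, in the units of the yield (e.g. t/ha)."""
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n
```

A one-feature KNN needs no training beyond storing the calibration points, which is consistent with the abstract's remark that it demands less computation than MLP or RF.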