Preprint Article | Version 1 | Preserved in Portico | This version is not peer-reviewed

Comparative Analysis of Machine Learning Classification Algorithms for predicting Olive Anthracnose Disease

Version 1: Received: 31 July 2023 / Approved: 1 August 2023 / Online: 2 August 2023 (05:36:07 CEST)

How to cite: Kottaridi, K.; Milionis, A.; Demopoulos, V.; Nikolaidis, V.; Tsalgatidou, P.C.; Tsafouros, A.; Kotsiras, A.; Vithoulkas, A. Comparative Analysis of Machine Learning Classification Algorithms for predicting Olive Anthracnose Disease. Preprints 2023, 2023080073. https://doi.org/10.20944/preprints202308.0073.v1

Abstract

Olive Anthracnose (OA) is the most important fungal disease of olive fruits worldwide. In the context of integrated pest management, predictive models could support early diagnosis and control. In the current study, a dataset of 58 cases (6 locations with 12 olive cultivars) was used to study the relationship between OA incidence (OAI) and 35 heterogeneous variables, including orchard characteristics, olive fruit parameters, foliar and soil nutrients, soil parameters and soil texture classes. The Random Forest-Recursive Feature Elimination with Cross Validation (RF-RFECV) feature selection method identified Location, Water Content, P, Ca, Mg, Exchangeable Mg, Trace Zn and Trace Cu as possible new indicators associated with OAI. Six classification algorithms, namely Decision Tree (DT), Gradient Boosting (GB), Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), were developed for predicting conditions leading to OAI > 0% and OAI > 10%. Model hyperparameters were tuned by grid search, and the best settings were then used to predict OAI. The final models were evaluated with several standard metrics: accuracy, sensitivity, specificity and ROC-AUC score. GB outperformed the other models in predicting the occurrence of OA disease (OAI > 0%), with an accuracy of 86.7%, a sensitivity of 100%, a specificity of 75% and a ROC-AUC score of 93%. For predicting the spread of the disease (OAI > 10%), DT stood out with an accuracy of 86.7%, a sensitivity of 81.8%, a specificity of 100% and a ROC-AUC score of 91%. The RF classifier performed well in both cases, with an accuracy of 80%, a sensitivity of 85.7%, a specificity of 75% and a ROC-AUC score of 93% for the occurrence of the disease (OAI > 0%), and an accuracy of 86.7%, a sensitivity of 90.9%, a specificity of 75% and a ROC-AUC score of 84% for the spread of the disease (OAI > 10%).
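
For readers less familiar with the evaluation metrics named in the abstract, the following minimal Python sketch shows how accuracy, sensitivity and specificity follow from the counts of a binary confusion matrix. The class `ConfusionCounts`, the function names and the example counts are hypothetical and are not taken from the study. The abstract does not report the test-set size or the confusion matrices; the counts below (15 test cases, 11 positive and 4 negative) are simply one assumed combination that would produce percentages like those quoted for DT at OAI > 10%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix on a held-out test set."""
    tp: int  # positive cases (e.g. OAI above threshold) predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive


def accuracy(c: ConfusionCounts) -> float:
    """Share of all test cases that were classified correctly."""
    total = c.tp + c.fn + c.tn + c.fp
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (c.tp + c.tn) / total


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: share of positive cases that were detected."""
    positives = c.tp + c.fn
    if positives == 0:
        raise ValueError("no positive cases in the test set")
    return c.tp / positives


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: share of negative cases correctly rejected."""
    negatives = c.tn + c.fp
    if negatives == 0:
        raise ValueError("no negative cases in the test set")
    return c.tn / negatives


# ASSUMPTION: illustrative counts only, not data reported by the authors.
example = ConfusionCounts(tp=9, fn=2, tn=4, fp=0)

print(f"accuracy    = {accuracy(example):.1%}")
print(f"sensitivity = {sensitivity(example):.1%}")
print(f"specificity = {specificity(example):.1%}")
```

With a test set this small, a single misclassified case shifts accuracy by several percentage points, so differences between models of one or two cases should be read with caution.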

Keywords

olive anthracnose; machine learning; forecast models; classification algorithms; soil nutrients

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
