Submitted:
12 March 2025
Posted:
13 March 2025
Abstract
Keywords:
1. Introduction
- we make our self-curated, disease-annotated IoT environmental data, spanning 2020 to May 2024, publicly available for use in other studies,
- we perform a comparative analysis of different tabular data classifiers,
- we show the benefit of using the state-of-the-art TabPFN-Transformer on this kind of tabular data compared with other predictors,
- we provide a workflow for the early prediction of pathogens and the prevention of disease development, which can operate in real-time conditions since TabPFN-Transformer yields its output in less than a second.
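The comparison and real-time workflow listed in the contributions above can be sketched as a small harness that fits several classifiers on the same training split, scores them on a held-out split, and times the prediction step. This is a minimal, dependency-free sketch and not the authors' pipeline: the `MajorityClassifier` baseline, the pure-Python `KNNClassifier`, the `compare` helper, and the toy temperature/humidity readings are all hypothetical stand-ins chosen for illustration.

```python
import math
import time
from collections import Counter


class MajorityClassifier:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class KNNClassifier:
    """Plain k-nearest-neighbours with Euclidean distance and majority vote."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            # Sort training points by distance to the query and keep the k closest.
            nearest = sorted(
                zip(self.X_, self.y_), key=lambda pair: math.dist(row, pair[0])
            )[: self.k]
            votes = Counter(label for _, label in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


def compare(models, X_train, y_train, X_test, y_test):
    """Fit each model, then report held-out accuracy and prediction latency."""
    report = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        start = time.perf_counter()
        preds = model.predict(X_test)
        latency = time.perf_counter() - start
        accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
        report[name] = {"accuracy": accuracy, "latency_s": latency}
    return report


# Toy (temperature in degrees C, relative humidity in %) readings,
# invented for illustration only; "Yes" marks a disease-annotated reading.
X_train = [(18, 90), (20, 85), (22, 88), (30, 40), (32, 35), (28, 45), (31, 50)]
y_train = ["Yes", "Yes", "Yes", "No", "No", "No", "No"]
X_test = [(19, 87), (29, 42)]
y_test = ["Yes", "No"]

results = compare(
    {"Majority": MajorityClassifier(), "KNN (k=3)": KNNClassifier(k=3)},
    X_train, y_train, X_test, y_test,
)
for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, latency={scores['latency_s']:.6f}s")
```

Because every model exposes the same `fit`/`predict` interface, any other classifier with that interface could be added to the dictionary, and the recorded latency gives a simple check against a real-time budget such as the sub-second figure cited for TabPFN-Transformer.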
2. Impact of Downy Mildew and Powdery Mildew Diseases on Grapevines
2.1. Downy Mildew (Plasmopara viticola)
2.2. Powdery Mildew (Erysiphe necator)
3. Effects of Environmental Conditions on Enabling Grapevine Diseases
4. Materials and Methods

4.1. Data Description
4.1.1. Grapevine Field Description
4.1.2. IoT Environmental Data Acquisition & Labeling
4.2. Data Augmentation & Preprocessing
4.2.1. Gaussian Copula Based Synthetic Data Generation
4.2.2. Additive Gaussian Noise Based Synthetic Data Generation
4.2.3. Data Normalization (Standardization)
4.2.4. Data Balancing
4.3. Grapevine Disease Prediction Using ML
4.3.1. Logistic Regression
4.3.2. KNN
4.3.3. Support Vector Machine (SVM)
4.3.4. Random Forest
4.3.5. GradientBoosting
4.3.6. XGBoost
4.3.7. CatBoost
4.3.8. TabPFN Transformer
4.4. Evaluation Metrics
5. Results
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| KNN | K-Nearest Neighbours |
| SVM | Support Vector Machine |
| TabPFN | Tabular Prior-Data Fitted Network |
| Classifier | Accuracy | ROC-AUC | Precision (No) | Precision (Yes) | Recall (No) | Recall (Yes) | F1-Score (No) | F1-Score (Yes) |
| Logistic Regression | 0.6125 | 0.6062 | 0.8058 | 0.3637 | 0.6299 | 0.5942 | 0.7017 | 0.4628 |
| KNN | 0.7849 | 0.7632 | 0.8861 | 0.5902 | 0.8099 | 0.7162 | 0.8464 | 0.6509 |
| SVM | 0.6987 | 0.6788 | 0.8351 | 0.4675 | 0.7339 | 0.6251 | 0.7807 | 0.5285 |
| Random Forest | 0.8576 | 0.8245 | 0.9082 | 0.7361 | 0.8959 | 0.7530 | 0.9021 | 0.7394 |
| GradientBoosting | 0.8742 | 0.7969 | 0.8816 | 0.8437 | 0.9585 | 0.6352 | 0.9184 | 0.7248 |
| XGBoost | 0.8379 | 0.7648 | 0.8645 | 0.7428 | 0.9231 | 0.6153 | 0.8977 | 0.6973 |
| CatBoost | 0.8642 | 0.8112 | 0.8846 | 0.7942 | 0.9365 | 0.6785 | 0.9098 | 0.7249 |
| TabPFN Transformer | 0.9669 | 0.9461 | 0.9648 | 0.9733 | 0.9801 | 0.9014 | 0.9773 | 0.9359 |
| Classifier | Accuracy | ROC-AUC | Precision (No) | Precision (Yes) | Recall (No) | Recall (Yes) | F1-Score (No) | F1-Score (Yes) |
| Logistic Regression | 0.5767 | 0.5917 | 0.8047 | 0.3375 | 0.5665 | 0.6162 | 0.6652 | 0.4462 |
| KNN | 0.7731 | 0.7674 | 0.8990 | 0.5509 | 0.7791 | 0.7558 | 0.8348 | 0.6376 |
| SVM | 0.5736 | 0.6283 | 0.8482 | 0.3632 | 0.5226 | 0.7441 | 0.6391 | 0.4892 |
| Random Forest | 0.8190 | 0.7838 | 0.8917 | 0.6423 | 0.8583 | 0.7094 | 0.8747 | 0.6742 |
| GradientBoosting | 0.7147 | 0.6383 | 0.8101 | 0.4606 | 0.8000 | 0.4868 | 0.8050 | 0.4785 |
| XGBoost | 0.8220 | 0.7374 | 0.8527 | 0.7059 | 0.9165 | 0.5681 | 0.8835 | 0.6333 |
| CatBoost | 0.8344 | 0.7345 | 0.8690 | 0.7162 | 0.9125 | 0.6283 | 0.8902 | 0.6725 |
| TabPFN Transformer | 0.9202 | 0.8713 | 0.9212 | 0.9166 | 0.9740 | 0.7973 | 0.9473 | 0.8653 |
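The per-class Precision, Recall, and F1-Score columns in the tables above follow the usual one-vs-rest definitions, computed once with "No" and once with "Yes" treated as the positive class. The sketch below illustrates those standard formulas only; the function names `per_class_metrics` and `accuracy` and the example labels are hypothetical and are not taken from the study's data or code.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F1) treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["No", "No", "No", "No", "Yes", "Yes", "Yes", "No"]
y_pred = ["No", "No", "Yes", "No", "Yes", "No", "Yes", "No"]

print("accuracy", accuracy(y_true, y_pred))
for label in ("No", "Yes"):
    p, r, f = per_class_metrics(y_true, y_pred, positive=label)
    print(label, round(p, 4), round(r, 4), round(f, 4))
```

Reporting both classes separately matters when the classes are imbalanced: a classifier can reach high accuracy and high "No" scores while still missing many "Yes" (diseased) cases, which shows up as low Recall (Yes).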
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).