Submitted: 18 July 2024
Posted: 19 July 2024
Abstract
Keywords:
1. Introduction
2. Literature Survey
3. Objectives of the Present Research
- a) Develop a hybrid machine-learning technique that yields optimal results.
- b) Study crop disease detection across different climates in order to build models tailored to a given area.
- c) Report a comprehensive set of evaluation metrics: accuracy, precision, recall, F1-score, and ROC-AUC.
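The metrics named in objective c) can be computed directly from the true labels and the model output. The following is a minimal sketch for the binary case, assuming labels of 1 (diseased) and 0 (healthy); the function names are illustrative and do not come from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Accuracy, precision, recall, and F1 take hard class predictions, while ROC-AUC takes a continuous score such as a predicted probability of disease.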
4. Data and Methodology
4.1. Tomato Crop Location
4.2. Data Collection- Arduino Microcontroller
4.3. Calculation of Kendall’s Correlation (τ)
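As an illustration of the rank correlation named in this heading, the sketch below computes Kendall's τ by counting concordant and discordant pairs. It assumes the tie-corrected τ-b variant, which suits columns with many repeated values such as the 1 to 3 NPK level; the source does not state which variant was used, and the function name is illustrative.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)).

    C and D count concordant and discordant pairs; Tx and Ty count pairs
    tied only in x and only in y. Pairs tied in both are ignored.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when a variable is constant")
    return (concordant - discordant) / denom


# NPK level and Diseased columns from the ten sample records in the data table.
npk_level = [2, 1, 3, 2, 1, 3, 2, 1, 3, 2]
diseased = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
tau_npk_disease = kendall_tau_b(npk_level, diseased)
```

The pairwise loop is quadratic in the number of records, which is acceptable for a sketch but would be replaced by a library routine on a full sensor log.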
4.4. Bayesian Optimization with the KNN Algorithm
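To make the tuning target concrete, the sketch below implements a plain KNN classifier and the objective that a hyperparameter search would maximize, here leave-one-out accuracy as a function of k. It is not the study's implementation: the Bayesian optimizer is deliberately swapped for an exhaustive scan over a short list of candidate k values, and all function names are hypothetical.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def knn_predict(train_x: Sequence[Point], train_y: Sequence[int], query: Point, k: int) -> int:
    """Majority vote among the k training points closest to the query (Euclidean)."""
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def loo_accuracy(xs: List[Point], ys: List[int], k: int) -> float:
    """Objective for tuning k: accuracy when each point is predicted from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += int(knn_predict(rest_x, rest_y, xs[i], k) == ys[i])
    return hits / len(xs)


def best_k(xs: List[Point], ys: List[int], candidates: Sequence[int]) -> int:
    """Exhaustive stand-in for the optimizer: evaluate every candidate and keep the best."""
    return max(candidates, key=lambda k: loo_accuracy(xs, ys, k))
```

A Bayesian optimizer would call the same objective but choose which k (and any other hyperparameter, such as the distance metric) to try next from a surrogate model of earlier results, which matters when each evaluation is expensive. Features on different scales, such as pH and humidity, would normally be standardized before computing distances.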
4.5. Data Preparation and Plant Health Classification
- a) Timestamp: Each record is annotated with a date and time stamp. The format is DD-MM-YYYY for dates and HH:MM for times, which suggests automated logging at one-minute intervals.
- b) NPK Level: Indicates how much nitrogen (N), phosphorus (P), and potassium (K) the soil contains. It is encoded as an integer from 1 to 3, denoting low, medium, or high nutrient levels.
- e) Temperature (°C): The ambient temperature in degrees Celsius, ranging from 21°C to 24°C. This narrow range reflects consistent weather during data collection.
- f) Humidity (%): Measures air moisture, ranging from 45% to 50%. Paddy fields are usually moderately humid.
- g) pH value: The recorded acidity or alkalinity of the soil.
- h) Diseased: The binary target variable for the machine learning models, where '1' represents the presence of disease and '0' signifies a healthy state.
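The record layout described above can be expressed as a small typed structure. The sketch below parses one row of string fields into such a record and checks the two coded columns; the class and function names are illustrative assumptions, not part of the study.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Sequence


@dataclass(frozen=True)
class SensorRecord:
    timestamp: datetime
    npk_level: int        # 1 = low, 2 = medium, 3 = high
    temperature_c: float
    humidity_pct: float
    ph: float
    diseased: int         # 1 = diseased, 0 = healthy


def parse_record(fields: Sequence[str]) -> SensorRecord:
    """Convert six string fields (in table column order) into a SensorRecord."""
    ts, npk, temp, hum, ph, diseased = (f.strip() for f in fields)
    record = SensorRecord(
        timestamp=datetime.strptime(ts, "%d-%m-%Y %H:%M"),  # DD-MM-YYYY HH:MM
        npk_level=int(npk),
        temperature_c=float(temp),
        humidity_pct=float(hum),
        ph=float(ph),
        diseased=int(diseased),
    )
    if record.npk_level not in (1, 2, 3):
        raise ValueError(f"NPK level must be 1, 2 or 3, got {record.npk_level}")
    if record.diseased not in (0, 1):
        raise ValueError(f"Diseased must be 0 or 1, got {record.diseased}")
    return record


# First two rows of the labelled sample table.
first = parse_record(["27-11-2023 10:00", "2", "22", "45", "6.7", "1"])
second = parse_record(["27-11-2023 10:01", "1", "23", "47", "5.6", "0"])
logging_interval_s = (second.timestamp - first.timestamp).total_seconds()
```

Parsing the timestamp into a `datetime` makes the one-minute logging interval directly checkable as a difference between consecutive records.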
5. Results and Discussion
- a) Data Collection Phase: The first step is to collect environmental data with an IoT device. The data must be accurate and complete because they form the basis for all later analysis and decisions.
- b) Data Processing and Evaluation: Once the data are confirmed to be accurate and complete, they are processed so that we can assess soil health. This step is key because it turns raw readings into useful information about soil health, which good crop management depends on.
- c) Data Analysis and ML Implementation: The processed data are then analysed to see whether they meet our standards. If they do, we move on to machine learning (ML) algorithms. In this stage, we compare different ML algorithms to find the best one for improving soil health.
- d) Model Assessment and Adjustment: After implementing the ML algorithms, we check how well the model performs. If it does not meet our standards, we adjust or retrain the model. If it does, we analyse its performance against the set metrics.
- e) Operational Implementation and Continuous Evaluation: Once the model shows that it can improve crop management, it is deployed in real-world settings. This step is critical for seeing how well the technology works in practice.
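The decision points in steps a) to e) can be sketched as a single function that returns the next action. The required field names, the action labels, and the metric floors below are hypothetical placeholders; the source does not specify the standards it applies.

```python
from typing import Dict, Optional, Sequence

# Hypothetical field names mirroring the columns of the data table.
REQUIRED_FIELDS = ("timestamp", "npk_level", "temperature_c", "humidity_pct", "ph")


def is_complete(record: Dict[str, Optional[object]]) -> bool:
    """A record is complete when every required field is present and not None."""
    return all(record.get(name) is not None for name in REQUIRED_FIELDS)


def next_action(
    records: Sequence[Dict[str, Optional[object]]],
    metrics: Dict[str, float],
    min_scores: Dict[str, float],
) -> str:
    """Return 'recollect', 'retrain' or 'deploy' following the workflow steps."""
    # Steps a) and b): incomplete data sends the workflow back to collection.
    if not records or not all(is_complete(r) for r in records):
        return "recollect"
    # Step d): any metric below its floor triggers adjustment or retraining.
    if any(metrics.get(name, 0.0) < floor for name, floor in min_scores.items()):
        return "retrain"
    # Step e): the model is cleared for use in the field.
    return "deploy"
```

For example, `next_action(records, {"accuracy": 0.95, "recall": 0.80}, {"accuracy": 0.90, "recall": 0.90})` returns `"retrain"` for complete records, because recall is below its floor.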
6. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Acknowledgments
References
- Blanchard, D. A colour atlas of tomato disease; Wolfe Pub. Ltd., Brook House, London, 1992; pp. 298.
- Arup Chattopadhyay; Asit Kumar Mandal; Praveen Kumar Maurya; Subrata Duttal. Effective Management of Major Tomato Diseases in the Gangetic Plains of Eastern India through Integrated Approach. Agricultural Research & Technology: Open Access Journal 2017, 10.
- Ubalanka, V.; Jose, A.; Viswanath, D. Machine Learning Strategies for Predicting Crop Diseases. Journal of Physics: Conference Series 2021, 1850, 012119.
- Ahmed, I.; Habib, G.; Yadav, P.K. An Approach to Identify and Classify Agricultural Crop Diseases Using Machine Learning and Deep Learning Techniques. In 2023 International Conference on Emerging Smart Computing and Informatics (ESCI), 2023; pp. 1–6.
- Sandeep K, H.; Rakesh B, S. Prediction of Disease in Tomato Leaves with use of Machine Learning Technique. International Journal of Advanced Research in Science, Communication and Technology 2023, 251–256.
- Balu, V.; P, S. Wearable Multi-Sensor Data Fusion Approach for Human Activity Recognition Using Machine Learning Algorithms. SSRN Electronic Journal 2023.
- Spiga, O.; Cicaloni, V.; Fiorini, C.; Trezza, A.; Visibelli, A.; Millucci, L.; Santucci, A. Machine learning application for development of a data-driven predictive model able to investigate quality of life scores in a rare disease. Orphanet Journal of Rare Diseases 2020, 15.
- CABI; EPPO. Tomato leaf curl New Delhi virus. Distribution Maps of Plant Diseases 2016.
- Blesslin Sheeba, T.; Anand, L.; D. Vijay; Manohar; Gunaselvi Selvan; Saravana Wilfred, C.; Bazil Muthukumar, K.; Padmavathy, S.; Ramesh Kumar, P.; Asfaw Belete Tessema. Machine Learning Algorithm for Soil Analysis and Classification of Micronutrients in IoT-Enabled Automated Farms. Journal of Nanomaterials 2022, 5343965, 7 pages.
- Senapaty, M.K.; Ray, A.; Padhy, N. IoT-Enabled Soil Nutrient Analysis and Crop Recommendation Model for Precision Agriculture. Computers 2023, 12, 61.
- Kitila, C.; Olana, G. Influence of farmyard manure and NPS fertilizer on Hot Pepper (Capsicum annuum L.) growth and yield variables at Western Ethiopia. Plant Science Today 2024, 11, 397–404.
- Zhu, L.; Liao, Q.; Wang, Z.; Chen, J.; Chen, Z.; Bian, Q.; Zhang, Q. Prediction of Soil Shear Strength Parameters Using Combined Data and Different Machine Learning Models. Applied Sciences 2022, 12, 5100.
- Medvedkova, S.O. Relationship of melatonin and serotonin levels with clinical neurological data in patients with cerebral ischemic hemispheric stroke during the early recovery stage of disease. Zaporozhye Medical Journal 2017.
- Rajyaguru, D.J.; Borgert, A.J.; Halfdanarson, T.R.; Truty, M.J.; Kurup, A.N.; Go, R.S. Reply to E.L. Pollom et al, N. Ohri et al, A. Fiorentino et al, D.R. Wahl et al, N. Kim et al, J. Boda-Heggemann et al, S. Rana et al, N. Sanuki et al, J.R. Olsen et al, G.L. Smith et al, and A. Shinde et al. Journal of Clinical Oncology 2018, 36, 2567–2569.
- Aljumaily, A.; Kashmolaa, A. Building predictive models to assess degradation of soil organic matter over time using remote sensing data. Mesopotamia Journal of Agriculture 2022, 50, 19–27.
- Chandra, R. Role of trace elements for health promotion and disease prevention. Nutrition Research 2003, 23, 23–1745.
- Challet, D.; Ragel, V. Recurrent Neural Networks With More Flexible Memory: Better Predictions Than Rough Volatility. SSRN Electronic Journal 2023.
- Ansari Arshiya S.; Jawarneh; Malik; Ritong; Mahyudin Jamwal; Pragti Mohammadi; Mohammad Sajid; Veluri Ravi Kishore Kumar; Virendra Shah; Mohd Asif. Improved Support Vector Machine and Image Processing Enabled Methodology for Detection and Classification of Grape Leaf Disease. Journal of Food Quality 2022, 9502475, 6 pages.
- Aravind Reddy, Y.; Adimoolam, M. Efficient plant leaf disease detection using support vector machine algorithm and compare its features with Naive Bayes classification. AIP Conf. Proc. 2024, 2729, 060015.
- G. Shobana; K. Vignesh; S. Sree Dharshan. Plant Disease Detection Using Deep Neural Network. In 2023 2nd International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), Coimbatore, India, 2023; pp. 1–6.
- Attallah, O. Tomato Leaf Disease Classification via Compact Convolutional Neural Networks with Transfer Learning and Feature Selection. Horticulturae 2023, 9, 149.
- J. Garcia Arnal Barbedo et al. Annotated Plant Pathology Databases for Image-Based Detection and Recognition of Diseases. IEEE Latin America Transactions 2018, 16, 1749–1757.
- Ge, J.; Zhao, L.; Yu, Z.; Liu, H.; Zhang, L.; Gong, X.; Sun, H. Prediction of Greenhouse Tomato Crop Evapotranspiration Using XGBoost Machine Learning Model. Plants 2022, 11.
- Maimaitijiang, M.; Sagan, V.; Sidike, P.; Daloye, A.M.; Erkbol, H.; Fritschi, F. Crop Monitoring Using Satellite/UAV Data Fusion and Machine Learning. Remote Sens. 2020, 12, 1357.
- Alzahrani, M.S.; Alsaade, F.W. Transform and Deep Learning Algorithms for the Early Detection and Recognition of Tomato Leaf Disease. Agronomy 2023, 13, 1184.
- Trivedi, N.K.; Gautam, V.; Anand, A.; Aljahdali, H.M.; Villar, S.G.; Anand, D.; Goyal, N.; Kadry, S. Early Detection and Classification of Tomato Leaf Disease Using High-Performance Deep Neural Network. Sensors 2021, 21, 7987.
- Newlands, N.K. Model-Based Forecasting of Agricultural Crop Disease Risk at the Regional Scale, Integrating Airborne Inoculum, Environmental, and Satellite-Based Monitoring Data. Frontiers in Environmental Science 2018.
- Deshannavar, U. High dimensional weather data used in a deep generative model to predict trajectories of aircraft. Journal of Airline Operations and Aviation Management 2022, 1, 80–88.
- H.S. Sridhar; N.S.M.P. Latha Devi; G. Uma; Auromeet Saha; P.S. Brahmanandam; K. Raghavendra Kumar. First-Time Observations of Fine Particle Matter (PM2.5) at a Rural Site in South India: A Case Study. PES journal 2024.
- Luo, D.; Wen, X.; Xu, J. All-Sky Soil Moisture Estimation over Agriculture Areas from the Full Polarimetric SAR GF-3 Data. Sustainability 2022, 14, 10866.
- Wang, B.; Qiu, W.; Hu, X.; Wang, W. A rolling bearing fault diagnosis technique based on Recurrence Quantification Analysis and Bayesian optimization SVM. Applied Soft Computing 2024, 111506.
- Srithai, V.C.; Barroso; P. Phunchongharn. Computing Resource Optimization for a Log Monitoring System. In 2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII), Hualien, Taiwan, 2022; pp. 99–102.
- Aydi Ben Abdallah, R.; Jabnoun-Khiareddine, H.; Ayed, F.; Daami-Remadi, M. A Three-Year Study of Comparative Effects of Four Organic Amendments on Soil Health Dynamics, Tomato Production, and Rhizosphere Microbial Community. Communications in Soil Science and Plant Analysis 2023, 54, 2439–2458.
- GP Shetty; A Meghana; Sangeetha CG; Niranjan HG; Mahesh G Shetty; M Narayanaswamy. Impact of multiplex yield enhancer on the growth, yield, disease, and insect incidence of tomato crop. Int J Adv Biochem Res 2024, 8, 49–57.
- Hang; Zhang; Chen; Zhang; Wang. Classification of Plant Crop Diseases Based on Improved Convolutional Neural Network. Sensors 2019, 19, 4161.
- Gadade; Kirange. Machine Learning Approach towards Tomato Leaf Disease Classification. IJATCSE 2020, 9, 490–495.
- K. Kapucuoglu; M. Kirci. Tomato Leaf Disease Detection Using Hyperparameter Optimization in CNN. In 2021 13th International Conference on Electrical and Electronics Engineering (ELECO), IEEE, Bursa, Turkey, 2021; pp. 373–377.
- Y. A. Reddy; A. M. A Framework System for Plant Leaf Disease Detection using K-Nearest Neighbours and comparison of its features with Naive Bayes Classification. In 2022 International Conference on Business Analytics for Technology and Security (ICBATS), Dubai, United Arab Emirates, 2022; pp. 1–4.
- Nikith, B.V.; Keerthan, N.K.S.; Praneeth, M.S.; Amrita, D.T. Leaf Disease Detection and Classification. Procedia Computer Science 2023, 218, 291–300.
- Najim, Mohammed Hussein; Salwa Khalid Abdulateef; Abbas Hanon Alasadi. Early Detection of Tomato Leaf Diseases Based on Deep Learning Techniques. IAES International Journal of Artificial Intelligence (IJ-AI) 2024, 13.
- M. Chilakalapudi; S. Jayachandran. Multi-classification of disease induced in plant leaf using chronological Flamingo search optimization with transfer learning. PeerJ Computer Science 2024, 10, e1972.








| Indicator/Symptom | Early blight | Late blight | Cercospora crop mold |
|---|---|---|---|
| Lesion shape | Circular | Irregular | Irregular |
| Lesion color | Brown | Grey | Pale green |
| Crop spotting | Common | Common | Rare |
| Lesion margin | Defined | Undefined | Undefined |
| Sporulation | Moderate | High | Low |

| Soil Parameter | Optimal Range/Threshold | Early blight | Late blight | Cercospora crop mold |
|---|---|---|---|---|
| pH Value | 6.0 - 7.0 | High Risk | Moderate Risk | Low Risk |
| Nitrogen (N) | Moderate | Moderate Risk | High Risk | Low Risk |
| Phosphorus (P) | Moderate | Low Risk | Moderate Risk | High Risk |
| Potassium (K) | Moderate | Low Risk | Low Risk | Moderate Risk |
| Soil Humidity | 60-80% | Moderate Risk | High Risk | High Risk |
| Water Level | Well-Drained | Low Risk | High Risk | Moderate Risk |

| Sensor Type | Current | Voltage | Data Format | Threshold Band |
|---|---|---|---|---|
| NPK Sensor | 10-20 mA | 3.3-5V | Analog | Low, Medium, and High nutrient levels |
| Temperature Sensor | 0.5-10 mA | 3-5V | Analog | Temperature range (e.g., -40°C to 125°C) |
| Humidity Sensor | 0.5-15 mA | 2.5-5V | Analog | Humidity range (e.g., 0-100% RH) |
| GPS Sensor (NEO-6m) | 20-100 mA | 3-5V | Digital (NMEA, etc.) | Geographical coordinates |
| Wi-Fi Sensor | 15-200 mA | 3.3-5V | Digital (TCP/IP, etc.) | Signal strength (dBm) |
| RGB Color Sensor | 10-30 mA | 2.7-5.5V | Digital (RGB values) | Color intensity range |

| Timestamp | NPK Level | Temperature (°C) | Humidity (%) | pH Value |
|---|---|---|---|---|
| 27-11-2023 10:00 | 2 | 22 | 45 | 6.7 |
| 27-11-2023 10:01 | 1 | 23 | 47 | 5.6 |
| 27-11-2023 10:02 | 3 | 22 | 50 | 6.7 |
| 27-11-2023 10:03 | 2 | 21 | 48 | 6.8 |
| 27-11-2023 10:04 | 1 | 22 | 46 | 5.5 |
| 27-11-2023 10:05 | 3 | 23 | 49 | 4.4 |
| 27-11-2023 10:06 | 2 | 24 | 45 | 6.7 |
| 27-11-2023 10:07 | 1 | 21 | 47 | 6.6 |
| 27-11-2023 10:08 | 3 | 22 | 50 | 6.7 |
| 27-11-2023 10:09 | 2 | 23 | 48 | 6.6 |

| Timestamp | NPK Level | Temperature (°C) | Humidity (%) | pH Value | Diseased |
|---|---|---|---|---|---|
| 27-11-2023 10:00 | 2 | 22 | 45 | 6.7 | 1 |
| 27-11-2023 10:01 | 1 | 23 | 47 | 5.6 | 0 |
| 27-11-2023 10:02 | 3 | 22 | 50 | 6.7 | 0 |
| 27-11-2023 10:03 | 2 | 21 | 48 | 6.8 | 0 |
| 27-11-2023 10:04 | 1 | 22 | 46 | 5.5 | 0 |
| 27-11-2023 10:05 | 3 | 23 | 49 | 4.4 | 0 |
| 27-11-2023 10:06 | 2 | 24 | 45 | 6.7 | 1 |
| 27-11-2023 10:07 | 1 | 21 | 47 | 6.6 | 0 |
| 27-11-2023 10:08 | 3 | 22 | 50 | 6.7 | 0 |
| 27-11-2023 10:09 | 2 | 23 | 48 | 6.6 | 0 |

| Metric | Value |
|---|---|
| Accuracy | 95.35% |
| Precision | 94.92% |
| Recall | 94.89% |
| F1 Score | 94.36% |

| S. No | Reference | Classification method | Database | ML-IoT Enabled | Main findings | Limitations | Accuracy Measure |
|---|---|---|---|---|---|---|---|
| 1. | Hang et al. [2019] [35] | Improved CNN | Plant crop diseases Library | No | Yielded better performance | Lack of real-world testing | 91.7% |
| 2. | Gadade & Kirange [2020] [36] | DT, SVM, KNN with Gabor feature, NB | Plant Village Datasets | No | KNN classification performed better than SVM | Does not consider the impact of varying environmental conditions | 67%, 73%, 73%, 67% |
| 3. | Kapucuoglu and Kirci [2021] [37] | CNN with hyperparameter optimisation | New Plant Diseases Dataset (Kaggle) | No | Accuracy increased from 92% to 98% after proper hyperparameter tuning | Only accuracy was evaluated | 98% |
| 4. | Reddy and Adimoolam [2022] [38] | KNN & Naïve Bayes (NB) | Plant Village Dataset | No | NB was more accurate than KNN | The comparison is limited to KNN and NB and reports accuracy as the only evaluation metric | 91% |
| 5. | Nikith et al. [2023] [39] | SVM, KNN, CNN | Net-based Images | No | CNN achieved 96% | Limited evaluations | Accuracy: 96% |
| 6. | Najim et al. [2024] [40] | CNN-based model | Plant Village Dataset | No | The model outperforms traditional CNN models in terms of speed and storage | Model is not scalable | Accuracy: 92% |
| 7. | Chilakalapudi & Sheela [2024] [41] | CFSA-TL-based CNN | Plant Village Dataset | No | Identified and classified the disease at an early stage | Over-dependence on high-quality images, which may be hard to obtain under varying environmental conditions | Accuracy: 95.7% |
| 8. | The present study | Bayesian optimization with KNN | Real-time and high-resolution data | Yes | Bayesian Optimisation with KNN | Lack of diverse databases available for the present study | Accuracy: 95%; Precision: 95%; Recall: 95%; F1 score: 94% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).