Submitted: 20 October 2025
Posted: 21 October 2025
Abstract
Keywords:
1. Introduction
2. Material Preparation and Experimental Process
2.1. Loquat Image Collection
2.2. Dataset Creation
2.3. YOLOv8n Model
2.4. YOLO-MCS Model
2.4.1. EfficientNet-b0 Feature Extraction Network
2.4.2. C2f_SCConv Convolution Module
2.4.3. SimAm Attention Mechanism Module
3. Experimental Results and Analysis
3.1. Experimental Platform and Parameter Settings
3.2. Evaluation Criteria
3.3. Contrastive Investigation of Attention Mechanisms
3.4. Comparison Experiment between Different Backbone Networks
3.5. Ablation Experiment
3.6. Comparison of Different Models
3.7. Visual Analytics
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
| Category | Original Images | After Augmentation | Training Set | Validation Set | Test Set |
|---|---|---|---|---|---|
| Number | 325 | 1625 | 1560 | 195 | 195 |
| Stage | Operator | Resolution | Channels | Layers |
|---|---|---|---|---|
| 1 | Conv3×3 | 224 × 224 | 32 | 1 |
| 2 | MBConv1, k3×3 | 112 × 112 | 16 | 1 |
| 3 | MBConv6, k3×3 | 112 × 112 | 24 | 2 |
| 4 | MBConv6, k5×5 | 56 × 56 | 40 | 2 |
| 5 | MBConv6, k3×3 | 28 × 28 | 80 | 3 |
| 6 | MBConv6, k5×5 | 14 × 14 | 112 | 3 |
| 7 | MBConv6, k5×5 | 14 × 14 | 192 | 4 |
| 8 | MBConv6, k3×3 | 7 × 7 | 320 | 1 |
| 9 | Conv1×1 & Pooling & FC | 7 × 7 | 1280 | 1 |
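The staging in the table above can be captured as a small configuration structure. The sketch below is illustrative only (not the authors' implementation): each entry holds the operator, input resolution, output channels, and per-stage layer count of the EfficientNet-b0 backbone, and a helper sums the layer column as a sanity check.

```python
# EfficientNet-b0 stage configuration, mirroring the table above.
# Each entry: (operator, input_resolution, output_channels, num_layers).
# Illustrative data structure only, not the authors' implementation.
EFFICIENTNET_B0_STAGES = [
    ("Conv3x3",           224, 32,   1),
    ("MBConv1, k3x3",     112, 16,   1),
    ("MBConv6, k3x3",     112, 24,   2),
    ("MBConv6, k5x5",     56,  40,   2),
    ("MBConv6, k3x3",     28,  80,   3),
    ("MBConv6, k5x5",     14,  112,  3),
    ("MBConv6, k5x5",     14,  192,  4),
    ("MBConv6, k3x3",     7,   320,  1),
    ("Conv1x1&Pool&FC",   7,   1280, 1),
]

def total_layers(stages):
    """Sum the per-stage layer counts: 16 MBConv blocks plus stem and head."""
    return sum(num_layers for _, _, _, num_layers in stages)
```

Summing the Layers column gives 18 (16 MBConv blocks between the 3×3 stem convolution and the 1×1 head), which matches the standard EfficientNet-b0 staging.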
| Training Parameter | Value |
|---|---|
| Epochs | 300 |
| Image size | 640 × 640 |
| Initial learning rate | 0.01 |
| Optimizer | SGD |
| Momentum | 0.937 |
| Worker threads | 16 |
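A minimal sketch of these hyperparameters as an Ultralytics-style configuration dict, assuming the standard YOLOv8 settings of 300 training epochs and a 640 × 640 input size. The key names (`epochs`, `imgsz`, `lr0`, `momentum`, `workers`) follow Ultralytics conventions and are an assumption; a batch size is not reported in the table and is therefore omitted.

```python
# Training hyperparameters from the table above, expressed as an
# Ultralytics-style configuration dict (key names are assumptions).
train_cfg = {
    "epochs": 300,        # number of training epochs
    "imgsz": 640,         # input image size (640 x 640)
    "lr0": 0.01,          # initial learning rate
    "optimizer": "SGD",   # stochastic gradient descent
    "momentum": 0.937,    # SGD momentum
    "workers": 16,        # data-loading threads
}
```

With the real `ultralytics` package installed, a dict like this could be unpacked into a training call such as `model.train(data="loquat.yaml", **train_cfg)`, where the dataset YAML name here is hypothetical.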
| Models | Precision (%) | Recall (%) | mAP@0.5 (%) | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| YOLOv8n | 92.5 | 87.6 | 90.8 | 8.2 | 3.0 |
| CBAM | 86.8 | 84.2 | 88.0 | 5.5 | 1.7 |
| SE | 89.8 | 86.3 | 86.9 | 5.4 | 1.7 |
| ECA | 85.8 | 84.3 | 89.0 | 5.4 | 1.7 |
| GAM | 86.8 | 87.2 | 88.5 | 6.5 | 2.2 |
| LSKA | 88.7 | 80.6 | 88.2 | 5.6 | 1.7 |
| SimAm | 93.8 | 87.6 | 93.0 | 5.4 | 1.7 |
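The Precision, Recall, and mAP@0.5 columns used in the tables here follow the standard detection definitions; the sketch below illustrates them with minimal helper functions (a true positive at mAP@0.5 is a class-matched detection whose box overlaps a ground-truth box with IoU ≥ 0.5). This is a didactic sketch, not the evaluation code used in the paper.

```python
def precision(tp, fp):
    """Fraction of predicted boxes that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of ground-truth boxes that are found: TP / (TP + FN)."""
    return tp / (tp + fn)

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

For example, two 10 × 10 boxes offset horizontally by half their width overlap with IoU = 50 / 150 ≈ 0.33, below the 0.5 threshold, so the detection would not count as a true positive at mAP@0.5.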
| Backbone Network | Precision (%) | Recall (%) | mAP@0.5 (%) | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| YOLOv8n | 92.5 | 87.6 | 90.8 | 8.2 | 3.0 |
| MobileNetv3 | 90.3 | 77.3 | 87.4 | 5.8 | 2.4 |
| ShuffleNetv2 | 83.4 | 84.8 | 89.6 | 4.7 | 1.5 |
| GhostNetv2 | 82.1 | 84.6 | 87.3 | 7.7 | 3.5 |
| EfficientNet-b0 | 93.8 | 87.6 | 93.0 | 5.4 | 1.7 |
| No. | EfficientNet-b0 | SimAm | SCConv | P (%) | R (%) | mAP@0.5 (%) | GFLOPs | Params (M) |
|---|---|---|---|---|---|---|---|---|
| 1 | - | - | - | 92.5 | 87.6 | 90.8 | 8.2 | 3.0 |
| 2 | √ | - | - | 81.4 | 87.6 | 86.3 | 5.7 | 1.9 |
| 3 | - | √ | - | 91.3 | 87.6 | 91.1 | 8.3 | 2.9 |
| 4 | - | - | √ | 88.8 | 91.2 | 92.3 | 7.8 | 2.8 |
| 5 | √ | √ | - | 87.5 | 85.0 | 88.9 | 5.8 | 1.8 |
| 6 | √ | √ | √ | 93.8 | 87.6 | 93.0 | 5.4 | 1.7 |
| Models | Precision (%) | Recall (%) | mAP@0.5 (%) | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| Faster R-CNN | 70.1 | 98.0 | 95.5 | 370.2 | 137.1 |
| YOLO-Lite | 91.7 | 94.0 | 95.8 | 15.4 | 3.7 |
| YOLOv5 | 90.5 | 85.4 | 90.0 | 7.2 | 2.5 |
| YOLOv6 | 88.7 | 84.7 | 88.7 | 11.9 | 4.2 |
| YOLOv7 | 92.0 | 82.2 | 90.3 | 105.3 | 27.7 |
| YOLOv8 | 92.5 | 87.6 | 90.8 | 8.2 | 3.0 |
| YOLOv9 | 84.5 | 95.2 | 94.8 | 102.3 | 25.3 |
| YOLOv10 | 89.5 | 73.0 | 85.4 | 58.9 | 15.3 |
| YOLOv11 | 83.9 | 88.6 | 89.3 | 50.8 | 12.5 |
| YOLOv12 | 85.2 | 87.4 | 88.0 | 48.6 | 11.9 |
| YOLO-MCS | 93.8 | 87.6 | 93.0 | 5.4 | 1.7 |
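The efficiency gains in the comparison above reduce to simple arithmetic: relative to the YOLOv8n baseline, YOLO-MCS cuts parameters from 3.0 M to 1.7 M and compute from 8.2 to 5.4 GFLOPs. A minimal sketch of that calculation, using only the values from the table:

```python
def relative_reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100 * (baseline - improved) / baseline

# YOLOv8n baseline vs. YOLO-MCS, values taken from the comparison table.
params_cut = relative_reduction(3.0, 1.7)   # parameters, in millions
gflops_cut = relative_reduction(8.2, 5.4)   # compute cost, in GFLOPs
```

This works out to roughly a 43% reduction in parameters and a 34% reduction in GFLOPs, alongside the mAP@0.5 gain from 90.8% to 93.0%.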
| Model | Conditions | P (%) | R (%) | mAP@0.5 (%) |
|---|---|---|---|---|
| YOLO-MCS | Different lighting conditions | 87.3 | 81.7 | 88.7 |
| YOLO-MCS | Different viewing angles | 87.0 | 89.0 | 89.9 |
| YOLO-MCS | Different occlusion levels | 85.4 | 87.8 | 89.9 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).