Submitted: 30 December 2025
Posted: 31 December 2025
Abstract
Keywords:
1. Introduction
2. Methods
2.1. YOLOv5s Network Architecture

2.2. Preprocessing Optimization

2.3. Introduction of Attention Mechanisms
2.4. Improved Loss Function
3. Experiments and Results
3.1. Model Training
3.1.1. Image Acquisition
3.1.2. Training Environment and Parameter Configuration
| Environmental Parameter | Configuration |
|---|---|
| Operating System | Windows 10 |
| Central Processing Unit (CPU) | i7-14700KF |
| GPU | RTX 4060 Ti (8 GB) |
| Training Framework | PyTorch 1.11.0 |
| Programming Language | Python 3.8 |

| Parameter | Value |
|---|---|
| Epochs | 300 |
| Batch Size | 2 |
| Image Size | 640×640 |
| Initial Learning Rate | 0.01 |
| Weight Decay Coefficient | 0.0005 |
| Momentum | 0.937 |
| Optimizer | SGD (Stochastic Gradient Descent) |
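The training parameters listed above can be collected into a single configuration object. The following is a minimal sketch, assuming the Ultralytics YOLOv5 repository's `train.py` interface, which the source does not confirm was used. The key names `lr0`, `momentum`, and `weight_decay` follow that repository's hyperparameter files; the file names `dataset.yaml` and `hyp.custom.yaml` and both helper functions are hypothetical.

```python
# Values copied from the training parameter table; key names are an assumption
# that follows the Ultralytics YOLOv5 hyperparameter convention.
TRAIN_PARAMS = {
    "epochs": 300,
    "batch_size": 2,
    "img_size": 640,
    "lr0": 0.01,             # initial learning rate
    "weight_decay": 0.0005,  # weight decay coefficient
    "momentum": 0.937,       # SGD momentum
    "optimizer": "SGD",
}


def hyp_overrides(params: dict) -> dict:
    """Return the subset of values that would live in a hyperparameter YAML file."""
    return {key: params[key] for key in ("lr0", "momentum", "weight_decay")}


def build_train_args(
    params: dict,
    data_yaml: str = "dataset.yaml",      # hypothetical dataset description file
    weights: str = "yolov5s.pt",          # pretrained YOLOv5s checkpoint name
    hyp_yaml: str = "hyp.custom.yaml",    # hypothetical file holding hyp_overrides()
) -> list:
    """Assemble a train.py command line from the table values (illustrative only)."""
    return [
        "python", "train.py",
        "--weights", weights,
        "--data", data_yaml,
        "--hyp", hyp_yaml,
        "--epochs", str(params["epochs"]),
        "--batch-size", str(params["batch_size"]),
        "--imgsz", str(params["img_size"]),
        "--optimizer", params["optimizer"],
    ]


print(" ".join(build_train_args(TRAIN_PARAMS)))
print(hyp_overrides(TRAIN_PARAMS))
```

Keeping the values in one dictionary makes it easier to check a training run against the reported configuration; the learning rate, momentum, and weight decay are separated out because, under the assumed interface, they are read from a hyperparameter file rather than from command-line flags.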
3.2. Evaluation Metrics Commonly Used in Object Detection
3.3. Model Training Effectiveness Comparison
3.4. Detection Results
3.4.1. Comparison of Attention Mechanisms
3.4.2. Comparison of Loss Functions
[Figure: side-by-side detection results for the YOLOv5s model, the YOLOv5s_SimAM model, and the improved YOLOv5s model; images not available.]
3.4.3. Ablation Study
4. Discussion
4.1. Interpretation of Performance Improvements
4.2. Comparison with State-of-the-Art Methods
4.3. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest



| Model | Average Precision | Average Recall | F1 Score | mAP@0.5 | Inference Speed | Training Time (h) | Model Size |
|---|---|---|---|---|---|---|---|
| YOLOv5n | 0.892 | 0.875 | 0.883 | 0.949 | 16 | 5.525 | 7.5 |
| YOLOv5s | 0.905 | 0.882 | 0.893 | 0.952 | 18 | 5.428 | 14.1 |
| YOLOv5m | 0.913 | 0.889 | 0.899 | 0.957 | 25 | 5.385 | 43.7 |
| YOLOv5l | 0.911 | 0.886 | 0.898 | 0.951 | 32 | 8.764 | 89.2 |

| Model | Person Accuracy | Material Accuracy | Obstacle 1 Accuracy | Obstacle 2 Accuracy | Obstacle 3 Accuracy | mAP@0.5 | Detection Speed |
|---|---|---|---|---|---|---|---|
| Original YOLOv5s | 0.54 | 0.62 | 0.72 | 0.69 | 0.75 | 0.876 | 19.5 |
| YOLOv5s+SimAM | 0.78 | 0.82 | 0.83 | 0.81 | 0.83 | 0.921 | 19.8 |
| YOLOv5s+SimAM+EIOU | 0.94 | 0.96 | 0.96 | 0.97 | 0.97 | 0.952 | 20.1 |
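The F1 scores in the model comparison table above can be related to the precision and recall columns. The sketch below assumes the standard definition, F1 = 2PR / (P + R), which the source does not state explicitly; the function name `f1_score` is illustrative. It checks the YOLOv5n and YOLOv5s rows only, and other rows may differ in the third decimal if the reported precision and recall were themselves rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the YOLOv5n and YOLOv5s rows of the table.
rows = {
    "YOLOv5n": (0.892, 0.875),
    "YOLOv5s": (0.905, 0.882),
}

for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.3f}")
# YOLOv5n: F1 = 0.883
# YOLOv5s: F1 = 0.893
```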
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).














