Submitted: 17 March 2025
Posted: 18 March 2025
Abstract
Keywords:
1. Introduction
2. YOLOv8 Overview
2.1. Input End
2.2. Backbone
2.2.1. C2f
2.2.2. SPPF
2.3. Neck
2.4. Head
2.4.1. Decoupled Head
2.4.2. Anchor-Free
3. Improved YOLOv8s Object Detection Algorithm
3.1. CloFormer and CBAM Attention Mechanism
3.2. Lightweight Model Design
3.3. Improvement of the SPPF Module
4. Experimental Results and Analysis
4.1. Experimental Dataset
4.2. Experimental Environment and Parameter Settings
4.3. Evaluation Indicators
4.4. Ablation Experiments and Result Analysis
4.5. Comparative Experiments
4.5.1. Comparison Before and After Improvement
4.5.2. Comparison with Other Algorithms
5. Discussion
6. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

| Experiment | mAP@0.5 (%) | Params (M) | FLOPs (G) | FPS |
|---|---|---|---|---|
| 1 | 76.8 | 12.69 | 15.17 | 98.8 |
| 2 | 77.4 | 12.51 | 15.48 | 94.6 |
| 3 | 77.5 | 11.81 | 15.52 | 89.1 |
| 4 | 77.1 | 11.57 | 15.09 | 102.3 |
| 5 | 76.6 | 11.75 | 14.79 | 102.9 |
| 6 | 76.1 | 12.45 | 14.74 | 102.7 |

| Experiment | CBAM | DWConv | PConv | CloFormer | SimSPPF | Params (M) | FLOPs (G) | P (%) | R (%) | mAP@0.5 (%) | FPS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | / | / | / | / | / | 11.17 | 14.36 | 74.4 | 95 | 75.68 | 107.8 |
| 2 | √ | / | / | / | / | 11.51 | 14.36 | 74.3 | 95 | 76.65 | 102.4 |
| 3 | / | √ | / | / | / | 9.61 | 12.5 | 79.4 | 95 | 77.61 | 136.7 |
| 4 | / | / | √ | / | / | 8.31 | 10.35 | 71.1 | 97 | 75.23 | 173.5 |
| 5 | / | / | / | √ | / | 12.75 | 15.9 | 77.2 | 96 | 76.82 | 83.6 |
| 6 | / | / | / | / | √ | 11.17 | 14.36 | 76.8 | 94 | 75.88 | 129.9 |
| 7 | √ | √ | / | / | / | 9.95 | 12.5 | 74.9 | 96 | 78.43 | 122.6 |
| 8 | √ | √ | √ | / | / | 7.10 | 8.49 | 72.9 | 97 | 77.27 | 191.7 |
| 9 | √ | √ | √ | √ | / | 8.27 | 9.31 | 77.2 | 98 | 79.08 | 180.7 |
| 10 | √ | √ | √ | √ | √ | 8.28 | 9.31 | 77.5 | 98 | 80.41 | 182.4 |
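
The ablation results above can be summarized as relative changes between the first and last rows. The following is a minimal sketch, assuming row 1 is the unmodified baseline and row 10 is the full improved model; the dictionary keys and function names are illustrative and do not come from the paper.

```python
# Values copied from the ablation table.
# Assumption: row 1 is the unmodified baseline, row 10 enables all five modules.
baseline = {"params_m": 11.17, "flops_g": 14.36, "map50": 75.68, "fps": 107.8}
improved = {"params_m": 8.28, "flops_g": 9.31, "map50": 80.41, "fps": 182.4}


def relative_change(old: float, new: float) -> float:
    """Return the signed percentage change from old to new."""
    return (new - old) / old * 100.0


def summarize(base: dict, new: dict) -> dict:
    """Compare two table rows: relative change for cost and speed, absolute gain for mAP."""
    return {
        "params_change_pct": round(relative_change(base["params_m"], new["params_m"]), 1),
        "flops_change_pct": round(relative_change(base["flops_g"], new["flops_g"]), 1),
        "fps_change_pct": round(relative_change(base["fps"], new["fps"]), 1),
        # mAP is already a percentage, so report the gain in percentage points.
        "map50_gain_points": round(new["map50"] - base["map50"], 2),
    }


summary = summarize(baseline, improved)
for key, value in summary.items():
    print(f"{key}: {value:+}")
```

Under these assumptions, the full model uses about 25.9% fewer parameters and 35.2% fewer FLOPs than row 1, runs about 69.2% faster, and gains 4.73 points of mAP@0.5.
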

| Method | mAP (%) | Params (M) | FLOPs (G) | FPS |
|---|---|---|---|---|
| YOLOv5s | 75.6 | 7.2 | 8.7 | 195 |
| YOLOv6s | 78.2 | 9.8 | 12.4 | 165 |
| Faster R-CNN | 76.4 | 41.5 | 134.2 | 22 |
| SSD | 68.9 | 26.8 | 35.4 | 125 |
| EfficientDet-D1 | 73.8 | 13.6 | 10.5 | 95 |
| Ours | 80.41 | 8.21 | 9.31 | 182.4 |
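
One way to read the comparison table is to ask which detectors are not beaten on both accuracy and speed by any other entry. The sketch below computes that Pareto front from the mAP and FPS columns only; the `Detector` type and function names are hypothetical helpers, not part of the paper, and the choice of these two metrics is an assumption made for illustration.

```python
from typing import List, NamedTuple


class Detector(NamedTuple):
    name: str
    map_pct: float  # mAP column of the comparison table
    fps: float      # FPS column of the comparison table


# Values copied from the comparison table (mAP and FPS columns only).
DETECTORS = [
    Detector("YOLOv5s", 75.6, 195.0),
    Detector("YOLOv6s", 78.2, 165.0),
    Detector("Faster R-CNN", 76.4, 22.0),
    Detector("SSD", 68.9, 125.0),
    Detector("EfficientDet-D1", 73.8, 95.0),
    Detector("Ours", 80.41, 182.4),
]


def dominates(a: Detector, b: Detector) -> bool:
    """a dominates b if it is no worse on both metrics and strictly better on one."""
    no_worse = a.map_pct >= b.map_pct and a.fps >= b.fps
    strictly_better = a.map_pct > b.map_pct or a.fps > b.fps
    return no_worse and strictly_better


def pareto_front(items: List[Detector]) -> List[str]:
    """Return the names of detectors that no other detector dominates."""
    return [d.name for d in items if not any(dominates(o, d) for o in items)]


front = pareto_front(DETECTORS)
print(front)
```

With the tabulated values, only YOLOv5s (fastest) and the proposed model (most accurate) remain on the front; every other entry is matched or beaten on both metrics by the proposed model.
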

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).