
Research on Intelligent Identification Method of Pantograph Positioning and Skateboard Structural Anomalies Based on Improved YOLO v8 Algorithm


Submitted: 21 November 2024; Posted: 25 November 2024


Abstract
Abnormal structural states of the pantograph slide plate have a significant impact on the operational safety of high-speed railways and are therefore a matter of close concern. To obtain real-time information on abnormal slide-plate states in advance, an intelligent defect identification model suitable for pantograph slide-plate monitoring devices was designed using computer-vision-based detection technology, combining an improved YOLO v8 model with traditional image processing algorithms such as edge extraction. The results show that the anomaly detection algorithm for the pantograph slide-plate structure is robust, maintaining a recognition accuracy of 90% or above in complex scenes with an average processing time of 12.32 ms. Railway field experiments demonstrate that the intelligent recognition model meets the practical detection requirements of railway sites and has strong application value.

1. Introduction

As an important part of railway transportation, high-speed rail offers convenience, safety, and affordability, and its speed and comfort have made it one of the main modes of mass travel [1]. The reliable operation of high-speed rail pantographs determines the safety of railway transportation to a large extent. To keep trains running normally, pantograph faults must be detected quickly and accurately [2-6]. The pantograph slide monitoring device is installed at stations, station throat areas, and the depots of electric locomotives and high-speed trains. It uses high-speed, high-resolution, non-contact image analysis and measurement technology to achieve automatic monitoring and detection of the pantograph slide-plate status, as well as visual observation of foreign objects on the roof and of the condition of key components [7-9].
At present, image-processing-based defect recognition for the pantograph slide plate has achieved some results and has been applied in slide-plate monitoring devices. Commonly used pantograph positioning methods include the Hough transform, HOG features, edge detection, template matching, the second-generation curvelet transform, spatial region reconstruction, and density clustering. However, these traditional image processing methods are susceptible to illumination changes and noise interference, and their recognition ability in complex backgrounds needs to be strengthened [10,11]. In contrast, deep-learning-based detectors have shown much stronger feature representation. Girshick et al. [12] designed the R-CNN model, replacing traditional hand-crafted feature extraction with AlexNet and combining it with selective search for generating region proposals; on the PASCAL VOC 2012 object detection task they achieved a mean average precision (mAP) of 62.4%, about 20% higher than traditional object detection methods. He et al. [13] proposed SPPNet, which adds a spatial pyramid pooling layer after the last convolutional layer to unify the size of feature maps regardless of the input image size; experiments showed that SPPNet's testing time was 24-102 times faster than R-CNN's. The Faster R-CNN network proposed by Ren et al. [14] adopted, for the first time, a Region Proposal Network (RPN) that handles both candidate region generation and region classification, eliminating the computational overhead of selective search and enabling end-to-end training, thereby solving the time and memory overhead problems of R-CNN.
In recent years, with the rapid development of deep learning and big data analysis, and by combining deep learning with traditional image processing methods, this paper proposes an improved YOLO v8-based intelligent recognition method for pantograph positioning and slide-plate structural anomaly identification. The one-stage YOLO v8 network is first analyzed as the basic structure of the object detection algorithm, and the network structure is then improved for the complex backgrounds and large targets characteristic of pantograph images, ultimately achieving intelligent pantograph recognition. The YOLO v8 network model is described in detail in Section 2.

2. Pantograph Slide Positioning Detection Method Based on Convolutional Neural Network

The pantograph slide status monitoring device mainly consists of high-definition imaging equipment, data acquisition and processing equipment, remote network transmission channels, and user terminals. It monitors the technical status of the pantograph slide plate, promptly discovers abnormal states, narrows the inspection scope, and guides catenary maintenance; the devices are managed by the railway bureau group companies and their power supply sections. The device monitors the slide plate through high-definition imaging equipment, transmits video or image information to the relevant departments, and can capture and identify the EMU train number. The image acquisition module adopts a dual-machine redundant structure to improve the reliability of the acquired information, and with the development of Chinese 5G communication technology, long-distance transmission is achieved through 5G wireless links.

2.1. YOLO v8 Network Structure

The first prerequisite for pantograph slide defect identification is to locate the pantograph in the images collected by the slide condition monitoring device [15]. Deep learning methods build on massive data and efficient computing resources and realize automatic feature extraction and learning through neural network models. Compared with the hand-crafted image features and manually formulated classification rules of traditional image processing, the features obtained by deep learning generalize better, are more robust, and suit changing and complex scenes [16]. Intelligent pantograph recognition is therefore based on deep-learning object detection; traditional image-based detection methods were reviewed in Section 1. The one-stage YOLO v8 network is first analyzed as the basic structure of the detection algorithm, and the network structure is then fine-tuned for the complex backgrounds and large targets of pantograph images to realize intelligent pantograph recognition.
The improved YOLO v8 recognition algorithm proposed in this article aims to address the shortcomings of previous research. It builds on the success of earlier YOLO versions and introduces further improvements in accuracy and real-time detection. The Backbone and Neck of YOLO v8 use the C2F module, which provides richer gradient flow, and the number of channels is adjusted for models of different scales, forming a network with stronger feature representation capability. The Spatial Pyramid Pooling Fast (SPPF) module greatly reduces the computation of the feature pyramid, and the head adopts the now-popular decoupled structure, which clearly separates the classification head from the detection head. The network structure of YOLO v8 is shown in Figure 1.
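For orientation, the sketch below shows how a YOLO v8 detector of this kind can be invoked through the open-source Ultralytics package to return a pantograph bounding box. The weight file and image path are placeholders, and the authors report a TensorFlow-based implementation with a modified Backbone, Neck, and activation, so this is not their code.

```python
# Minimal sketch (not the authors' implementation): locating the pantograph region
# with a YOLOv8 detector via the open-source Ultralytics package.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # placeholder pretrained weights
results = model.predict("pantograph.jpg",        # placeholder image path
                        imgsz=512, conf=0.5)

for r in results:
    for box in r.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()    # pantograph bounding box (pixels)
        print(f"pantograph region: ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f})")
```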
Since the catenary scene does not require multi-category classification, the model does not need a complex deep network for feature extraction and can use a simpler, lighter network [17]. C2FDarkNet-53 is an efficient and fast feature extraction network whose structure follows the design guidelines of efficient networks; compared with ShuffleNetv2, its operating efficiency is greatly improved. A comparison of the two network models is shown in Table 1.
As can be seen from the table, the parameter size of C2FDarkNet-53 is nearly 100 times smaller than that of ShuffleNetv2, and the running time is also reduced from 14 ms to 3 ms. C2FDarkNet-53 therefore computes faster and is more suitable for feature extraction on large-scale data sets.
Since the activation function is applied after every basic operation, using the Mish function incurs a relatively large time and computational overhead without a significant gain in accuracy. The pantograph recognition model needs both accuracy and running speed, so this article replaces the Mish activation with LeakyReLU, which is simple to compute and causes essentially no drop in accuracy. The LeakyReLU activation is given in Equation (1), where x is the feature value produced by the network convolution and the slope a lies in the range (0, 1).
f(x) = \begin{cases} x, & x \ge 0 \\ a x, & x < 0 \end{cases}, \quad a \in (0, 1) \qquad (1)
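As a minimal PyTorch sketch of this substitution (the negative slope a = 0.1 is an assumed value, since the paper only constrains a to (0, 1)):

```python
import torch.nn as nn

def conv_block(c_in, c_out, k=3, s=1, use_leaky=True):
    """Conv-BN-activation block; LeakyReLU replaces the heavier Mish activation.
    The negative slope 0.1 is an illustrative choice within the stated range (0, 1)."""
    act = nn.LeakyReLU(0.1, inplace=True) if use_leaky else nn.Mish(inplace=True)
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        act,
    )
```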
Since the recognition task of the pantograph slide monitoring device involves heavy background interference and is therefore difficult, a more powerful feature fusion network is needed [18]. NAS-FPN applies Neural Architecture Search (NAS) to the design of the Feature Pyramid Network (FPN): instead of manually planning how features at different scales are fused, the fusion topology is found by search, which improves the fusion effect without seriously affecting computation speed.
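NAS-FPN's topology is produced by architecture search rather than written by hand; the sketch below only illustrates the basic cross-scale fusion operation (upsample, merge, convolve) from which such a pyramid is composed, and is not the searched NAS-FPN structure itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNode(nn.Module):
    """One cross-scale fusion node: the coarse feature map is upsampled to the
    resolution of the finer map, the two are summed, and a 3x3 conv refines the
    result. NAS-FPN searches over many such merge connections automatically."""
    def __init__(self, channels: int):
        super().__init__()
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:], mode="nearest")
        return self.refine(fine + coarse_up)
```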

2.2. Image Sample Expansion

The quality of the collected images varies. Due to interference from lighting, vehicle speed, and signal transmission, some images suffer from blur, noise, overexposure, or underexposure, and such images are under-represented in the data set. Training directly on the original data set therefore yields a model trained on an unbalanced distribution that may not perform well in practice. The distribution of positive and negative samples is also uneven: compared with normal images, images with pantograph slide defects are scarce, which can easily lead to overfitting and affect the recognition results of the model [19-22].
To solve these problems, data augmentation of the original images is essential. Data augmentation generates new samples from the original ones through operations such as cropping and rotation and adds them to the data set, thereby changing its sample distribution. Sample amplification is divided into single-sample and multi-sample amplification. The YOLO v8 network already integrates the multi-sample Mosaic method; on this basis, single-sample amplification is added [23,24], including image flipping, contrast change, grayscale adjustment, noise addition, and image blurring, in order to simulate poor image quality and ensure the robustness and stability of the trained model.
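The single-sample operations listed above can be sketched with OpenCV as follows; the kernel sizes, contrast factors, and noise level are illustrative assumptions, not the values used in the paper.

```python
import cv2
import numpy as np

def single_sample_augment(img):
    """Illustrative single-sample amplification: flip, contrast change, blur and
    noise to mimic poor imaging conditions. All parameter values are assumptions."""
    out = {
        "flipped": cv2.flip(img, 1),                                # horizontal flip
        "darker": cv2.convertScaleAbs(img, alpha=0.6, beta=-20),    # under-exposure proxy
        "brighter": cv2.convertScaleAbs(img, alpha=1.4, beta=20),   # over-exposure proxy
        "blurred": cv2.GaussianBlur(img, (7, 7), 0),                # defocus/motion proxy
    }
    noise = np.random.normal(0, 15, img.shape).astype(np.int16)
    out["noisy"] = np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
    return out
```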

2.3. The Impact of Backbone Model on Experimental Results

In the same experimental environment, we further studied the impact of different Backbone models on detector accuracy. A series of experiments was conducted on the Backbone of the YOLO v8 algorithm and the Backbones of earlier YOLO versions; their performance comparison is shown in Table 2. Training was performed on conventional GPUs, and a large number of features that improve classifier and detector accuracy were verified. The end result is that YOLO v8 achieves higher accuracy and better overall results.
Figure 2 shows the comparison results for the various YOLO versions.
Figure 2. YOLO algorithm performance comparison.

2.4. Model Training and Verification

The hardware used was a GTX 1080Ti graphics card with 16 GB of memory and a 250 GB solid-state drive, and the model was implemented on the TensorFlow deep learning framework. Because the samples in the collected data set are unevenly distributed, the image-sample expansion method was used to amplify images that were too dark or unevenly lit so that their total number was roughly equivalent to that of normal images [25,26]; after amplification the data set contains 5,000 images. Because the approach is supervised, the 5,000 pantograph images had to be manually annotated before training the pantograph detection network: 80% were used as the training set for pantograph recognition and the remaining 20% as the test set. The LabelImg tool was used for annotation; the slide-plate position in each pantograph image was marked with a box, and a script automatically converted the annotations into the input format of the YOLO network.
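For reference, a YOLO-format label stores one line per object as "class x_center y_center width height", all normalised by the image size. A conversion from a pixel-space slide-plate box might look like the hypothetical helper below (the box coordinates in the example are made up).

```python
def to_yolo_label(box, img_w, img_h, cls_id=0):
    """Convert a pixel-space (x1, y1, x2, y2) slide-plate box into a YOLO label line:
    'class x_center y_center width height', all normalised to [0, 1]."""
    x1, y1, x2, y2 = box
    xc = (x1 + x2) / 2.0 / img_w
    yc = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Example: a hypothetical box in a 2048x1024 image
print(to_yolo_label((700, 400, 1300, 600), img_w=2048, img_h=1024))
```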
The pantograph data come from the imaging module of the pantograph slide condition monitoring device and contain pantograph images of different road conditions, angles, environments, and vehicle models, totaling 3,500 original images. The proportions of image types in the training and test sets are shown in Table 3.
The expanded data set was used to train the YOLO v8 pantograph recognition model. The loss curve shows that after 4,000 iterations the loss on the training set fluctuates stably around 0.05 with little further change, indicating that training has essentially converged. To verify the accuracy of the improved YOLO v8 pantograph recognition model, 1,000 pantograph images from the test set were used to evaluate the recognition of typical images. The verification results are shown in Figure 3.
Table 4 shows that the average image recognition rate of the proposed model is about 97.02% under normal brightness, about 90.06% under uneven brightness, and about 77.60% under low brightness. The average image recognition time is 13.75 ms. The statistical results of model processing are shown in Figure 4.

3. Principle of Slide Plate Structure Anomaly Detection

The pantograph imaging camera continuously collects pantograph images, and the model above is used to locate the pantograph area. Traditional image processing is then used to extract the contour features of the two carbon slide strips, and a threshold is set to determine whether there are structural abnormalities in the two pantograph slide plates. The slide-plate structural anomaly detection process is shown in Figure 5.
Gaussian filtering is used for noise reduction: it removes isolated noise, improves image quality, and reduces spurious edges in the subsequent edge extraction step. During Gaussian filtering, each output pixel is obtained as a weighted average of its neighborhood.
The Canny operator is used to extract edges from the gradient image; the grayscale change in an image is positively correlated with the gradient magnitude. Before edge extraction, the gradient image is binarized with a threshold of 150 to extract strong edges, and edge detection is then performed on these strong edges to avoid interference. The extracted edge lines have various orientations. To reduce the processing time of subsequent steps, the lines are first filtered by angle: a straight line is fitted to the coordinate points of each line, the angle between the fitted line and the x-axis is calculated, and only lines with a significant angle difference from the x-axis are retained. The extraction process for a pantograph image is shown in Figure 6.
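An OpenCV sketch of this step is given below for orientation; the Hough-transform parameters and the 10° angle cutoff are assumptions, while the strong-edge threshold of 150 follows the text.

```python
import cv2
import numpy as np

def extract_strip_lines(roi_gray, angle_keep_deg=10.0):
    """Approximate sketch of the contour-extraction step: Gaussian denoising,
    Canny edge extraction (upper threshold 150, as in the text), probabilistic
    Hough line detection, then removal of lines nearly parallel to the x-axis.
    Hough parameters and the angle cutoff are illustrative assumptions."""
    blurred = cv2.GaussianBlur(roi_gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                           minLineLength=80, maxLineGap=10)
    kept = []
    if segs is not None:
        for x1, y1, x2, y2 in segs.reshape(-1, 4):
            angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
            if angle > angle_keep_deg:          # keep lines well away from the x-axis
                kept.append(((x1, y1), (x2, y2), angle))
    return kept
```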
The angle between every pair of retained lines is then calculated, with a threshold of 15°. If the angle between two lines exceeds the threshold, the structure of the pantograph carbon slide plate is judged to be abnormal; otherwise it is considered normal. A set of 400 images was selected to test the method. The heat map for anomaly detection is shown in Figure 7, and the determination of pantograph slide-plate defect abnormality is shown in Figure 8.
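The decision rule itself reduces to a pairwise angle comparison, sketched below with the 15° threshold from the text; the example angle values are made up.

```python
from itertools import combinations

def slide_plate_abnormal(line_angles_deg, threshold_deg=15.0):
    """Flag the slide-plate structure as abnormal if any pair of retained lines
    differs in orientation by more than the 15 degree threshold."""
    return any(abs(a - b) > threshold_deg
               for a, b in combinations(line_angles_deg, 2))

# Example: two nearly parallel strip lines -> normal; a skewed third line -> abnormal
print(slide_plate_abnormal([2.0, 4.5]))        # False
print(slide_plate_abnormal([2.0, 4.5, 25.0]))  # True
```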

3.1. Evaluation Criteria

Precision (P), recall (R), mean average precision (mAP), and frames per second (FPS) are used as metrics to evaluate algorithm performance. The calculation formulas for these indicators are as follows:
P = \frac{TP}{TP + FP} \qquad (2)
R = \frac{TP}{TP + FN} \qquad (3)
mAP = \frac{1}{N} \sum_{i=1}^{N} \int_{0}^{1} P_i(R)\, dR \qquad (4)
Precision denotes the proportion of samples correctly classified as positive among all instances the model predicts as positive. In the formulas above, True Positives (TP) is the number of samples correctly classified as positive by the model, while False Positives (FP) is the number of samples erroneously classified as positive; a higher P corresponds to a lower false positive rate. Recall signifies the proportion of correctly classified positives among all instances that are actually positive, where False Negatives (FN) is the number of positive samples incorrectly classified as negative; a higher R corresponds to a lower false negative rate [22,23].
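As a small numerical illustration of these definitions (the counts and precision-recall points below are made up, not taken from the paper's experiments):

```python
import numpy as np

# Hypothetical confusion counts for the single "pantograph" class.
TP, FP, FN = 183, 9, 17

precision = TP / (TP + FP)   # Eq. (2)
recall = TP / (TP + FN)      # Eq. (3)

# AP for one class: area under a sampled precision-recall curve; mAP averages AP over classes.
recall_pts = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
precision_pts = np.array([1.00, 0.98, 0.96, 0.95, 0.90, 0.70])
ap = np.trapz(precision_pts, recall_pts)

print(f"P = {precision:.3f}, R = {recall:.3f}, AP = {ap:.3f}")
```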

3.2. Comparison Experiment with Other Algorithms

To validate the superiority of the algorithm introduced in this study for detecting surface defects on pantographs, we compared several leading object detection algorithms, namely Improved-YOLOv4, YOLOv5, SSD, and Faster R-CNN, under identical conditions. The experimental results are presented in Table 5.
From Table 5, it can be seen that the proposed improved YOLOv8 model clearly outperforms the two-stage Faster R-CNN in both detection accuracy and speed: P and R are higher by 8.27% and 6.65%, respectively, and its FPS reaches 127.91% of Faster R-CNN's. Compared with the single-stage algorithms Improved-YOLOv4, YOLOv5, and SSD, P is higher by 12.84%, 11.30%, and 9.77% and R by 4.73%, 2.97%, and 2.64%, respectively, while its FPS reaches 158.55%, 175.21%, and 111.2% of theirs. The proposed model therefore improves detection precision while maintaining detection speed, making pantograph slide-plate positioning detection faster and more accurate.

3.3. Ablation Experiment

To further validate the effectiveness of the enhanced modules in the proposed algorithm, ablation studies were performed on the baseline model: the YOLOv8 model was evaluated with the C2FDarkNet-53 backbone and the LeakyReLU activation function integrated in turn. All experiments used identical equipment and data sets. The results are presented in Table 6, where a "√" indicates that the corresponding method is adopted.
From Table 6, it can be seen that every improved variant of the original YOLO v8 raises both P and R to varying degrees. Model E shows the largest single gains, with P and R improved by 5.75% and 8.09%, respectively, but its computational cost rises accordingly and its FPS is lower than that of the final model H, giving a slower recognition speed. By combining C2FDarkNet-53 with the LeakyReLU activation function, the improved model achieves P and R of 95.73% and 94.49%, respectively, at a recognition speed of 45 frames per second. Compared with the original model, P and R increase by 4.84% and 6.86%, and the FPS reaches 131.19% of the original, significantly improving both recognition speed and accuracy and further demonstrating the effectiveness of the improved model.

4. Actual Line Abnormality Verification

Based on data from an operating line, a vehicle-mounted dynamic test of the intelligent pantograph slide defect detection algorithm was conducted.
During the test, the pantograph slide condition monitoring device was used to collect pantograph dynamic images, and the pantograph positioning and defect identification were analyzed based on on-site measured data to verify the timeliness and effectiveness of the algorithm.
The line inspection yielded 350 pantograph images, 63 of which showed structural abnormalities of the carbon slide plate. The scanning camera of the on-site monitoring device samples at 10 kHz and generates images of 2,048 lines. The structural anomaly detection algorithm described above was applied to each image; according to the statistics, the average detection time for a single image is 12.32 ms. The time consumption curve for identifying structural abnormalities in the 350 pantograph slide images is shown in Figure 9.
The test results show that the proposed method achieves a recognition rate of 96.42% for pantograph slide abnormalities under normal exposure and still exceeds 90% when image quality is poor. The accuracy statistics are shown in Table 7.

5. Conclusions

Starting from the needs of online pantograph detection, this paper improves the YOLO v8 algorithm. The C2FDarkNet-53 backbone replaces the ShuffleNet v2 network to increase running speed, the LeakyReLU activation function replaces the Mish function to further save time and computation, and NAS-FPN replaces the original path-aggregation module PANet to improve the efficiency of feature fusion. The test results show that the improved YOLO v8 algorithm meets the requirements of real-time pantograph positioning and thereby enables anomaly detection of the pantograph slide-plate structure.
In field testing, the recognition rate reached 96.42% under normal brightness and 90.68% under uneven brightness. In the slide-plate structural anomaly detection task, image processing methods determine slide-plate abnormalities through filtering and denoising, contour extraction, and fine extraction of the slide plate. Tests on an actual line show that the anomaly detection algorithm is robust, maintaining a recognition accuracy of 90% or above in complex scenes with an average processing time of 12.32 ms, which meets the requirements for online detection of the pantograph status.

Author Contributions

Ruihong Zhou: Writing - original draft. Baokang Xiang: Investigation. Long Wu and Litong Dou: Writing - review and editing. Yanli Hu and Kaifeng Huang: Supervision.

Acknowledgments

This research is funded by the Open Fund of the State Key Laboratory for Deep Coal Mining Response and Disaster Prevention (SKLMRDPC22KF16), the key natural science research project of Anhui Provincial Department of Education (2024AH051735), the 2024 campus level natural science research project of Huainan Normal University (2024XJZD011), and the discipline construction project of Anhui scientific research innovation platform (SJKYCXPT202304).

Conflicts of Interest Statement

The authors declare no conflicts of interest.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Bruni S.; Bucca G.; Carnevale M. Pantograph–catenary interaction: recent achievements and future research challenges. International Journal of Rail Transportation 2018, 6, 57-82. [CrossRef]
  2. Zhan, W.; Zou, D.; Tan, M. Review of pantograph and catenary interaction. Frontiers of Mechanical Engineering 2018, 13, 311-322.
  3. Wu, G.; Dong, K.; Xu, Z. Pantograph–catenary electrical contact system of high-speed railways: recent progress, challenges, and outlooks. Railway Engineering Science 2022, 30, 437-467.
  4. Liu, Z.; Wang, H.; Chen H. Active pantograph in high-speed railway: Review, challenges, and applications. Control Engineering Practice 2023, 141, 105692. [CrossRef]
  5. Song, Y.; Wang, Z.; Liu, Z. A spatial coupling model to study dynamic performance of pantograph-catenary with vehicle-track excitation. Mechanical Systems and Signal Processing 2021, 151, 107336. [CrossRef]
  6. Pappalardo C M.; La Regina R.; Guida D. Multibody modeling and nonlinear control of a pantograph scissor lift mechanism. Journal of Applied and Computational Mechanics 2023, 9, 129-167.
  7. Sriwastav N.; Barnwal A K.; Wazwaz A M. A novel numerical approach and stability analysis for a class of pantograph delay differential equation. Journal of Computational Science 2023, 67, 101976. [CrossRef]
  8. Chen, L.; Duan, F.; Song, Y. Three-dimensional contact formulation for assessment of dynamic interaction of pantograph and overhead conductor rail system. Vehicle System Dynamics 2023, 61, 2432-2455.
  9. Wang, Y.; Wang, Y. H.; Wang, P. Rail Magnetic Flux Leakage Detection and Data Analysis Based on Double-Track Flaw Detection Vehicle. Processes 2023, 11, 1024.
  10. Mun, J.; Kim, J.; Do, Y. Design and Implementation of Defect Detection System Based on YOLOv5-CBAM for Lead Tabs in Secondary Battery Manufacturing. Processes 2023, 11, 2751. [CrossRef]
  11. Chen, R. X.; Lv, J. T.; Tian, H. T. Research on a New Method of Track Turnout Identification Based on Improved Yolov5s. Processes 2023, 11, 2123. [CrossRef]
  12. Girshick, R.; Donahue, J.; Darrell, T.; et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014; pp. 580-587.
  13. He, K.; Zhang, X.; Ren, S.; et al. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015, 37, 1904-1916.
  14. Ren, S.; He, K.; Girshick, R.; et al. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016, 39, 1137-1149. [CrossRef]
  15. Zheng, Z. H.; Chen, N. X.; Wu, J. H. EW-YOLOv7: A Lightweight and Effective Detection Model for Small Defects in Electrowetting Display. Processes 2023, 11, 2037. [CrossRef]
  16. Hou, S. Z.; Xu, Y.; Guo, W. Distribution Network Fault-Line Selection Method Based on MICEEMDAN–Recurrence Plot–Yolov5. Processes 2022, 10, 2127.
  17. Kumar, V.; Nirmalan, R.; Sujitha, S. Multi-modal active learning with deep reinforcement learning for target feature extraction in multi-media image processing applications. Multimedia Tools and Applications 2022, 82, 5343-5367.
  18. Lv, N.; Xiao, J.; Qiao, Y. Object Detection Algorithm for Surface Defects Based on a Novel YOLOv3 Model. Processes 2022, 10, 701. [CrossRef]
  19. Umair, M.; Farooq, M.U.; Raza, R.H.; Chen, Q.; Abdulhai, B. Efficient Video-based Vehicle Queue Length Estimation using Computer Vision and Deep Learning for an Urban Traffic Scenario. Processes 2021, 9, 1786. [CrossRef]
  20. Singh, M.; Gehin, A. L.; Ould-Boaumama, B. Robust Detection of Minute Faults in Uncertain Systems Using Energy Activity. Processes 2021, 9, 1801. [CrossRef]
  21. Yao, S.; Kang, Q.; Zhou, M.; Abusorrah, A.; Al-Turki, Y. Intelligent and Data-Driven Fault Detection of Photovoltaic Plants. Processes 2021, 9, 1711. [CrossRef]
  22. Ngan, H.Y.; Pang, G.K.; Yung, N.H. Automated fabric defect detection—A review. Image and Vision Computing 2011, 29, 442-458.
  23. Wang, L.; He, M.; Xu, S.; Yuan, T.; Zhao, T.; Liu, J. Garbage classification and detection based on YOLOv5s network. Packag. Eng. 2021, 42, 50–56.
  24. Saberironaghi, A.; Ren, J.; El-Gindy, M. Defect detection methods for industrial products using deep learning techniques: A review. Algorithms 2023, 16, 95. [CrossRef]
  25. Li, W.; Zhang, H.; Wang, G. Deep learning based online metallic surface defect detection method for wire and arc additive manufacturing. Robotics and Computer-Integrated Manufacturing 2023, 80,102470. [CrossRef]
  26. Singh, S. A.; Desai, K. A. Automated surface defect detection framework using machine vision and convolutional neural networks. Journal of Intelligent Manufacturing 2023, 34, 1995-2011. [CrossRef]
Figure 1. Network structure diagram of YOLO v8.
Figure 3. Verification result chart.
Figure 4. Model processing statistical chart.
Figure 5. Pantograph structural anomaly detection flow chart.
Figure 6. Schematic diagram of the pantograph image extraction process.
Figure 7. Heat map for anomaly detection.
Figure 8. Pantograph structure abnormality judgment.
Figure 9. Identification time-consuming curve.
Table 1. Comparison between two network models.
Network Type Network parameter size /MB Running time / ms
C2FDarkNet-53 2.7 13
ShuffleNetv2 253.4 14
Table 2. Performance of different Backbones during training.
Method Backbone Size FPS mAP
YOLO v2 Darknet-19 512*512 13 25.1%
YOLO v3 Darknet-53 512*512 23 30.6%
YOLO v4 CSPDarkNet-53 512*512 32 44.9%
YOLO v5 CSPDarkNet-53 512*512 38 48.2%
YOLO v6 EfficientRep 512*512 56 49.9%
YOLO v7 E-ELAN and MPConv 512*512 63 53.4%
YOLO v8 C2FDarkNet-53 512*512 79 65.9%
Table 3. Ratio of image types between training set and test set.
Image type Normal image% Low brightness % Uneven brightness %
training set images 76.45 12.43 11.12
Testing set images 65.93 15.65 18.42
Table 4. Image recognition rate statistics (%).
Group Normal brightness Low brightness Uneven brightness
1 96.54 78.23 90.86
2 97.21 76.36 90.88
3 97.13 78.25 90.15
4 96.84 78.31 88.46
5 97.03 75.54 89.47
6 97.14 76.21 90.87
7 96.65 77.54 91.12
8 96.87 77.32 90.10
9 96.76 77.51 90.41
10 97.31 76.85 88.97
11 97.14 77.38 89.56
12 96.85 78.15 89.74
13 96.79 78.16 90.68
14 97.31 79.13 91.13
15 97.26 76.52 90.57
16 97.16 78.54 91.14
17 96.97 79.12 90.16
18 97.23 77.25 88.94
19 96.89 77.36 88.76
20 97.24 78.29 89.13
Table 5. Comparison experiment with other algorithms.
Model P/% R/% mAP@0.5/% FPS
Improved-YOLOv4 82.89 86.63 80.2 26.3
YOLOv5 84.43 88.39 82.7 23.8
SSD 85.96 88.72 83.5 37.5
Faster R-CNN 87.46 84.71 85.3 32.6
This study 95.73 91.36 87.5 41.7
Table 6. Comparison of ablation experimental indicators.
Model YOLO v8 C2FDarkNet-53 LeakyRelu P/% R/% FPS
A 90.89 87.63 34.3
B 91.66 88.60 36
C 92.42 89.71 35.3
D 93.43 89.99 34.3
E 96.64 95.72 38.5
F 93.62 90.75 42.5
G 94.98 93.72 41.1
H 95.73 94.49 45
Table 7. Accuracy rate statistics of pantograph slide structure anomaly detection.
Image type Normal image% Low brightness % Uneven brightness %
Accuracy 96.42 91.23 90.68