2. Related Work
Research on measuring the geometric parameters of fasteners remains limited, whereas fastener detection has been studied extensively. Existing detection methods fall into two categories: those based on two-dimensional (2D) vision and those based on three-dimensional (3D) vision. Detection methods based on 2D vision rapidly capture 2D images of the railway with 2D cameras and then apply target localization, feature extraction, and track image classification[7]. Typically, methods such as template matching[8], edge segmentation[9], and gray-level statistics[10] are used to quickly locate the fastener in the 2D track image. Subsequently, features of the fastener image are extracted based on prior knowledge and the shape of the fastener, such as Histogram of Oriented Gradients (HOG) features[11], local binary features[12], orientation fields[13], bag-of-words features[9], and semantic features[14]. Finally, classifiers categorize the features and identify fasteners with visual defects. These methods are highly dependent on specific scenes and lack robustness to environmental changes. In recent years, Deep Convolutional Neural Networks (DCNNs) have achieved groundbreaking success in image classification. DCNNs are multi-layered end-to-end networks that extract features and classify targets directly. Networks such as AlexNet, R-CNN, Fast R-CNN, Mask R-CNN, ResNet, and the YOLO series have yielded good results in fastener defect detection[2,3,15]. Trained on samples of normal and visually defective fastener images, a DCNN model can output detection results directly and is robust to environmental changes. The 2D vision methods above have been used successfully for rapid detection of visually defective fasteners. However, 2D vision imaging captures only 2D images of the track and therefore can neither detect fastener tightness defects nor measure the geometric parameters of the fastener.
In recent years, 3D vision imaging has gradually been adopted for track inspection, using RGB-D depth cameras or linear structured light cameras to obtain both 2D images and 3D depth images of the scene. By processing 3D depth data or fusing 2D images with 3D depth information, various fastener defects can be detected. Shi[4] used an RGB-D sensor to obtain RGB images and depth (D) images of subway tracks, referred to as RGB-D images. Fusing the depth images with the RGB images overcomes the influence of ambient light on the detection results and thereby improves the accuracy of visual defect detection. Nevertheless, RGB-D cameras have limited depth accuracy, which makes them unsuitable for precise geometric parameter measurement. Structured light imaging can obtain accurate 3D data of the scene. Jin et al.[16] used a 3D Ranger camera to obtain railway intensity (I) and depth (D) images, forming a bimodal I-D image. They located fasteners in the depth image, fused the location parameters with the intensity map to achieve precise fastener localization, and constructed a ResNet network to detect visual defects from the intensity image. Similarly, Zhan et al.[17] used the height of fasteners in the depth image to quickly locate the fastener and constructed a RailNet DCNN model to extract features from and classify the intensity image of fasteners, thereby detecting visual defects. The 3D vision methods above obtain both appearance and depth information of the scene. Fusing the two modalities effectively overcomes environmental influences and improves the accuracy of visual defect detection for fasteners. However, these methods have not been applied to detecting fastener tightness defects or measuring the geometric parameters of fasteners.
To detect fastener tightness defects, Han et al.[18] used a 3D Ranger camera to obtain 2D depth images and 3D point clouds of the track. Using template matching, they quickly located fasteners in the depth image and then applied the watershed algorithm to segment the fastener bolt in the depth map. Structural defects of fasteners were detected from the height difference between the bolt and the rail edge. Gao et al.[19] used a line structured light sensor to obtain 3D point clouds of track fasteners. They segmented the point cloud of the fastener bolt using prior knowledge and region growing and detected severely loose fasteners based on bolt height. Wang et al.[20] used a 3D camera to obtain sparse 3D point cloud data of the track, constructed an accumulated height function, and segmented the rail and fastener based on prior knowledge. They detected fastener tightness by calculating the height difference between the bolt and the non-wearing area outside the rail head. This method can accurately detect severely loose fasteners. Mao et al.[21,22] used structured light sensors to collect railway point clouds, detected visual defects with decision trees according to the bolt height, and judged the tightness of the fastener by calculating the clip gap from the spatial position of the centerline. The 3D measurement methods above can detect excessively loose fasteners, but they do not address the measurement of fastener geometric parameters. To measure the geometric parameters, Cui et al.[23] used a structured light camera to build a 3D measurement system that collects dense point clouds of the track. They measured the rail gauge, bolt height, and height adjustment pads from the dense point clouds. However, this method does not consider fastener defects.
In summary, 2D vision imaging methods can rapidly detect visual defects of fasteners, and 3D vision sensors can simultaneously obtain color and depth images of the scene. Combining multi-modal data further reduces the influence of environment and lighting and improves the accuracy of visual defect detection. 3D imaging can accurately obtain 3D point clouds of the track, and the tightness defects of fasteners can be detected by exploiting the 3D spatial structure of the fasteners. However, most existing research focuses on defect detection, with relatively little work on measuring fastener geometric parameters. Moreover, no existing detection technology can simultaneously detect fastener defects and measure the geometric parameters of fasteners. Therefore, this paper proposes a method for detecting defects and measuring the geometric parameters of fasteners based on a 3D linear laser sensor.
The workflow of our method is illustrated in Figure 4. First, a 3D imaging system is constructed based on a 3D linear laser sensor to generate RGB-P bimodal data, which consists of an RGB depth image and its corresponding point cloud. Next, visually defective fasteners are detected and visually normal fasteners are located in the RGB depth image using the YOLOv8s network. The point clouds of visually normal fasteners are then rapidly segmented based on the RGB-P bimodal data mapping. Finally, the point clouds of fastener components are segmented using the PointNet++ network, and the geometric parameters of the fastener components are measured based on their spatial structure. The main contributions of this research are summarized as follows:
(1) The RGB-P bimodal data, consisting of the RGB depth image and the corresponding point cloud, is constructed based on the 3D linear laser sensor to support both defect detection and geometric parameter measurement of railway fasteners. Visual defects of fasteners are rapidly detected from the RGB depth image, and the geometric parameters are accurately calculated by processing the fastener point cloud based on the spatial structure of the fastener components.
(2) The geometric parameters of railway fasteners, including the specifications of insulated blocks and the thickness of height adjustment pads, are accurately quantified for precise railway fine-tuning. For fastener components with minimum specification differences of 1 mm, the measurement error remains below 0.5 mm.
(3) A 3D measurement system for railway fasteners is constructed based on the 3D linear laser sensor to detect defects and measure the geometric parameters of fasteners. The system achieves a real-time detection and measurement speed of 4.32 km/h, effectively replacing manual inspection of high-speed railway fasteners.
The remainder of this paper is organized as follows. Section Ⅲ details our method, Section Ⅳ verifies its precision through experiments on defect detection and geometric parameter measurement, and Section V summarizes the conclusions.
3.1. The 3D Imaging System Construction Based on 3D Linear Laser Sensor
3.1.1. Data Acquisition with 3D Linear Laser Sensor
Constructing an effective imaging system is crucial for efficient and precise railway fastener inspection. This research uses the Gocator 2450 3D linear laser sensor, produced by LMI, to collect track data. The sensor employs a blue laser with a 405 nm wavelength and offers a depth of field of 550 mm and a field of view that spans from 145 mm to 425 mm. It offers a linear measurement accuracy of 0.01 mm in the z-direction. The measurement accuracy provided by the sensor ranges from 0.019 mm to 0.06 mm and decreases progressively from close range to far range. It operates at a maximum acquisition frequency of 5000 Hz, capturing up to 5000 profiles per second.
The sensor is mounted vertically on the inspection vehicle, allowing it to scan objects within its depth and width field of view. The sensor is configured for external trigger mode with a set trigger distance, and the user can set parameters such as field of view, height range, and horizontal spacing. Each trigger pulse from an external source prompts the sensor to scan one line profile. An encoder attached to the inspection vehicle’s wheels generates pulses that correspond to wheel rotations, and these pulses serve as the trigger input that regulates profile acquisition. As long as the inspection vehicle’s speed stays below the sensor’s maximum acquisition speed, the sensor scans the track profile heights within the field of view autonomously, synchronizing its scanning rate with the vehicle’s speed. The principles and schematic diagram of railway fastener point cloud data collection are illustrated in Figure 5.
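Because each encoder pulse triggers exactly one profile, the trigger distance and the maximum acquisition frequency together bound the vehicle speed. The following is a minimal sketch of that relationship. The function name and the 0.3 mm trigger distance are hypothetical, since the trigger distance actually used is not stated here.

```python
def max_vehicle_speed_kmh(max_frequency_hz: float, trigger_distance_mm: float) -> float:
    """Upper bound on vehicle speed so that no trigger pulse outruns the sensor.

    One profile is acquired per trigger distance, so the sensor can follow
    the vehicle only while speed <= frequency * trigger distance.
    """
    speed_m_per_s = max_frequency_hz * trigger_distance_mm / 1000.0
    return speed_m_per_s * 3.6  # convert m/s to km/h


# Hypothetical example: 5000 Hz sensor, assumed 0.3 mm trigger distance.
limit = max_vehicle_speed_kmh(5000.0, 0.3)
print(f"speed limit: {limit:.2f} km/h")
```

A smaller trigger distance gives a denser point cloud along the rail but lowers the admissible inspection speed proportionally.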
Detecting fastener defects directly from point clouds can be time-consuming. To improve computational efficiency, we construct RGB-P bimodal data, which pairs an RGB depth image with its associated point cloud. Each pixel in the RGB depth image corresponds to a specific point in the point cloud. The RGB depth image conveys the presence and shape of the fastener, whereas the point cloud provides its spatial structure and depth. Visual defects can therefore be detected, and fasteners located, in the RGB depth image, while the geometric parameters of the fastener are derived from the point cloud.
To construct the RGB-P bimodal data, a coordinate system for data acquisition is established according to the characteristics of the 3D laser sensor. The Y-axis aligns with the direction of the rail, the X-axis is perpendicular to the rail, and the Z-axis represents vertical height. The laser sensor’s sampling intervals in the X and Y directions are denoted as dx and dy, with dy also acting as the trigger distance. The scanned area has width W and length L, and the numbers of points along the X and Y axes are M and N, respectively. These values are given by M=W/dx and N=L/dy, where W, L, dx, and dy are user-defined parameters of the 3D laser sensor. N and M correspond to the rows and columns of both the point cloud and the RGB depth image.
If the depth image is represented as a grayscale image, each pixel value corresponds to a height. However, a grayscale image has only 256 gray levels, which limits the amount of information it can carry. In contrast, an RGB image can depict 2^24 colors, and each height segment can be represented by one RGB color, so that more than 16 million height segments can be distinguished. This paper therefore constructs an RGB depth image that corresponds one-to-one with the point cloud, meaning that each pixel in the depth image occupies the same position as its point in the point cloud.
Each profile row contains M points, and a point cloud data set is formed by N such rows. This point cloud data set therefore corresponds to an image of size N×M. A point located at the i-th row and j-th column of the point cloud is represented as (Xi, Yj, Zij), where i=1,2,...,N and j=1,2,...,M. The depth field of the sensor is H = Zmax - Zmin, where Zmax and Zmin are the maximum and minimum measurement heights of the 3D laser sensor, respectively, and Zmin ≤ Zij ≤ Zmax. The height H is divided into Num parts at an equal interval Δh, where Δh = H/Num and Num is set by users. A color table is then established, and Num+1 different colors are selected from it. Different height ranges are thus represented by different colors, and the relationship between height and color can be expressed as follows:

$$ n_{ij} = \left[ \frac{Z_{ij} - Z_{\min}}{\Delta h} \right] \tag{1} $$

where [x] denotes the integer part of x and n_ij is the index into the color table.
Due to the impact of the 3D line laser imaging technique and track conditions, local occlusions occur, creating blind spots when the 3D laser scans the railway track. As a result, the height value of the blind area becomes invalid. Thus, it is essential to verify the validity of the height value for each point when constructing the depth image. In summary, the algorithm for constructing RGB depth image is outlined as follows:
Step 1: Obtain the sensor parameters, including dx, dy, L, and W.
Step 2: Construct an empty RGB image I with a size of N×M.
Step 3: While the 3D laser sensor scans the railway, check whether the Zij value of each point Pij(Xi, Yj, Zij) is valid. If Zij is valid, map the point Pij to the pixel of image I at coordinates (i, j) and determine its color by Equation (1).
Step 4: If Zij of point Pij(Xi, Yj, Zij) is invalid, assign the last color in the color table to the pixel at coordinates (i, j).
Step 5: Save the RGB depth image and the point cloud under the same filename, and output the RGB-P bimodal data of the scanned scene, as shown in Figure 6.
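The five steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the implementation used in this work: the color table layout, the use of NaN to mark invalid heights, the clamping of Z = Zmax into the top bin, and the function names `build_color_table` and `build_rgb_depth_image` are all hypothetical.

```python
import numpy as np


def build_color_table(num: int) -> np.ndarray:
    """Return num + 1 RGB colors: num for height bins, the last one for invalid points.

    Hypothetical layout: bin indices are spread over the 24-bit RGB range,
    which yields distinct colors as long as num <= 2**24 - 2.
    """
    step = max((2**24 - 2) // num, 1)
    codes = 1 + np.arange(num, dtype=np.int64) * step  # never 0, so never black
    table = np.zeros((num + 1, 3), dtype=np.uint8)     # last row stays black
    table[:num, 0] = ((codes >> 16) & 0xFF).astype(np.uint8)
    table[:num, 1] = ((codes >> 8) & 0xFF).astype(np.uint8)
    table[:num, 2] = (codes & 0xFF).astype(np.uint8)
    return table


def build_rgb_depth_image(z, z_min: float, z_max: float, num: int) -> np.ndarray:
    """Map an (N, M) height grid to an (N, M, 3) RGB depth image.

    Heights that are NaN or outside [z_min, z_max] are treated as invalid
    (assumption) and receive the last color of the table.
    """
    z = np.asarray(z, dtype=float)
    table = build_color_table(num)
    dh = (z_max - z_min) / num                      # equal height interval
    valid = np.isfinite(z)
    valid[valid] = (z[valid] >= z_min) & (z[valid] <= z_max)
    index = np.full(z.shape, num, dtype=np.int64)   # default: invalid color
    bins = ((z[valid] - z_min) / dh).astype(np.int64)
    index[valid] = np.minimum(bins, num - 1)        # clamp z == z_max into top bin
    return table[index]


heights = np.array([[0.0, 5.0], [np.nan, 10.0]])
image = build_rgb_depth_image(heights, z_min=0.0, z_max=10.0, num=10)
print(image.shape, image[1, 0])
```

Because pixel (i, j) is filled from point (i, j), the image and the point cloud keep the same row and column layout, which is what later allows a detection box in the image to select points in the cloud.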
3.1.3. The Mapping Relationship of RGB-P Bimodal Data
In this research, the collected railway profiles are consolidated into a point cloud and recorded in a text file. Each line of this file contains three numerical values that represent the X, Y, and Z coordinates of a single point. Given that the size of the RGB depth image is M×N, the text file contains M×N lines, each representing a point. The point cloud is then imported from the text file into a container of point structures, referred to as Vect. The number of points in Vect equals the pixel count of the RGB image, with each element storing the x, y, and z coordinates of a point. Given image I and pixel coordinates (i, j), and with the profiles stored row by row, the mapping between image I and the point Pid(Xi, Yj, Zij) in Vect is described by Equation (2):

$$ id = (i - 1) \times M + j \tag{2} $$

Any pixel in the image can be matched to its corresponding point in the point cloud using Equation (2).
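The pixel-to-point lookup can be illustrated with a short sketch. It assumes that the points are stored row by row (one scanned profile after another) and uses 0-based indices, as is conventional in Python. The helper names `pixel_to_index` and `index_to_pixel` are hypothetical.

```python
def pixel_to_index(i: int, j: int, n_cols: int) -> int:
    """Index in the flat point container of the point behind pixel (row i, column j).

    Assumes row-major storage and 0-based indices.
    """
    if not 0 <= j < n_cols:
        raise ValueError("column index out of range")
    return i * n_cols + j


def index_to_pixel(idx: int, n_cols: int) -> tuple[int, int]:
    """Inverse mapping: flat point index back to (row, column)."""
    return divmod(idx, n_cols)


# Example with M = 4 columns: pixel (1, 2) is the 7th stored point (index 6).
print(pixel_to_index(1, 2, 4), index_to_pixel(6, 4))
```

With this layout no search or nearest-neighbor query is needed: locating a point costs a single multiplication and addition.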
3.2. Visual Defect Detection and Fastener Region Location
Visual defects of fasteners appear as conspicuous texture deviations from the normal state, while structural defects are mainly reflected in height differences. The linear laser sensor images by triangulation, so the resulting RGB depth image is unaffected by illumination conditions. In addition, the "ω-shaped" fastener has a relatively simple geometry. For normal fasteners of the same type, the textures of the RGB depth images are nearly identical, whereas defective fasteners differ substantially in shape from normal ones. As a result, conventional deep-learning-based object detection algorithms can efficiently identify fastener defects in RGB depth images. Common deep-learning object detection methods include the R-CNN series[24], the SSD network[25], and the YOLO series[26], among others. These networks have been applied widely across diverse fields, and each has its own advantages. In this study, the YOLOv8 network is employed to detect fastener defects in RGB depth images[27]. Detection with rectangular bounding boxes enables rapid localization of fasteners. Using the positions of the bounding boxes together with the mapping relationship between images and point clouds, the point cloud of the visually normal fastener region can be segmented quickly.
Because visual defects of fasteners vary widely, while normal fasteners and those with structural defects look alike, this paper labels all visual defects as "defect". Fasteners that appear visually normal, including those with structural defects, are labeled "normal". To support both the detection of defective fasteners and the precise segmentation of fastener regions, the rectangular labels for normal samples are drawn to closely match the minimum bounding rectangle around each fastener.
We compiled a training dataset for the YOLOv8s model using RGB depth images of "ω-shaped" fasteners. The trained model generates a prediction box for each detection result, represented by (x, y, w, h), where (x, y) is the top-left corner of the rectangle and w and h are the width and height of the box, respectively. We use the box (x, y, w, h) to locate the visually normal fastener area in the railway image.
The point clouds of the visually normal fastener area can be rapidly segmented using the RGB-P mapping relationship in Equation (2). The detection results of the YOLOv8s model are shown in Figure 7, and the flowchart and results of fastener point cloud segmentation are presented in Figure 8.
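As an illustration of how a detection box selects the fastener points, the sketch below crops a row-major point array with a box (x, y, w, h). It assumes that x indexes image columns and y indexes image rows, that the point array has one row per pixel, and that invalid points carry NaN heights. The function name `crop_points_by_box` is hypothetical.

```python
import numpy as np


def crop_points_by_box(points: np.ndarray, n_rows: int, n_cols: int,
                       box: tuple[int, int, int, int]) -> np.ndarray:
    """Return the valid points that lie behind the pixels of a detection box.

    points: (n_rows * n_cols, 3) array stored row by row (assumption).
    box:    (x, y, w, h) with (x, y) the top-left pixel of the rectangle.
    """
    x, y, w, h = box
    grid = points.reshape(n_rows, n_cols, 3)
    # Clip the box to the image so partially visible fasteners do not fail.
    r0, r1 = max(y, 0), min(y + h, n_rows)
    c0, c1 = max(x, 0), min(x + w, n_cols)
    region = grid[r0:r1, c0:c1].reshape(-1, 3)
    return region[np.isfinite(region[:, 2])]  # drop blind-spot points


# Toy example: a 3 x 4 grid of points, cropped with a 2 x 2 box at (1, 1).
cloud = np.arange(36, dtype=float).reshape(12, 3)
fastener = crop_points_by_box(cloud, n_rows=3, n_cols=4, box=(1, 1, 2, 2))
print(fastener.shape)
```

The crop is a pure array slice, so its cost is independent of the geometric complexity of the scene.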
Given that geometric parameters cannot be directly calculated from 2D images, point cloud analysis is employed to determine the geometric parameters of fasteners. Precise measurement of fastener geometric parameters requires segmentation of point cloud data into individual fastener components. The geometric parameters of these components can then be computed by analyzing their spatial interrelationships and relative positions.
3.3.1. Fastener Component Point Cloud Segmentation Using PointNet++ Network
Point cloud segmentation methods can be broadly divided into traditional approaches (region growing, clustering, RANSAC plane segmentation) and deep learning-based methods[28]. Region growing algorithms start from a seed point and iteratively incorporate adjacent points that meet predefined criteria until convergence[29]. However, this approach is highly sensitive to initial parameter selection, particularly seed point placement, which can significantly affect the segmentation result. Clustering algorithms (e.g., K-means and DBSCAN) are widely used, but their effectiveness is constrained by parameter sensitivity, including the specified number of clusters and the density threshold[30]. RANSAC-based plane segmentation uses random sample consensus to identify optimal plane models and delineate planar regions[31]. While this approach excels in environments dominated by planar surfaces (e.g., indoor scenes), it is inadequate for complex geometric structures. In contrast, deep learning approaches use neural networks to perform point cloud segmentation, with architectures such as PointNet and PointNet++ handling complex geometric structures well. PointNet aggregates per-point features into a single global feature and therefore captures little local structure. PointNet++ overcomes this limitation through its hierarchical feature learning framework, which captures local geometric structures at multiple scales and makes it better suited to segmenting the complex point clouds of fasteners. For these reasons, PointNet++ was adopted as the primary method for fastener component point cloud segmentation in this study.
PointNet++, introduced by Qi et al.[32], represents a significant advance in 3D point cloud processing. The framework has proven versatile across applications including point cloud classification, semantic segmentation, and 3D object recognition. The architecture, illustrated in Figure 9, takes point cloud data as input and consists of a hierarchical feature learning encoder followed by a task-specific prediction layer for either classification or segmentation. Hierarchical feature learning is implemented through set abstraction operations that process local point cloud regions at multiple scales. Each set abstraction comprises two key operations: sampling and grouping, and local feature learning using PointNet layers. The sampling and grouping process uses farthest point sampling to identify representative key points. To address point cloud sparsity and irregularity, multi-scale grouping or multi-resolution grouping is applied, which constructs spherical neighborhoods with different radii around each sampled point to handle varying point densities. In the feature extraction phase, PointNet layers process the sampled groups to extract local-to-global features. This hierarchical aggregation captures both point-wise characteristics and contextual information within local neighborhoods, enabling an effective representation of geometric relationships. For classification tasks, the PointNet++ network typically employs categorical cross-entropy as the loss function, while for segmentation tasks it uses point-wise cross-entropy, defined as:

$$ L = -\frac{1}{N_s} \sum_{i=1}^{N_s} \sum_{c=1}^{C} y_{ic} \log\left(p_{ic}\right) $$

where $N_s$ is the number of samples, C is the number of categories, $y_{ic}$ indicates whether category c is the true label of sample i (0 or 1), and $p_{ic}$ is the probability of category c output by the network.
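A small numerical sketch may help make the point-wise cross-entropy concrete. It is written in plain Python for illustration only and is not the training code used in this work. Because the true-label indicator is 1 for exactly one category per point, the double sum reduces to the log-probability of the true category. The function name and the clipping constant `eps` are assumptions.

```python
import math


def pointwise_cross_entropy(probs: list[list[float]], labels: list[int],
                            eps: float = 1e-12) -> float:
    """Mean negative log-probability of the true category over all points.

    probs[i][c] is the predicted probability of category c for point i;
    labels[i] is the index of the true category of point i.
    """
    if len(probs) != len(labels):
        raise ValueError("one label is required per point")
    total = 0.0
    for point_probs, label in zip(probs, labels):
        total -= math.log(max(point_probs[label], eps))  # eps avoids log(0)
    return total / len(labels)


# Two points, two categories (e.g. "bolt" vs "clip", hypothetical labels).
loss = pointwise_cross_entropy([[0.5, 0.5], [0.25, 0.75]], [0, 1])
print(round(loss, 4))
```

A perfectly confident and correct prediction gives a loss of zero, and the loss grows as probability mass moves away from the true category.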
The network was optimized using the Adam optimizer, which employs adaptive moment estimation to dynamically adjust the learning rate, facilitating faster convergence and better model performance. The network training hyper-parameters are detailed in Table 1. The trained PointNet++ model enables precise segmentation of fastener components, yielding an individual point cloud subset for each component. These segmented point clouds serve as the foundation for subsequent geometric parameter extraction and analysis.
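For readers unfamiliar with the sampling step inside the PointNet++ set abstraction, farthest point sampling can be sketched as follows. This is a generic NumPy illustration of the standard algorithm, not code taken from the network used in this work. The function name and the fixed starting index are assumptions.

```python
import numpy as np


def farthest_point_sampling(points: np.ndarray, k: int, start: int = 0) -> np.ndarray:
    """Pick k point indices so that each new point is farthest from those already chosen.

    points: (n, d) array of coordinates; start: index of the first sample.
    """
    n = points.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of points")
    selected = np.empty(k, dtype=np.int64)
    # Squared distance from every point to its nearest selected point so far.
    nearest = np.full(n, np.inf)
    current = start
    for t in range(k):
        selected[t] = current
        d = np.sum((points - points[current]) ** 2, axis=1)
        nearest = np.minimum(nearest, d)
        current = int(np.argmax(nearest))
    return selected


# Four points on a line: the sampler jumps to the far end first.
line = np.array([[0.0], [1.0], [2.0], [10.0]])
print(farthest_point_sampling(line, 3))
```

Compared with random sampling, this choice covers the fastener surface more evenly, which matters for small components such as the insulated block.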
3.3.2. Geometric Parameters Calculation of Fastener
The "ω-shaped" fastener is primarily composed of a bolt, a metal clip, an insulated block, a height adjustment pad under the rail, a height adjustment pad under the iron plate, and various other components. The bolt presses the metal clip to secure the rail on the sleeper, so the height of the bolt’s upper surface reflects the tightness of the fastener. The insulated block is used for fine adjustment of the rail gauge, while the height adjustment pads under the rail and under the iron plate are used for adjusting the rail height. These two types of pads come in different specifications.
Track maintenance operations frequently require precise adjustments of critical parameters, particularly vertical alignment and gauge spacing between parallel rails. These precise geometric adjustments are fundamental to ensuring safe and comfortable high-speed rail operations, contributing significantly to China’s exemplary high-speed railway performance.
During operational service, track geometry gradually deteriorates due to dynamic loading and environmental factors, necessitating periodic adjustments to maintain optimal gauge and vertical alignment parameters. Long-term operation leads to variations in fastening system component geometries. Accurate measurement of these parameters is crucial for maintenance planning and component selection during track adjustment procedures. Traditional manual measurement methods are both time-intensive and prone to human error.
This study demonstrates the point cloud-based geometric parameter measurement method using the WJ-8 fastening system as a case study. The proposed method is generalizable to other fastener systems, such as the Vossloh-300 series. Figure 10 presents the structural schematic, in-situ photograph, and point cloud representation of the WJ-8 fastening system.
(1) The specification of the insulated block. Insulated blocks are manufactured in five standard specifications (designated No. 7 through No. 11), with nominal thicknesses ranging from 7 mm to 11 mm in 1 mm increments. Track gauge is adjusted by replacing the insulated block with one of the appropriate thickness. Figure 11 illustrates the installation configuration and structural details of the insulated blocks, where d denotes the block thickness. Because the inner surface of the insulated block is in direct contact with the rail edge, its thickness can be computed as the difference between the x-coordinates of the outer edge of the insulated block and the adjacent rail edge:

$$ d = \left| x_b - x_r \right| \tag{3} $$

In Equation (3), $x_b$ and $x_r$ are the x-coordinates of the outer edge of the insulated block and the adjacent edge of the rail, respectively. To calculate $x_b$ and $x_r$ accurately, lines are fitted to the point clouds of the outer side of the insulated block and of the rail edge obtained from PointNet++. The x-coordinates of these lines give the values of $x_b$ and $x_r$.
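The thickness computation can be sketched as a least-squares line fit followed by a coordinate difference. The sketch assumes that both edges run roughly along the rail (Y) direction, so each edge is modelled as x = a·y + b and evaluated at a common reference y. Snapping the result to the nearest nominal specification is an added illustration. All function names are hypothetical.

```python
import numpy as np


def edge_x_at(edge_points: np.ndarray, y_ref: float) -> float:
    """Fit x = a*y + b to edge points (columns: x, y, z) and evaluate at y_ref."""
    a, b = np.polyfit(edge_points[:, 1], edge_points[:, 0], 1)
    return float(a * y_ref + b)


def insulated_block_thickness(block_edge: np.ndarray, rail_edge: np.ndarray) -> float:
    """Distance along X between the fitted block outer edge and the rail edge."""
    # Evaluate both lines at the same Y so a slight skew does not bias the result.
    y_ref = 0.5 * (block_edge[:, 1].mean() + rail_edge[:, 1].mean())
    return abs(edge_x_at(block_edge, y_ref) - edge_x_at(rail_edge, y_ref))


def nearest_specification(thickness_mm: float, specs=(7, 8, 9, 10, 11)) -> int:
    """Nominal specification (in mm) closest to the measured thickness."""
    return min(specs, key=lambda s: abs(s - thickness_mm))


# Synthetic edges: rail edge at x = 0 mm, block outer edge at x = 9.1 mm.
y = np.arange(5.0)
rail = np.column_stack([np.zeros(5), y, np.zeros(5)])
block = np.column_stack([np.full(5, 9.1), y, np.zeros(5)])
d = insulated_block_thickness(block, rail)
print(round(d, 2), nearest_specification(d))
```

Since adjacent specifications differ by 1 mm, a measurement error below 0.5 mm is what makes the nearest-specification assignment unambiguous.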
(2) The height adjustment pads. The WJ-8 fastener is suitable for ballastless tracks and comes with two types of height adjustment pad: one under the rail and one under the iron plate. The height adjustment pad under the rail is installed beneath the rubber pad and above the iron plate, and is available in four thickness specifications: 1 mm, 2 mm, 5 mm, and 8 mm. The height adjustment pad under the iron plate is installed beneath the iron plate and above the insulation buffer pad, and is available in two thickness specifications: 10 mm and 20 mm. According to the high-speed railway maintenance standards, when the required rail elevation is less than or equal to 10 mm, only the height adjustment pad under the rail needs to be installed; when it exceeds 10 mm, the height adjustment pad under the iron plate is installed as well. A single fastener can accommodate a maximum of two height adjustment pads under the rail, with a total thickness not exceeding 10 mm. Together, the two types of height adjustment pad can raise the rail by a maximum of 30 mm.
A 3D imaging system composed of 3D linear lasers can scan part of the upper surface of the rubber pad. Therefore, the thickness of the height adjustment pad under the rail can be measured directly from the height of the upper surface of the rubber pad, the height of the upper surface of the iron plate, and the thickness of the rubber pad.
The thickness of the height adjustment pad under the rail can be calculated as

$$ t_1 = h_6 - h_4 - t_r \tag{4} $$

In Equation (4), $h_6$ is the height of the upper surface of the rubber pad, marked as ⑥; $h_4$ is the height of the upper surface of the iron plate, marked as ④; and $t_r$ is the thickness of the rubber pad (typically 10 mm for the WJ-8 fastener).
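The under-rail pad measurement can be illustrated with a short sketch: subtract the iron plate height and the rubber pad thickness from the rubber pad surface height, then match the result to the closest pad combination permitted by the rule of at most two pads totalling no more than 10 mm. The matching step and all function names are hypothetical additions for illustration. The input heights in the example are made up.

```python
from itertools import combinations_with_replacement


def rail_pad_thickness(h_rubber_top: float, h_plate_top: float,
                       t_rubber: float = 10.0) -> float:
    """Thickness (mm) of the pad stack between the iron plate and the rubber pad."""
    return h_rubber_top - h_plate_top - t_rubber


def feasible_pad_sets(specs=(1, 2, 5, 8), max_pads: int = 2, max_total: float = 10.0):
    """All pad combinations allowed by the stated limits, including 'no pad'."""
    sets = [()]
    for k in range(1, max_pads + 1):
        sets += [c for c in combinations_with_replacement(specs, k)
                 if sum(c) <= max_total]
    return sets


def match_pad_set(measured_mm: float):
    """Feasible combination whose total thickness is closest to the measurement."""
    return min(feasible_pad_sets(), key=lambda c: abs(sum(c) - measured_mm))


# Made-up heights: rubber pad surface at 26.2 mm, iron plate surface at 10.0 mm.
t = rail_pad_thickness(26.2, 10.0)
print(round(t, 1), match_pad_set(t))
```

Some totals can be reached in more than one way (for example 10 mm as 2 + 8 or 5 + 5), so the sketch can identify the total thickness but not always the individual pads.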
The thickness of the height adjustment pad under the iron plate can be calculated as

$$ t_2 = h_p - h_8 - t_b - \Delta h_p \tag{5} $$

In Equation (5), $h_p$ represents the height of the upper surface of the iron plate limit pillar, marked with the red rectangle; $h_8$ represents the height of the upper surface of the sleeper support, marked as ⑧; $t_b$ represents the thickness of the buffer pad beneath the iron plate; and $\Delta h_p$ represents the height difference between the upper surface of the iron plate limit pillar and the middle-upper surface of the iron plate (typically 38.2 mm for the WJ-8 fastener).
(3) Bolt height analysis for structural defect detection. Fastening system deficiencies, including both under- and over-torqued conditions, cannot be reliably detected through conventional 2D image analysis. Fastener tightness can be compromised through several mechanisms, including cyclic loading from train-induced vibrations and environmental factors such as moisture infiltration and subsequent freeze-thaw cycles in bolt holes, which generate substantial longitudinal forces and accelerate bolt loosening. Insufficient bolt torque reduces the clamping force, which can weaken rail restraint and threaten operational safety. Conversely, excessive torque can over-stress the clips and lead to fatigue failure under dynamic loading. Regular monitoring of bolt torque conditions is therefore critical for maintaining system integrity. Deviations in bolt height relative to the base-plate surface serve as a quantitative indicator of improper torque conditions, enabling objective detection of both under- and over-torqued fasteners. The bolt height parameter ($h_b$) can be determined by

$$ h_b = h_1 - h_4 $$

where $h_1$ represents the height of the upper surface of the bolt, marked as ①, and $h_4$ is the height of the upper surface of the iron plate, marked as ④.
The value of $h_b$ reflects the looseness of the fastener. If $h_b$ is higher than an upper threshold, the fastener is considered loose; conversely, if $h_b$ is lower than a lower threshold, the fastener is considered over-tight.
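The two-threshold decision rule can be sketched as follows. The use of medians to summarise the segmented surfaces, the choice of reference surface, and the threshold values in the example are all assumptions made for illustration. Real thresholds would have to come from the maintenance standard or from calibration data.

```python
from statistics import median


def bolt_height(bolt_top_z: list[float], reference_z: list[float]) -> float:
    """Height of the bolt's upper surface above a reference surface (mm).

    Medians are used so that a few outlier points do not shift the result.
    """
    return median(bolt_top_z) - median(reference_z)


def classify_tightness(height_mm: float, loose_above: float,
                       overtight_below: float) -> str:
    """Label a fastener from its bolt height using two thresholds."""
    if overtight_below >= loose_above:
        raise ValueError("the over-tight threshold must be below the loose threshold")
    if height_mm > loose_above:
        return "loose"
    if height_mm < overtight_below:
        return "over-tight"
    return "normal"


# Hypothetical numbers: thresholds of 19 mm and 21 mm around a nominal 20 mm.
h = bolt_height([30.1, 30.0, 29.9], [10.0, 10.0])
print(h, classify_tightness(h, loose_above=21.0, overtight_below=19.0))
```

Keeping the thresholds as explicit parameters makes it straightforward to tune them per fastener type without touching the measurement code.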