2.2.2. Image Preprocessing
To standardize and normalize the representation of B-scan images, the original images must first be preprocessed. The preprocessing procedure is shown in Figure 6.
First, to reduce information loss during the grayscale conversion of color B-scan images, this work proposes an information-lossless full-mapping grayscale conversion technique. The grayscale value is calculated from the RGB values as given in Equation (5):
Second, experiments showed that median filtering gave the greatest noise reduction in the grayscale images, while the opening operation gave the strongest enhancement of defect edge features. This study therefore employs median filtering for smoothing and denoising, and the opening operation for morphological processing.
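As an illustration of these two steps, the following is a minimal NumPy sketch of a median filter followed by a grayscale opening (erosion, then dilation). The 3 × 3 window, the edge-replicating padding, and the function names are assumptions made for this example; the window size and structuring element actually used are not stated here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(img, k):
    """Return every k x k neighbourhood of img (edge-replicating padding)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    return sliding_window_view(padded, (k, k))  # shape: (H, W, k, k)


def median_filter(img, k=3):
    """Replace each pixel by the median of its k x k neighbourhood."""
    return np.median(_windows(img, k), axis=(2, 3)).astype(img.dtype)


def opening(img, k=3):
    """Grayscale opening: erosion (local min) followed by dilation (local max)."""
    eroded = _windows(img, k).min(axis=(2, 3))
    return _windows(eroded, k).max(axis=(2, 3))
```

Under these assumptions, a grayscale B-scan array `gray` would be processed as `opening(median_filter(gray))`: isolated noise pixels are removed by the median step, and bright structures smaller than the window are suppressed by the opening.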
ROI refers to the "Region of Interest", the key region of an image for the task at hand; it typically contains the critical information to be identified, analyzed, and processed. Examination of the images after grayscale conversion, smoothing/denoising, and morphological processing showed that the row grayscale mean changes gradually from top to bottom, with regional variations. The row grayscale mean curves showed that, for each B-scan image, the top of the surface echo and the bottom of the bottom echo correspond to the rows where the row grayscale mean first rises considerably and last declines sharply, respectively. By choosing suitable thresholds based on these features, the ROI of a B-scan image can be extracted. The row grayscale mean curve of an image is shown in Figure 7.
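The ROI extraction described above can be sketched as follows. This is a minimal example under stated assumptions: the "considerable rise" and "sharp decline" are detected as row-to-row differences of the row grayscale mean that exceed two thresholds, `rise_thresh` and `fall_thresh`, which are hypothetical tuning parameters (in gray levels per row) and not values from this study.

```python
import numpy as np


def extract_roi(gray, rise_thresh, fall_thresh):
    """Crop the rows between the first sharp rise and the last sharp fall
    of the row grayscale mean curve. Returns (roi, (top, bottom))."""
    row_mean = gray.mean(axis=1)            # one mean value per image row
    diff = np.diff(row_mean)                # change from row r to row r + 1
    rises = np.flatnonzero(diff >= rise_thresh)
    falls = np.flatnonzero(diff <= -fall_thresh)
    full = (0, gray.shape[0])
    if rises.size == 0 or falls.size == 0:
        return gray, full                   # no clear echoes: keep the full image
    top = int(rises[0]) + 1                 # first row after the first rise
    bottom = int(falls[-1]) + 1             # exclusive end: row after the last bright row
    if bottom <= top:
        return gray, full
    return gray[top:bottom], (top, bottom)
```

Falling back to the full image when no rise or fall is found is a design choice of this sketch, so that an image is never discarded at the cropping stage.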
2.2.3. Phase Space Reconstruction Optimal Delay Time and Embedding Dimension Selection
According to the embedding theorem described in Section 1.2, for an infinitely long, noise-free time series the delay time τ may be chosen freely. In practice, however, numerous studies have shown that the phase space representation depends strongly on the choice of τ. The guiding principle is to make the reconstructed vectors as mutually independent as possible.
This work applies the mutual information method to estimate the optimal delay time τ. The mutual information coefficient (MI) measures the strength of the correlation between state vectors after phase space reconstruction. From the preceding discussion of recurrence plot theory, a lower mutual information coefficient implies weaker correlation between state vectors, which better reveals abrupt changes in system patterns.
The average information content of the system over all of its variables taken together is defined as the system's information entropy, called the joint entropy. It is calculated as given in Equation (6):
The spatial mutual information coefficient is then calculated as given in Equation (7):
The delay time at the first local minimum of the mutual information coefficient is then chosen as the optimal delay time for phase space reconstruction of the image entropy sequence in this segment.
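The selection rule can be sketched as follows. The example uses the standard histogram estimate of mutual information, I(X;Y) = H(X) + H(Y) - H(X,Y), between the sequence and its delayed copy; the number of bins (`bins=16`), the search range (`max_delay`), and the function names are assumptions of this sketch and are not taken from this study.

```python
import numpy as np


def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint probabilities
    px = pxy.sum(axis=1)                    # marginal distribution of x
    py = pxy.sum(axis=0)                    # marginal distribution of y

    def entropy(p):
        p = p[p > 0]                        # 0 * log 0 is taken as 0
        return -np.sum(p * np.log2(p))

    return entropy(px) + entropy(py) - entropy(pxy.ravel())


def first_local_minimum_delay(series, max_delay=50, bins=16):
    """Smallest delay at which the MI curve has a local minimum,
    or None if no local minimum occurs up to max_delay."""
    series = np.asarray(series, dtype=float)
    mi = [mutual_information(series[:-tau], series[tau:], bins)
          for tau in range(1, max_delay + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] <= mi[k + 1]:
            return k + 1                    # mi[k] belongs to delay k + 1
    return None
```

Histogram estimates fluctuate for short sequences, so in practice the MI curve may need to be inspected (or lightly smoothed) before the first local minimum is accepted.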
The purpose of selecting an appropriate embedding dimension m is to unfold the geometric structure of the attractor as fully as possible, so that the system's attractor is characterized without losing the information carried by the original image entropy sequence and the characteristics of the system under analysis are revealed. If the embedding dimension m is too small, the attractor overlaps itself, so that phase space points from different trajectories appear in the same neighborhood. This leads to a substantial gap between the reconstructed attractor and the system's true attractor, reducing the accuracy of subsequent quantitative evaluations. Conversely, if m is too large, the attractor's geometric structure is completely unfolded, but noise is amplified and the computational complexity increases.
This study adopts the improved False Nearest Neighbors (FNN) approach known as Cao's algorithm [14], which defines a distance coefficient based on the notion of false nearest neighbors:
Define the metric (the average rate of change of the Euclidean distances between phase points induced by increasing the dimension):
The rate-of-change ratio of the metric value is used as the criterion for the minimal embedding dimension:
Finally, examine the trend of this ratio as m varies, and select the embedding dimension at which the ratio starts to converge to 1 as the minimal embedding dimension for phase space reconstruction of this image entropy sequence.
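The procedure can be illustrated with the following minimal sketch of Cao's method. The names `delay_embed`, `cao_E`, and `cao_E1` are hypothetical, and the symbols E(m) and E1(m) = E(m+1)/E(m) follow the notation commonly used for Cao's method, which may differ from the notation of this study. The Euclidean distance is used, as stated in the text.

```python
import numpy as np


def delay_embed(series, m, tau):
    """Rows are delay vectors [x(i), x(i + tau), ..., x(i + (m - 1) tau)]."""
    n = len(series) - (m - 1) * tau
    return np.column_stack([series[j * tau: j * tau + n] for j in range(m)])


def cao_E(series, m, tau):
    """Mean ratio of nearest-neighbour distances in dimension m + 1 vs. m."""
    y_next = delay_embed(series, m + 1, tau)
    n = len(y_next)
    y = delay_embed(series, m, tau)[:n]     # same points, one dimension lower
    ratios = []
    for i in range(n):
        d = np.linalg.norm(y - y[i], axis=1)
        d[i] = np.inf                       # exclude the point itself
        d[d == 0] = np.inf                  # skip exact duplicates (zero distance)
        j = int(np.argmin(d))               # nearest neighbour in dimension m
        if not np.isfinite(d[j]):
            continue
        ratios.append(np.linalg.norm(y_next[i] - y_next[j]) / d[j])
    return float(np.mean(ratios))


def cao_E1(series, tau, max_m=8):
    """E1(m) = E(m + 1) / E(m) for m = 1 .. max_m."""
    series = np.asarray(series, dtype=float)
    E = [cao_E(series, m, tau) for m in range(1, max_m + 2)]
    return np.array([E[k + 1] / E[k] for k in range(max_m)])
```

The minimal embedding dimension is then read from the returned curve as the point where E1(m) stops changing and stays close to 1; how close counts as "converged" is a tolerance that has to be chosen by the user.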
2.2.4. Binarized Recurrence Plot Threshold Selection
According to the definition of the binarized recurrence plot, it has the following characteristics:
The main diagonal entries of the square matrix are always 1.
The square matrix is symmetric about the main diagonal (line of identity, LOI).
For the binarized recurrence plot of a given signal sequence, the choice of threshold is critical: it determines the final appearance of the plot and, in turn, affects the analysis and identification of the system's properties. When the threshold is too small, the recurrence plot contains too few recurrence points to describe the system's features properly. Conversely, when the threshold is too large, a large number of recurrence points appear as continuous block-like black regions in the recurrence plot, masking valuable recurrence feature information. Binarized recurrence plots under various thresholds are shown in Figure 8.
As the locally enlarged details of the binarized recurrence plots in Figure 8 show, when the threshold is set to a recurrence point percentage of 5%, the binarized recurrence plot reveals distinct local diagonal features and white crosshair features. At this setting, the pattern changes in the B-scan image sequence are portrayed most clearly, and it can be judged whether the binarized recurrence plot of the sample sequence conforms to the macroscopic structural properties of recurrence plots in abrupt-change modes. Here, it is concluded that the system state of this image entropy sequence undergoes an abrupt change.
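A minimal sketch of this threshold choice is given below. It assumes that "a recurrence point percentage of 5%" means the distance threshold is set at the 5% quantile of all pairwise distances between delay vectors, so that about 5% of the off-diagonal entries are recurrence points; this reading, the Euclidean distance, and the function name are assumptions of the example.

```python
import numpy as np


def binarized_recurrence_plot(series, m, tau, recurrence_rate=0.05):
    """Return (R, eps): R[i, j] = 1 where the distance between delay vectors
    i and j is at most eps, with eps chosen so that about `recurrence_rate`
    of the off-diagonal pairs are recurrence points."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (m - 1) * tau         # number of delay vectors
    vectors = np.column_stack([series[j * tau: j * tau + n] for j in range(m)])
    diff = vectors[:, None, :] - vectors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2)) # Euclidean distance matrix
    off_diag = dist[~np.eye(n, dtype=bool)]
    eps = np.quantile(off_diag, recurrence_rate)
    return (dist <= eps).astype(np.uint8), eps
```

By construction the returned matrix has ones on the main diagonal and is symmetric about it, matching the two characteristics listed above. Fixing the recurrence rate rather than the absolute distance makes plots of sequences with different amplitude ranges comparable.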
2.2.5. Faulty Image Screening Filter Design
Analysis of the features of the binarized recurrence plot in relation to defects shows that local diagonal features and white crosshair lines typically correspond to abrupt changes in image entropy induced by defects. The presence and distribution of defects alter the distribution of recurrence points in the recurrence plot. The more widely the white crosshair lines are spread, the more significant the change in the image entropy sequence at the corresponding time relative to the image entropy sequence at consecutive times.
Based on the foregoing analysis, a filter for the image entropy sequence can be constructed by incorporating the distribution density function of recurrence points along the time series in the binarized recurrence plot. This filter enhances the image entropy intervals suspected of containing defect features while attenuating those that are not, to assist the screening of B-scan images suspected of containing defects.
Given the number of sample points in the original image entropy sequence, the size of the square recurrence plot is determined as follows:
Each row or column of the recurrence plot indicates whether the reconstructed phase-space distance between the corresponding instant and each of the other instants passes the threshold. To a certain degree, it therefore also reflects the likelihood of a pattern mutation at that instant relative to other times. Combined with the diagonal symmetry of the recurrence plot, the distribution density of recurrence points in each row or column can thus be used to quantify the probability density of pattern mutations.
The vector formed from these row (or column) densities is the filter to be solved.
To standardize the numerical range of the filter, each of its elements is normalized.
This yields the numerically normalized filter.
Multiplying the amplitude of the sample image entropy sequence at each time point by the normalized distribution density coefficient of the filter at that time point produces the filtered image entropy sequence, which is the primary non-systematic abrupt-change component of the image entropy sequence. Its amplitude at a specific time point is the product of these two values.
On this basis, the difference between the filtered image entropy sequence and the original image entropy sequence can be used to identify the variation in image entropy produced by system mutations (which can be defined as the variation caused by "defect characteristics + environmental interference"), evaluated at each time point.
The normalized filter is displayed in Figure 9, the filtered image entropy sequence is given in Figure 10, and the image entropy variation sequence is presented in Figure 11.
In Figures 9, 10, and 11, green and red vertical dashed lines mark the sequence intervals of images with genuine defect features. The non-zero-weight regions of the normalized filter almost completely cover the real defect intervals. Likewise, the positive regions of the filtered image entropy and of the image entropy variation almost completely cover the real defect intervals. Because the image entropy variation sequence most directly represents the system mutation induced by defect features, B-scan images with image entropy variation greater than 0 are regarded as suspected defective images, which accomplishes the screening of suspected defective images.
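The screening steps of this section can be sketched as follows. Several details are assumptions of this example rather than the method as published: the normalization divides each row density by the mean density (so that weights above 1 enhance and weights below 1 attenuate), the filter is aligned with the first samples of the entropy sequence (the recurrence plot has fewer rows than the sequence has samples), and the function name is hypothetical.

```python
import numpy as np


def screen_suspected_defects(entropy, R):
    """Weight the image entropy sequence by the row density of recurrence
    points and flag samples whose entropy variation is positive."""
    entropy = np.asarray(entropy, dtype=float)
    n = R.shape[0]
    density = R.sum(axis=1) / n             # recurrence-point density per row
    weights = density / density.mean()      # ASSUMED normalization: mean weight = 1
    filtered = entropy[:n] * weights        # ASSUMED alignment: first n samples
    variation = filtered - entropy[:n]      # filtered minus original sequence
    return weights, filtered, variation, variation > 0
```

Here `R` is a binarized recurrence plot (a square 0/1 array) and the last returned array marks the B-scan images that would be treated as suspected defective under these assumptions. A different normalization would change which samples have positive variation, so this choice should be checked against the original equations.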