Enhanced Composite Multi-Scale Slope Entropy and Its Application to Fault Diagnosis of Rolling Bearing

Wei Li; Jiazhu Li; Shuyu Wang; Yan Chen; Jian Chen

doi:10.20944/preprints202604.0346.v1

Submitted: 06 April 2026

Posted: 06 April 2026


Abstract

The health status of rolling bearings is critical to the normal operation of rotating machinery. To effectively extract vibration signal features and accurately identify different fault types, a novel method based on enhanced composite multi-scale slope entropy (ECMSE) and a honey badger algorithm-optimized kernel extreme learning machine (HBA-KELM) is proposed. Specifically, ECMSE integrates high-order differences into the composite multi-scale framework to capture high-frequency information while preserving low-frequency characteristics, thereby enhancing the discriminability of time-series representations. Meanwhile, an average coarse-graining strategy is incorporated to achieve a more comprehensive characterization of the signals. The extracted features are then input into the HBA-KELM classifier for fault identification. Experiments conducted on two rolling bearing datasets, one public and one private, demonstrate that our method achieves superior performance in distinguishing different fault types and damage levels compared with several existing approaches.

Keywords: rolling bearing fault diagnosis; multi-scale entropy; slope entropy; kernel extreme learning machine; honey badger algorithm

Subject: Computer Science and Mathematics - Signal Processing

1. Introduction

Rolling bearings are essential components in most rotating machinery, and their operating conditions have a considerable impact on the normal operation of mechanical systems. Rolling bearing failures often lead to serious safety issues and economic losses. Therefore, accurate and timely fault detection of rolling bearings has become increasingly important [1,2,3,4,5,6].

In practical scenarios, fault signals are typically non-stationary and nonlinear, and are often contaminated with significant noise, making early fault information difficult to detect. Traditional approaches usually analyze signals in the time or frequency domain, but they are less effective in identifying fault characteristics. Therefore, time–frequency analysis methods have been widely adopted for feature extraction, such as wavelet transform (WT) [7], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [8], variational mode decomposition (VMD) [9], and Fourier decomposition method (FDM) [10]. However, these methods still have limitations in practical applications. For instance, the performance of WT is highly dependent on the selection of basis functions. CEEMDAN may introduce noise and spurious components in the early intrinsic mode functions. VMD requires manual determination of decomposition layers and penalty factors, and is not suitable for signals with complex and dense modes. In addition, FDM may produce inconsistent results under different search strategies, and its iterative process is time-consuming. Consequently, effective application of these methods often requires substantial prior knowledge and experience, and their inherent limitations are difficult to overcome.

Entropy is an effective tool for measuring the complexity of time series and has been widely used in signal feature extraction across various domains [11,12]. Methods such as approximate entropy (AE) [13], fuzzy entropy (FE) [14], and permutation entropy (PE) [15] have been successfully applied in rolling bearing fault diagnosis. Among them, PE is widely used due to its computational simplicity and efficiency. However, PE only evaluates the complexity of a single-scale sequence. To address this limitation, multi-scale permutation entropy (MPE) [16] was proposed to extract information across multiple scales and improve robustness. Nevertheless, MPE still has several drawbacks: (1) the coarse-grained sequence becomes shorter as the scale factor increases, leading to the loss of useful information; (2) PE ranks subsequences based on relative amplitudes while ignoring their actual values, which may map significantly different sequences into the same pattern. To overcome these issues, Zheng et al. [17] proposed composite multi-scale weighted permutation entropy (CMWPE), which combines composite multi-scale processing with weighted permutation entropy (WPE) [18] to better distinguish sequence differences using variance. In addition to WPE, its variants such as fine-grained permutation entropy (FGPE) [19] and amplitude-aware permutation entropy (AAPE) [20] have also shown similar improvements.

Recently, slope entropy (SE) [21] was proposed, which encodes the slopes between adjacent amplitudes into symbolic patterns to characterize the dynamic properties of time series. SE is more sensitive to amplitude fluctuations and requires fewer samples to achieve statistical significance. It has also been demonstrated that SE outperforms WPE, FGPE, and AAPE in various applications. Based on this, composite multi-scale slope entropy (CMSE) is obtained by combining SE with composite multi-scale analysis. However, the traditional coarse-graining method generates short sequences at different scales and mainly captures low-frequency information [22]. Moreover, sequences with identical average values may still differ significantly in amplitude distribution and arrangement.

To address these issues and capture high-frequency information, multi-order differences are introduced in this study to replace the average coarse-graining process, resulting in difference-based composite multi-scale slope entropy (DBCMSE). Furthermore, to achieve a more comprehensive representation of time series, an enhanced composite multi-scale slope entropy (ECMSE) method is proposed by combining CMSE and DBCMSE. ECMSE considers both amplitude information and multi-frequency characteristics, while being less sensitive to data length, thereby providing a more effective representation of rolling bearing health conditions. After feature extraction using ECMSE, the obtained feature vectors are input into a classifier for intelligent fault diagnosis. Kernel extreme learning machine (KELM) [23] extends extreme learning machine (ELM) by incorporating kernel functions and generally achieves better generalization performance. However, its performance is sensitive to the selection of regularization coefficients and kernel parameters. To address this issue, the honey badger algorithm (HBA) [24] is employed to optimize the parameters of KELM adaptively. Based on the above methods, a novel intelligent fault diagnosis framework for rolling bearings is proposed. Finally, two rolling bearing datasets are used to validate the effectiveness and reliability of the proposed method in identifying faults with different types and damage levels.

The remainder of this paper is organized as follows. Section 2 introduces the theory and parameter selection of ECMSE. Section 3 presents the HBA-KELM method. Section 4 provides experimental validation, and Section 5 concludes the paper.

2. Enhanced Composite Multi-Scale Slope Entropy

2.1. Slope Entropy

For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length $L$, the phase space of $x$ can be reconstructed, and the subspace sequences are defined as

X_{m}^{t} = \{x_{t}, x_{t + τ}, \dots, x_{t + τ (m - 2)}, x_{t + τ (m - 1)}\},

(1)

where $t = 1, 2, \dots, L - τ (m - 1)$, $m$ denotes the embedding dimension, and $τ$ denotes the time delay.

Then, a vertical increment threshold $γ$ (high threshold) is defined to measure significant differences between elements, and a zero-difference threshold $δ$ (low threshold) is introduced to characterize cases with approximately equal amplitudes. Next, $X_{m}^{t}$ is transformed into a symbolic subsequence based on adjacent differences and the predefined thresholds. The procedure for defining slopes as symbols is illustrated in Figure 1.

The probability of each symbolic subsequence pattern is calculated as

p_{i} = \frac{h_{i}}{Z},

(2)

where $h_{i}$ denotes the number of occurrences of the i-th pattern, and $Z$ represents the total number of patterns.

Finally, based on Shannon entropy, slope entropy is defined as

\mathrm{SE} = - \sum_{i = 1}^{Z} p_{i} \log_{2} (p_{i}) .

(3)
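Eqs. (1)-(3) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The five-symbol assignment (+2, +1, 0, -1, -2) is assumed, because the text defers the symbol definition to Figure 1. Converting the angular threshold $γ$ into a slope via $\tan γ$ is also assumed. Finally, the counts are normalized by the total number of subsequences so that the probabilities sum to one, which may differ from the intended meaning of $Z$.

```python
import math
from collections import Counter


def slope_entropy(x, m=4, tau=1, gamma_deg=45.0, delta=1e-3):
    """Slope entropy of sequence x (sketch of Eqs. (1)-(3)).

    Assumptions: gamma is an angle in degrees and is mapped to a slope
    threshold tan(gamma); symbols follow the usual five-level scheme.
    """
    gamma = math.tan(math.radians(gamma_deg))
    n_vectors = len(x) - tau * (m - 1)
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen m and tau")

    counts = Counter()
    for t in range(n_vectors):
        pattern = []
        for j in range(m - 1):
            d = x[t + (j + 1) * tau] - x[t + j * tau]
            if d > gamma:
                symbol = 2       # steep rise
            elif d > delta:
                symbol = 1       # moderate rise
            elif d >= -delta:
                symbol = 0       # approximately equal amplitudes
            elif d >= -gamma:
                symbol = -1      # moderate fall
            else:
                symbol = -2      # steep fall
            pattern.append(symbol)
        counts[tuple(pattern)] += 1

    total = sum(counts.values())  # assumed normalizer (number of subsequences)
    return -sum((h / total) * math.log2(h / total) for h in counts.values())
```

With $m = 2$, a series that alternates between two levels produces two equally likely symbols, so the sketch returns an entropy of one bit, while a constant series returns zero.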

2.2. Composite Multi-Scale Slope Entropy

The composite coarse-graining technique generates multiple subsequence combinations, thereby reducing entropy fluctuations and improving robustness compared with the traditional multiscale method. By combining it with slope entropy, composite multi-scale slope entropy (CMSE) is obtained. For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length $L$, the average coarse-grained sequences $y_{k}^{(s)} = \{y_{k, 1}^{(s)}, y_{k, 2}^{(s)}, \dots, y_{k, q}^{(s)}\}$ are defined as

y_{k, j}^{(s)} = \frac{1}{s} \sum_{i = (j - 1) s + k}^{j s + k - 1} x_{i},

(4)

where $1 \leq j \leq q = \lfloor L / s \rfloor$, $1 \leq k \leq s$, and $s = 1, 2, \dots, s_{max}$. Here, $\lfloor \cdot \rfloor$ denotes the floor operator, $s_{max}$ is the maximum scale factor, and $y_{k}^{(s)}$ represents the k-th coarse-grained sequence at scale $s$.

Then, the SE values of each coarse-grained sequence are computed, and CMSE is defined as

\mathrm{CMSE} (x, s, m, τ, γ, δ) = \frac{1}{s} \sum_{k = 1}^{s} \mathrm{SE} (y_{k}^{(s)}, m, τ, γ, δ) .

(5)
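The composite averaging in Eqs. (4) and (5) can be illustrated with the short Python sketch below. The function names are hypothetical, and the entropy estimator is passed in as `entropy_fn` so that the sketch stays independent of any particular SE implementation. One assumption is made about the window count: for offsets $k > 1$ the last window implied by $q = \lfloor L / s \rfloor$ can run past the end of the series, so only complete windows are kept here.

```python
def coarse_grain_mean(x, s, k):
    """k-th (1-based) average coarse-grained sequence at scale s, Eq. (4).

    Assumption: only complete windows of length s are kept.
    """
    start = k - 1
    q = (len(x) - start) // s
    return [sum(x[start + j * s:start + (j + 1) * s]) / s for j in range(q)]


def composite_multiscale(x, s, entropy_fn):
    """Average entropy_fn over the s shifted coarse-grained sequences, Eq. (5)."""
    values = [entropy_fn(coarse_grain_mean(x, s, k)) for k in range(1, s + 1)]
    return sum(values) / s
```

For example, `coarse_grain_mean([1, 2, 3, 4, 5, 6], 2, 2)` averages the windows (2, 3) and (4, 5).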

2.3. Difference-Based Composite Multi-Scale Slope Entropy

To capture high-frequency information and reveal details that CMSE may overlook, difference-based composite multi-scale slope entropy (DBCMSE) is proposed by replacing the average coarse-graining method with a multi-order difference scheme. For a given one-dimensional time series $x = \{x_{i}\}_{i = 1}^{L}$ of length $L$, the $(s - 1)$-order difference is applied under scale factor $s > 1$. The difference-based sequences $z_{k}^{(s)} = \{z_{k, 1}^{(s)}, z_{k, 2}^{(s)}, \dots, z_{k, q}^{(s)}\}$ are defined as

z_{k, j}^{(s)} = \sum_{i = (j - 1) s + k}^{j s + k - 1} (- 1)^{s - 1 - a} \binom{s - 1}{a} x_{i},

(6)

where $a = i - [(j - 1) s + k]$, $1 \leq j \leq q = \lfloor L / s \rfloor$, $1 \leq k \leq s$, and $s = 1, 2, \dots, s_{max}$. Here, $\binom{s - 1}{a}$ denotes the binomial coefficient, and $z_{k}^{(s)}$ represents the k-th difference-based sequence at scale $s$.

Similar to CMSE, DBCMSE is defined as

\mathrm{DBCMSE} (x, s, m, τ, γ, δ) = \frac{1}{s} \sum_{k = 1}^{s} \mathrm{SE} (z_{k}^{(s)}, m, τ, γ, δ) .

(7)
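A minimal Python sketch of the difference-based sequences in Eq. (6) is given below. The function names are hypothetical and, as an assumption, only complete windows of length $s$ are kept for each offset $k$. The weights are the signed binomial coefficients of the $(s - 1)$-order difference.

```python
from math import comb


def difference_weights(s):
    """Signed binomial weights of the (s - 1)-order difference in Eq. (6)."""
    return [(-1) ** (s - 1 - a) * comb(s - 1, a) for a in range(s)]


def difference_sequence(x, s, k):
    """k-th (1-based) difference-based sequence at scale s, Eq. (6).

    Assumption: only complete windows of length s are kept.
    """
    w = difference_weights(s)
    start = k - 1
    q = (len(x) - start) // s
    return [
        sum(w[a] * x[start + j * s + a] for a in range(s))
        for j in range(q)
    ]
```

At scale $s = 1$ the single weight is 1, so the sequence is returned unchanged, which matches the zero-order difference.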

2.4. Enhanced Composite Multi-Scale Slope Entropy

To incorporate both low- and high-frequency information, CMSE and DBCMSE are averaged. The enhanced composite multi-scale slope entropy (ECMSE) is defined as

\mathrm{ECMSE} (x, s, m, τ, γ, δ) = \frac{\mathrm{CMSE} (x, s, m, τ, γ, δ) + \mathrm{DBCMSE} (x, s, m, τ, γ, δ)}{2} .

(8)

The implementation procedure of ECMSE is illustrated in Figure 2. The traditional coarse-graining method is analogous to an arithmetic mean filter, which can suppress noise but may obscure fine-scale details and reduce frequency information [25]. In contrast, the multi-order difference method can be regarded as a weighting operation and exhibits high-pass filtering characteristics. For example, the weights are $\{- 1, 1\}$ at scale $s = 2$ and $\{1, - 2, 1\}$ at scale $s = 3$. As shown in Figure 3, the normalized spectra indicate that the multi-order difference operation behaves as a high-pass filter, with the cutoff frequency increasing as the scale factor increases.
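The filtering interpretation can be checked numerically with the sketch below, which evaluates the magnitude response of the two weight sets at a normalized angular frequency $ω \in [0, π]$. The function names are hypothetical, and the sketch looks only at the weights themselves, ignoring the downsampling by $s$ that accompanies coarse-graining. For the difference weights the magnitude equals $|2 \sin (ω / 2)|^{s - 1}$, which vanishes at $ω = 0$ and peaks at $ω = π$.

```python
import cmath
from math import comb


def mean_weights(s):
    """Weights of the average coarse-graining window (arithmetic mean filter)."""
    return [1.0 / s] * s


def difference_weights(s):
    """Signed binomial weights of the (s - 1)-order difference."""
    return [(-1) ** (s - 1 - a) * comb(s - 1, a) for a in range(s)]


def magnitude_response(weights, omega):
    """|H(omega)| of an FIR weight set at angular frequency omega (rad/sample)."""
    return abs(sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(weights)))
```

Evaluating both weight sets on a grid of frequencies gives a low-pass shape for the mean weights (unit gain at $ω = 0$) and a high-pass shape for the difference weights (zero gain at $ω = 0$).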

From a numerical perspective, the two methods may produce different outputs. For instance, subsequences $\{1, 3, 5\}$ and $\{2, 4, 3\}$ yield the same average value but different outputs under the difference-based method. Similarly, sequences with identical elements but different permutations can be distinguished by the difference-based method, while the coarse-graining method produces identical results. However, the difference-based method cannot distinguish sequences with the same linear trend. Therefore, the two approaches are complementary, and their combination leads to improved performance.
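As a worked illustration at scale $s = 3$, where the difference weights are $\{1, - 2, 1\}$, the mean (left) and the second-order difference (right) of four windows are compared below. The windows $\{3, 1, 5\}$ and $\{2, 4, 6\}$ are added here purely as examples of a permutation and of a second linear trend.

```latex
\begin{aligned}
\{1, 3, 5\}:\quad & \tfrac{1}{3}(1 + 3 + 5) = 3, & 1 - 2 \cdot 3 + 5 &= 0 \\
\{2, 4, 3\}:\quad & \tfrac{1}{3}(2 + 4 + 3) = 3, & 2 - 2 \cdot 4 + 3 &= -3 \\
\{3, 1, 5\}:\quad & \tfrac{1}{3}(3 + 1 + 5) = 3, & 3 - 2 \cdot 1 + 5 &= 6 \\
\{2, 4, 6\}:\quad & \tfrac{1}{3}(2 + 4 + 6) = 4, & 2 - 2 \cdot 4 + 6 &= 0
\end{aligned}
```

The first three windows share the mean 3 but have three distinct difference values, whereas the two linear windows $\{1, 3, 5\}$ and $\{2, 4, 6\}$ share the difference value 0 and are separated only by their means.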

2.5. Parameter Selection of ECMSE

The performance of ECMSE depends on the embedding dimension $m$, time delay $τ$, maximum scale factor $s_{max}$, low threshold $δ$, high threshold $γ$, and sample length $L$. In this study, $τ = 1$ and $s_{max} = 8$ are set according to [26]. To determine appropriate thresholds, we set

δ \in \{10^{- 1}, 10^{- 2}, 10^{- 3}, 10^{- 4}, 10^{- 5}, 10^{- 6}\},

(9)

and

γ \in \{30^{\circ}, 45^{\circ}, 60^{\circ}\} .

(10)

The embedding dimension is selected from

m \in [2, 7] .

(11)

To determine the optimal parameters, the mean silhouette coefficient (MSC) is used to evaluate clustering performance. MSC ranges from $- 1$ to 1, and higher values indicate better feature separability. It is defined as

\mathrm{MSC} = \frac{1}{N} \sum_{i = 1}^{N} \frac{b (i) - a (i)}{\max \{a (i), b (i)\}},

(12)

where $N$ is the number of samples, $a (i)$ represents intra-class cohesion (the mean distance from sample i to the other samples of its own class), and $b (i)$ denotes the minimum inter-class distance (the smallest mean distance from sample i to the samples of any other class).
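Eq. (12) can be sketched in plain Python as shown below. This is a sketch under assumptions: the Euclidean distance is assumed as the distance measure, the function name is hypothetical, and a sample whose class has no other member contributes zero to the sum.

```python
import math


def mean_silhouette(features, labels):
    """Mean silhouette coefficient of labeled feature vectors (sketch of Eq. (12))."""
    n = len(features)
    total = 0.0
    for i in range(n):
        # Group distances from sample i to every other sample by class label.
        by_class = {}
        for j in range(n):
            if j != i:
                dist = math.dist(features[i], features[j])
                by_class.setdefault(labels[j], []).append(dist)
        own = by_class.pop(labels[i], [])
        if not own or not by_class:
            continue  # assumed convention: contributes 0
        a = sum(own) / len(own)                                     # a(i)
        b = min(sum(d) / len(d) for d in by_class.values())        # b(i)
        if max(a, b) > 0.0:
            total += (b - a) / max(a, b)
    return total / n
```

Two compact, well-separated clusters give a value close to 1, while labels that cut across the clusters give a negative value.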

Finally, different sample lengths are evaluated to analyze robustness. Considering both stability and computational efficiency, $L = 1024$ is selected. As illustrated in Figure 4, the overall trends of the entropy curves for different sample lengths are generally consistent. However, when $L = 512$, the entropy values of both noise signals deviate from this trend, and the curve of the $1 / f$ noise exhibits a more pronounced offset along with larger standard deviations.

3. Honey Badger Algorithm Optimized Kernel Extreme Learning Machine

3.1. Honey Badger Algorithm

HBA is a bio-inspired metaheuristic algorithm that simulates the behavior of a honey badger colony searching for a beehive. In the search phase, the stronger the smell of food, the faster the honey badgers move, and vice versa. The search procedure of HBA is as follows:

(1) Initializing the honey badger colony:

x_{i} = lb_{i} + r_{1} \times (ub_{i} - lb_{i})

(13)

where $x_{i}$ represents the position of the i-th honey badger, $r_{1}$ is a random number in $[0, 1]$, and $ub_{i}$ and $lb_{i}$ represent the upper and lower search bounds, respectively.

(2) Defining intensity: The smell intensity $I_{i}$ is determined by the source strength $S$ and the distance $d_{i}$:

I_{i} = r_{2} \times \frac{S}{4 π d_{i}^{2}}

(14)

S = (x_{i} - x_{i + 1})^{2}

(15)

d_{i} = x_{prey} - x_{i}

(16)

where $r_{2}$ is a random number in $[0, 1]$, and $x_{prey}$ is the best prey location found so far.

(3) Updating density factor: The density factor $α$ decays with the iteration count and ensures search stability:

α = C \times \exp (\frac{- t}{t_{max}})

(17)

where $t$ is the current iteration, $C$ is a constant (default $C = 2$), and $t_{max}$ is the maximum number of iterations.

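(4) Digging phase: The behavior is determined by $r$, a random number in $[0, 1]$. When $0 \leq r < 0.5$, the honey badger digs, and the position $x_{new}$ is updated as:

x_{new} = x_{prey} + F \times β \times I \times x_{prey} + F \times r_{3} \times α \times d_{i} \times | \cos (2 π r_{4}) \times [1 - \cos (2 π r_{5})] |

(18)

where $I$ is the smell intensity $I_{i}$ from Eq. (14), $α$ is the density factor, $β \geq 1$ is a constant describing the honey badger's ability to obtain food, and $r_{3}$, $r_{4}$, and $r_{5}$ are random numbers in $[0, 1]$. The direction factor $F$ is defined as:

F = \begin{cases} 1 & \text{if } r_{6} \leq 0.5 \\ - 1 & \text{if } r_{6} > 0.5 \end{cases}

(19)

where $r_{6}$ is a random number in $[0, 1]$.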

3.2. Kernel Extreme Learning Machine

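According to Mercer’s theorem, the kernel matrix $Ω$ is defined as:

Ω = H H^{T}, \quad Ω (i, j) = h (x_{i}) \cdot h (x_{j}) = K (x_{i}, x_{j})

(20)

where $H$ is the hidden-layer output matrix, $h (x)$ is the hidden-layer feature mapping, and $K$ is the kernel function. The output of the network can be expressed as:

g (x) = [K (x, x_{1}), \dots, K (x, x_{N})] (\frac{I}{A} + Ω)^{- 1} T

(21)

where $I$ is the identity matrix, $A$ is the regularization coefficient, $T$ is the target output matrix, and $N$ is the number of training samples. The radial basis function (RBF) is selected as the kernel:

K (x_{i}, x_{j}) = \exp (- \frac{\| x_{i} - x_{j} \|^{2}}{2 σ^{2}})

(22)

where $σ$ is the kernel width.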
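Eqs. (20)-(22) can be sketched in plain Python as shown below. It is an illustration rather than the authors' MATLAB implementation. The function names are hypothetical, the targets are assumed to be one-hot encoded class labels, and the linear system is solved with a small Gauss-Jordan routine to keep the example dependency-free.

```python
import math


def rbf(u, v, sigma):
    """RBF kernel of Eq. (22)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * sigma ** 2))


def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0.0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def kelm_fit(X, y, n_classes, A, sigma):
    """Return the output weights (I/A + Omega)^{-1} T of Eq. (21)."""
    n = len(X)
    M = [[rbf(X[i], X[j], sigma) + (1.0 / A if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    T = [[1.0 if y[i] == c else 0.0 for c in range(n_classes)]  # assumed one-hot
         for i in range(n)]
    return solve(M, T)


def kelm_predict(x, X, weights, sigma):
    """Class index with the largest network output g(x), Eq. (21)."""
    k = [rbf(x, xi, sigma) for xi in X]
    scores = [sum(k[i] * weights[i][c] for i in range(len(X)))
              for c in range(len(weights[0]))]
    return max(range(len(scores)), key=scores.__getitem__)
```

A larger $A$ weakens the regularization term $I / A$, and a smaller $σ$ makes the kernel more local, which is why both values are usually tuned together.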

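3.3. HBA-KELM Algorithm

(5) Honeyguide bird phase: To complete the HBA search procedure of Section 3.1, when $r \geq 0.5$, the honey badger follows the honeyguide bird, and the position is updated according to the following strategy:

x_{new} = x_{prey} + F \times r_{7} \times α \times d_{i}

(23)

where $r_{7}$ is a random number in $[0, 1]$.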
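The search procedure of Eqs. (13)-(19) and (23) can be sketched as a small minimizer in Python. Several details below are assumptions rather than statements from the text: squared Euclidean norms are used for $S$ and $d_{i}^{2}$ in the multi-dimensional case, the neighbor $x_{i + 1}$ wraps around for the last individual, a small `eps` guards the division in Eq. (14), candidates are clipped to the bounds, a new position is kept only if it improves the individual's fitness, and `beta=6.0` is an assumed default.

```python
import math
import random


def honey_badger_minimize(f, lb, ub, n_agents=20, t_max=100,
                          beta=6.0, C=2.0, seed=0):
    """Minimize f over the box [lb, ub] with a basic HBA loop (sketch)."""
    rng = random.Random(seed)
    dim = len(lb)
    eps = 1e-12  # assumption: avoids division by zero in Eq. (14)

    def clip(v):
        return [min(max(v[j], lb[j]), ub[j]) for j in range(dim)]

    # Eq. (13): random initialization inside the search bounds.
    pop = [[lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]
           for _ in range(n_agents)]
    fit = [f(p) for p in pop]
    best = min(range(n_agents), key=fit.__getitem__)
    x_prey, f_prey = pop[best][:], fit[best]
    history = [f_prey]

    for t in range(1, t_max + 1):
        alpha = C * math.exp(-t / t_max)                        # Eq. (17)
        for i in range(n_agents):
            nxt = pop[(i + 1) % n_agents]                       # assumed wrap-around
            S = sum((a - b) ** 2 for a, b in zip(pop[i], nxt))  # Eq. (15)
            d = [p - a for p, a in zip(x_prey, pop[i])]         # Eq. (16)
            d2 = sum(v * v for v in d)
            I = rng.random() * S / (4 * math.pi * d2 + eps)     # Eq. (14)
            F = 1.0 if rng.random() <= 0.5 else -1.0            # Eq. (19)
            if rng.random() < 0.5:                              # digging, Eq. (18)
                r3, r4, r5 = rng.random(), rng.random(), rng.random()
                osc = abs(math.cos(2 * math.pi * r4)
                          * (1 - math.cos(2 * math.pi * r5)))
                new = [x_prey[j] + F * beta * I * x_prey[j]
                       + F * r3 * alpha * d[j] * osc for j in range(dim)]
            else:                                               # honey, Eq. (23)
                r7 = rng.random()
                new = [x_prey[j] + F * r7 * alpha * d[j] for j in range(dim)]
            new = clip(new)
            f_new = f(new)
            if f_new < fit[i]:                                  # assumed greedy update
                pop[i], fit[i] = new, f_new
                if f_new < f_prey:
                    x_prey, f_prey = new[:], f_new
        history.append(f_prey)
    return x_prey, f_prey, history
```

Because a candidate replaces an individual only when it improves the fitness, the best value recorded in `history` never increases from one iteration to the next.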

Based on the HBA and KELM, the parameters $(A, σ)$ of the KELM, i.e., the regularization coefficient and the RBF kernel width, are optimized using the HBA. The classification error rate of the training set is selected as the fitness function for the HBA-KELM, and the HBA searches for the parameters that minimize it:

fitness (x) = \frac{1}{N} \sum_{i = 1}^{N} error_{i}

(24)

where $x = (A, σ)$ is a candidate position, $N$ is the number of training samples, and $\sum_{i = 1}^{N} error_{i}$ is the number of samples in the training set that are incorrectly classified.

3.4. Implementation of the Proposed Method

To accurately identify rolling bearings under different health conditions, an intelligent fault diagnosis method based on ECMSE and HBA-KELM is proposed. To improve the recognition accuracy of KELM, the global optimization capability of HBA is utilized to achieve automatic parameter selection of KELM, thereby establishing a fault diagnosis model for rolling bearings. The flowchart of the proposed method is shown in Figure 5, and the detailed procedure is as follows:

1. Acceleration sensors are used to collect vibration signals of rolling bearings under different health conditions.
2. For each fault type, M samples with length $L = 1024$ are obtained. Then, N samples are randomly selected as the training set, and the remaining $M - N$ samples are used as the test set.
3. The MSC values of random feature samples are calculated under different parameter combinations $(δ, γ, m)$ to determine the optimal ECMSE parameters. First, the time delay is set to $τ = 1$, and the maximum scale factor is set to $s_{max} = 8$. The low threshold $δ$ is only used to categorize approximately equal amplitudes and therefore has a limited effect on SE. Hence, to simplify parameter selection, $m$ is first fixed at 2, and the optimal low threshold $δ$ is determined based on MSC under the different high thresholds $γ$. Then, with $δ$ fixed, the optimal combination of $(γ, m)$ is selected according to the maximum MSC.
4. The fault features of the training set and test set are extracted using ECMSE. Subsequently, the training features are input into the HBA-KELM classifier for model training, and the test features are used for classification and recognition.
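Step 2 of the procedure above can be sketched as follows. The function names are hypothetical. The segments are taken without overlap, as in the experiments of Section 4, and the random split is seeded so that it can be reproduced.

```python
import random


def segment_signal(signal, length=1024, count=None):
    """Cut a signal into non-overlapping samples of the given length."""
    n = len(signal) // length
    if count is not None:
        if count > n:
            raise ValueError("signal too short for the requested sample count")
        n = count
    return [signal[i * length:(i + 1) * length] for i in range(n)]


def split_train_test(samples, n_train, seed=0):
    """Randomly pick n_train samples for training; the rest form the test set."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    train = [samples[i] for i in idx[:n_train]]
    test = [samples[i] for i in idx[n_train:]]
    return train, test
```

With M = 100 samples per condition and `n_train=20`, this reproduces the 20/80 split per condition used in the experiments.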

4. Experimental Study

4.1. Experimental Setting

In this section, two bearing datasets under complex working conditions are selected to validate the effectiveness of the proposed method for rolling bearing fault diagnosis. In each case, the feature extraction quality and classification accuracy of the proposed ECMSE method are compared with its variants, CMSE and DBCMSE. In addition, several existing methods, including CMWPE, RCMFE, RCMDE, HFDE, and HWPE [17,27,28,29,30], are also considered for comparison.

In the experiments, all parameters of CMSE and DBCMSE are set consistently with those of ECMSE. Moreover, the same parameters (i.e., delay time $τ$ and embedding dimension $m$) are adopted for the other comparison methods. To ensure the same feature vector length, the decomposition level $k$ of HWPE and HFDE is set to 3, and the maximum scale factor $s_{max}$ of the other methods is set to 8. In addition, the independent parameters of RCMFE are set as follows: similarity tolerance $R = 0.15 \times SD$ and gradient $n = 2$ [27]. The number of classes $C$ for RCMDE and HFDE is set to 5 [28]. All methods are implemented in MATLAB 2018b and executed on a computer equipped with an Intel® Core™ i5-1135G7 CPU @ 2.40 GHz and 16.00 GB RAM.

4.2. Test Verification Case 1

In Case 1, the rolling bearing dataset [31] from Case Western Reserve University (CWRU) is employed to evaluate the performance of the proposed method. Figure 6 shows the experimental rig, which mainly consists of five components: a fan-end bearing, an induction motor, a drive-end bearing, a torque transducer/encoder, and a dynamometer.

In Case 1, the test objects are 6205-2RS JEM SKF deep groove ball bearings located at the drive end. The fault types include inner race fault, outer race fault, and ball fault. To simulate different damage levels of rolling bearings, electro-discharge machining (EDM) is applied to process healthy bearings. For each fault type, fault diameters of 0.1778 mm, 0.3556 mm, and 0.5334 mm are introduced. Together with the healthy bearing, vibration signals under ten different working conditions are obtained. In this experiment, the motor load is 1 HP, the sampling frequency is 12 kHz, and the rotational speed is approximately 1772 rpm. For each working condition, 100 non-overlapping samples with a length of 1024 are obtained, among which 20 samples are selected as the training set and the remaining 80 samples as the test set. Therefore, there are 200 training samples and 800 test samples in total. The detailed information of bearings under different working conditions is presented in Table 1, and the corresponding vibration signal waveforms are shown in Figure 7.

After obtaining all signal samples, the MSC index is used to determine the optimal parameter combination $(δ, γ, m)$ of ECMSE. To this end, 20 samples are randomly selected for each working condition of the rolling bearing, and the MSC values of all samples under different parameters are calculated. To simplify parameter selection, $m = 2$ is first fixed to observe how the MSC changes with the low threshold $δ$ under different high thresholds. The results are shown in Figure 8(a). As $δ$ decreases, the MSC under all three values of $γ$ gradually increases. When $δ$ decreases to $10^{- 3}$ and lower, the MSC values in each group reach their maximum, so the optimal $δ$ is set to $10^{- 3}$. Next, the optimal low threshold is kept unchanged, and the MSC is calculated under different combinations $(γ, m)$. As shown in Figure 8(b), the MSC reaches its maximum when $γ = 30^{\circ}$ and $m = 4$. Consequently, the final parameters of ECMSE are determined as $m = 4$, $τ = 1$, $δ = 10^{- 3}$, $γ = 30^{\circ}$, and $L = 1024$. Next, the ECMSE features of all samples are calculated. The feature vectors of the training set are fed into the HBA-KELM classifier for training, and then the feature vectors of the test set are fed into the optimized model for verification. As displayed in Figure 9, the proposed approach can effectively identify bearing faults with different types and damage degrees, and the overall recognition accuracy reaches 100%.
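The two-stage selection described above can be sketched as a small search routine. Here `score` is a hypothetical callable that returns the MSC for a given $(δ, γ, m)$, standing in for the feature extraction and clustering evaluation. Averaging the MSC over the $γ$ values in the first stage, and preferring the largest $δ$ among ties, are assumptions made to mirror the choice of $δ = 10^{- 3}$ once the curves have saturated.

```python
def two_stage_search(score, deltas, gammas, ms, m_init=2):
    """Pick delta first (with m fixed), then the best (gamma, m) pair."""
    # Stage 1: fix m = m_init and scan delta from large to small; a strict
    # comparison keeps the largest delta among equally good candidates.
    best_delta, best_mean = None, float("-inf")
    for delta in sorted(deltas, reverse=True):
        mean_score = sum(score(delta, g, m_init) for g in gammas) / len(gammas)
        if mean_score > best_mean:
            best_delta, best_mean = delta, mean_score

    # Stage 2: keep delta fixed and maximize the score over (gamma, m).
    best_gamma, best_m = max(
        ((g, m) for g in gammas for m in ms),
        key=lambda gm: score(best_delta, gm[0], gm[1]),
    )
    return best_delta, best_gamma, best_m
```

Compared with a full grid over $(δ, γ, m)$, this evaluates $|δ| \cdot |γ| + |γ| \cdot |m|$ combinations instead of $|δ| \cdot |γ| \cdot |m|$.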

To validate the superiority of the proposed method, feature extraction performance experiments are conducted, and the comparison methods include CMSE, DBCMSE, RCMFE, RCMDE, HFDE, HWPE, and CMWPE. t-SNE [32] is employed to reduce the multidimensional features of all methods to a two-dimensional space, and the visualization results are shown in Figure 11. It can be observed that the clusters of OR2, B1, and B3 in CMSE are too close to be distinguished. In contrast, except for a few overlapping B3 samples, ECMSE and DBCMSE exhibit good separability overall, demonstrating the advantage of the proposed multi-order difference method in extracting detailed information. Additionally, the HFDE method also shows relatively good clustering performance; however, the B1 and B3 faults remain difficult to distinguish. The other methods exhibit issues such as large intra-class dispersion and severe feature overlap, making it difficult to effectively distinguish different fault types. In summary, DBCMSE and ECMSE achieve similar visualization results, with the best clustering performance among all methods.

Following feature extraction, the feature vectors are fed into HBA-KELM for classification. To reduce the influence of randomness and ensure reliability, all methods are executed 30 times, with the training and test sets randomly selected before each run. The recognition accuracy and total computation time of different methods are presented in Table 2 and Figure 10(a). As shown, the proposed ECMSE method achieves the highest recognition accuracy in all experiments.

Figure 10. Recognition accuracy: (a) for different methods; (b) for different training/testing set ratios.

Figure 11. t-SNE visualization results of different feature extraction methods: (a) ECMSE, (b) CMSE, (c) DBCMSE, (d) CMWPE, (e) RCMFE, (f) RCMDE, (g) HFDE, and (h) HWPE

Furthermore, although DBCMSE requires the shortest computation time, the integration of CMSE and DBCMSE in ECMSE increases the computational cost slightly, while significantly improving diagnostic accuracy and robustness.

In addition, to further analyze the influence of sample set partitioning on the experimental results, six scenarios (i.e., training set/test set ratios) are considered: 10/90, 20/80, 40/60, 60/40, 80/20, and 90/10. Figure 10(b) shows the average recognition accuracy for each scenario over 30 runs. As observed, increasing the size of the training set generally improves recognition accuracy. However, an excessive number of training samples may reduce the efficiency of classifier parameter optimization. In contrast, variations in the sample ratio have little effect on the proposed method, whose accuracy is the highest under all ratios. Notably, although DBCMSE performs similarly to ECMSE in Case 1, its classification performance degrades when the training/test ratio is 10/90, indicating that ECMSE is less dependent on the number of training samples. In summary, the proposed method is both more accurate and less sensitive to the training/test split than the compared methods.

4.3. Test Verification Case 2

In Case 2, the experimental data were obtained from the aero-engine bearing test bench in the laboratory, which mainly consists of a spindle testing machine, a refrigeration system, a hydraulic loading system, and a lubrication system. Figure 12 shows the experimental setup and the locations of the accelerometers. The test object is the NU1010EM (inner race detachable)/N1010EM (outer race detachable) single-row cylindrical roller bearing manufactured by NSK. The bearing fault types include inner race fault, outer race fault, ball fault, outer race and ball compound fault, and inner race and ball compound fault.

To simulate different degrees of bearing damage, a laser marking machine and a wire cutting machine were used to process healthy bearings to obtain single-point and multi-point fault bearings with damage dimensions of 9 mm (length) × 0.2 mm (width). In Case 2, vibration signals of rolling bearings under nine different working conditions were collected. During the data acquisition process, the axial load was set to 2 kN, the motor speed was set to 2000 rpm, and the sampling frequency was 20.48 kHz. For each working condition, 100 non-overlapping samples with a length of 1024 were obtained, of which 20 samples were selected as the training set and the remaining 80 samples were used as the test set. The detailed information of bearings under different working conditions is presented in Table 3, and the corresponding waveforms are shown in Figure 13. In this subsection, the parameter selection procedure of ECMSE is the same as that in Case 1. Twenty samples are randomly selected from each bearing type, and the MSC values of all samples are calculated under different parameter settings.

As shown in Figure 14(a), with the gradual decrease of the low threshold $δ$, the variation trends of MSC corresponding to different high thresholds $γ$ are basically consistent. When $δ = 10^{- 3}$, the MSC values of the three groups reach their maximum and remain unchanged thereafter. Therefore, the optimal value of $δ$ is set to $10^{- 3}$. The selection process of the low threshold in Cases 1 and 2 indicates that even if $γ$ varies, the MSC trends with respect to $δ$ remain similar. Hence, it is feasible to determine $δ$ independently. Similar to Case 1, the MSC values under different parameter combinations $(γ, m)$ are shown in Figure 14(b). It can be observed that the MSC reaches its maximum when $γ = 60^{\circ}$ and $m = 3$. Accordingly, the final parameters of ECMSE are determined as $m = 3$, $τ = 1$, $δ = 10^{- 3}$, $γ = 60^{\circ}$, and $L = 1024$. Subsequently, the ECMSE features of all samples are extracted. The feature vectors of the training set are then fed into the fault classifier for model training, and the feature vectors of the test set are input into the optimized model for fault identification.

The remaining comparison methods suffer from fault clustering errors, severe fault aliasing, and large intra-class dispersion. It is worth noting that although some feature clusters of OBF and IBF in ECMSE are relatively close, the separability between these two fault types in ECMSE and DBCMSE is still the highest among all methods. Overall, the proposed method exhibits superior clustering performance.

Similar to Case 1, the features extracted by all methods are input into HBA-KELM for classification experiments. To avoid randomness and ensure reliability, each method is executed 30 times, with the training and test sets randomly selected each time. The experimental results are presented in Table 4 and Figure 15(a). It can be observed that the proposed method achieves the highest average recognition accuracy and the smallest standard deviation. Additionally, although DBCMSE requires the shortest computation time, its average accuracy is only 99.11%, which is lower than that of ECMSE (99.82%). Furthermore, the training/test sets are divided into different ratios (10/90, 20/80, 40/60, 60/40, 80/20, and 90/10). The average recognition accuracy is calculated over 30 runs for each ratio, and the results are shown in Figure 15(b). As can be seen, the recognition accuracy of the comparative methods does not reach or exceed that of the proposed method under any condition, demonstrating the superiority of ECMSE.

5. Conclusion

To accurately identify different fault types of rolling bearings, an intelligent fault diagnosis method based on ECMSE and HBA-KELM is proposed in this paper. In the feature extraction stage, the CMSE method is employed to capture low-frequency information from time series, while a multi-order difference strategy is introduced to construct the DBCMSE method for extracting high-frequency information. By integrating these two approaches, ECMSE is developed to effectively extract comprehensive fault features of rolling bearings. Subsequently, the MSC criterion is utilized to determine the optimal parameter combination of ECMSE. In the fault identification stage, HBA is used to optimize the regularization parameter and kernel parameter of KELM, thereby establishing an effective fault diagnosis model. Experimental results on two datasets demonstrate that the proposed method can accurately identify different fault types and damage severities of rolling bearings, as well as effectively distinguish composite faults. Furthermore, comparative experiments verify that ECMSE achieves superior feature extraction capability, higher recognition accuracy, and better stability than existing methods. Future work will focus on applying the proposed method to fault diagnosis of other rotating machinery and further investigating its performance under variable operating conditions.

Author Contributions

Methodology, W.L. and J.C.; Validation, W.L. and J.L.; Formal analysis, W.L. and J.C.; Investigation, Y.C.; Resources, S.W. and J.L.; Data curation, W.L. and S.W.; Writing—original draft, W.L.; Writing—review & editing, J.C. and J.L.; Supervision, J.C.; Project administration, J.L.; Funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

This study employed both publicly available and private datasets for experimental analyses. The publicly available dataset is Case Western Reserve University (CWRU) Bearing Dataset (https://engineering.case.edu/bearingdatacenter, accessed on 1 March 2025). In addition, a private dataset was collected using a custom experimental platform developed at Hefei University of Technology. Due to copyright and ownership restrictions, this dataset is not publicly available but can be provided upon reasonable request from the corresponding author.

References

[1] Cerrada, M.; Sánchez, R.V.; Li, C.; Pacheco, F.; Cabrera, D.; De Oliveira, J.V.; Vásquez, R.E. A review on data-driven fault severity assessment in rolling bearings. Mechanical Systems and Signal Processing 2018, 99, 169–196.
[2] Yu, K.; Lin, T.R.; Ma, H.; Li, X.; Li, X. A multi-stage semi-supervised learning approach for intelligent fault diagnosis of rolling bearing using data augmentation and metric learning. Mechanical Systems and Signal Processing 2021, 146, 107043.
[3] Sun, B.; Sheng, Z.; Song, P.; Sun, H.; Wang, F.; Sun, X.; Liu, J. State-of-the-art detection and diagnosis methods for rolling bearing defects: A comprehensive review. Applied Sciences 2025, 15, 1001.
[4] Keshun, Y.; Puzhou, W.; Peng, H.; Yingkui, G. A sound-vibration physical-information fusion constraint-guided deep learning method for rolling bearing fault diagnosis. Reliability Engineering & System Safety 2025, 253, 110556.
[5] Li, W.; Chen, Y.; Li, J.; Wen, J.; Chen, J. Learn then adapt: A novel test-time adaptation method for cross-domain fault diagnosis of rolling bearings. Electronics 2024, 13, 3898.
[6] Li, W.; Wang, Y.; Li, J.; Han, Z.; Chen, Y.; Chen, J. An Online Learning Framework for Fault Diagnosis of Rolling Bearings Under Distribution Shifts. Mathematics 2025, 13, 3763.
[7] Cheng, Y.; Lin, M.; Wu, J.; Zhu, H.; Shao, X. Intelligent fault diagnosis of rotating machinery based on continuous wavelet transform-local binary convolutional neural network. Knowledge-Based Systems 2021, 216, 106796.
[8] Chen, W.; Li, J.; Wang, Q.; Han, K. Fault feature extraction and diagnosis of rolling bearings based on wavelet thresholding denoising with CEEMDAN energy entropy and PSO-LSSVM. Measurement 2021, 172, 108901.
[9] Nassef, M.; Hussein, T.M.; Mokhiamar, O. An adaptive variational mode decomposition based on sailfish optimization algorithm and Gini index for fault identification in rolling bearings. Measurement 2021, 173, 108514.
[10] Tripathi, P.M.; Kumar, A.; Komaragiri, R.; Kumar, M. Watermarking of ECG signals compressed using Fourier decomposition method. Multimedia Tools and Applications 2022, 81, 19543–19557.
[11] Li, Y.; Wang, X.; Liu, Z.; Liang, X.; Si, S. The entropy algorithm and its variants in the fault diagnosis of rotating machinery: A review. IEEE Access 2018, 6, 66723–66741.
[12] Zhu, K.; Chen, L.; Hu, X. A multi-scale fuzzy measure entropy and infinite feature selection based approach for rolling bearing fault diagnosis. Journal of Nondestructive Evaluation 2019, 38, 90.
[13] Gao, X.; Yan, X.; Gao, P.; Gao, X.; Zhang, S. Automatic detection of epileptic seizure based on approximate entropy, recurrence quantification analysis and convolutional neural networks. Artificial Intelligence in Medicine 2020, 102, 101711.
[14] Fu, W.; Wang, K.; Tan, J.; Zhang, K. A composite framework coupling multiple feature selection, compound prediction models and novel hybrid swarm optimizer-based synchronization optimization strategy for multi-step ahead short-term wind speed forecasting. Energy Conversion and Management 2020, 205, 112461.
[15] Ruiz-Aguilar, J.J.; Turias, I.; González-Enrique, J.; Urda, D.; Elizondo, D. A permutation entropy-based EMD–ANN forecasting ensemble approach for wind speed prediction. Neural Computing and Applications 2021, 33, 2369–2391.
[16] Aziz, W.; Arif, M. Multiscale permutation entropy of physiological time series. In Proceedings of the 2005 Pakistan Section Multitopic Conference. IEEE, 2005; pp. 1–6.
[17] Zheng, J.; Dong, Z.; Pan, H.; Ni, Q.; Liu, T.; Zhang, J. Composite multi-scale weighted permutation entropy and extreme learning machine based intelligent fault diagnosis for rolling bearing. Measurement 2019, 143, 69–80.
[18] Deng, B.; Liang, L.; Li, S.; Wang, R.; Yu, H.; Wang, J.; Wei, X. Complexity extraction of electroencephalograms in Alzheimer’s disease with weighted-permutation entropy. Chaos: An Interdisciplinary Journal of Nonlinear Science 2015, 25.
[19] Xiao-Feng, L.; Yue, W. Fine-grained permutation entropy as a measure of natural complexity for time series. Chinese Physics B 2009, 18, 2690–2695.
[20] Azami, H.; Escudero, J. Amplitude-aware permutation entropy: Illustration in spike detection and signal segmentation. Computer Methods and Programs in Biomedicine 2016, 128, 40–51.
[21] Cuesta-Frau, D. Slope entropy: A new time series complexity estimator based on both symbolic patterns and amplitude information. Entropy 2019, 21, 1167.
[22] Jiang, Y.; Peng, C.K.; Xu, Y. Hierarchical entropy analysis for biological signals. Journal of Computational and Applied Mathematics 2011, 236, 728–742.
[23] Fu, W.; Zhang, K.; Wang, K.; Wen, B.; Fang, P.; Zou, F. A hybrid approach for multi-step wind speed forecasting based on two-layer decomposition, improved hybrid DE-HHO optimization and KELM. Renewable Energy 2021, 164, 211–229.
[24] Hashim, F.A.; Houssein, E.H.; Hussain, K.; Mabrouk, M.S.; Al-Atabany, W. Honey Badger Algorithm: New metaheuristic algorithm for solving optimization problems. Mathematics and Computers in Simulation 2022, 192, 84–110.
[25] Li, M.; Wang, R.; Yang, J.; Duan, L. An Improved Refined Composite Multivariate Multiscale Fuzzy Entropy Method for MI-EEG Feature Extraction. Computational Intelligence and Neuroscience 2019, 2019, 7529572.
[26] Yin, Y.; Shang, P. Multivariate weighted multiscale permutation entropy for complex time series. Nonlinear Dynamics 2017, 88, 1707–1722.
[27] Gao, S.; Wang, Q.; Zhang, Y. Rolling bearing fault diagnosis based on CEEMDAN and refined composite multiscale fuzzy entropy. IEEE Transactions on Instrumentation and Measurement 2021, 70, 1–8.
[28] Chakraborty, M.; Mitra, D.; et al. Automated detection of epileptic seizures using multiscale and refined composite multiscale dispersion entropy. Chaos, Solitons & Fractals 2021, 146, 110939.
[29] Ke, Y.; Yao, C.; Song, E.; Dong, Q.; Yang, L. An early fault diagnosis method of common-rail injector based on improved CYCBD and hierarchical fluctuation dispersion entropy. Digital Signal Processing 2021, 114, 103049.
[30] Yun, K.; Chong, Y.; Enzhe, S.; Liping, Y.; Quan, D. Fault diagnosis method of diesel engine injector based on hierarchical weighted permutation entropy. In Proceedings of the 2021 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), 2021; IEEE; pp. 1–6.
[31] Li, Y.; Wang, X.; Si, S.; Huang, S. Entropy based fault classification using the Case Western Reserve University data: A benchmark study. IEEE Transactions on Reliability 2019, 69, 754–767.
[32] Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. Journal of Machine Learning Research 2008, 9.

Figure 1. The symbolic definition of slope entropy.

Figure 2. The flowchart of ECMSE.

Figure 3. The normalized spectra of the weights for scale factors s = 2 to 10.

Figure 4. (a) ECMSE values of WGN signals with different lengths. (b) ECMSE values of 1/f noise signals with different lengths.

Figure 5. Flowchart of the proposed method.

Figure 6. Experimental rig of the CWRU bearing dataset.

Figure 7. The bearing vibration signal waveforms under different working conditions in Case 1.

Figure 8. MSC under different parameter settings: (a) different (δ, γ); (b) different (γ, m).

Figure 9. Classification results: (a) classification results; (b) confusion matrix.

Figure 12. (a) The experimental device. (b) Accelerometer measurement locations.

Figure 13. The bearing vibration signal waveforms under different working conditions.

Figure 14. MSC under different parameter settings: (a) different (δ, γ); (b) different (γ, m).

Figure 15. (a) The recognition accuracy of different methods. (b) The recognition accuracy under different training set/test set ratios.

Table 1. Description of different working states in Case 1.

Working state	Abbreviation	Fault diameter (mm)	Training samples	Test samples	Label
Normal	NOR	N/A	20	80	1
Inner race fault 1	IR1	0.1778	20	80	2
Inner race fault 2	IR2	0.3556	20	80	3
Inner race fault 3	IR3	0.5334	20	80	4
Outer race fault 1	OR1	0.1778	20	80	5
Outer race fault 2	OR2	0.3556	20	80	6
Outer race fault 3	OR3	0.5334	20	80	7
Ball fault 1	B1	0.1778	20	80	8
Ball fault 2	B2	0.3556	20	80	9
Ball fault 3	B3	0.5334	20	80	10

Table 2. Details and the recognition accuracy of different methods in Case 1.

Method	Max accuracy (%)	Min accuracy (%)	Mean accuracy (%)	SD (%)	Computing time (s)
ECMSE+HBA-KELM	100	99.63	99.95	0.1	47.098
CMSE+HBA-KELM	98.38	94.75	96.5	0.95	28.336
DBCMSE+HBA-KELM	100	99.38	99.84	0.18	18.592
CMWPE+HBA-KELM	85	80.38	83	1.14	94.958
RCMFE+HBA-KELM	94.13	90.63	92.67	0.76	165.929
RCMDE+HBA-KELM	99.5	97.5	98.06	0.49	47.689
HFDE+HBA-KELM	91.88	86.88	89.06	1.06	29.966
HWPE+HBA-KELM	88.25	77.75	82.80	2.39	28.485

Table 3. Description of different working states in Case 2.

Working state	Abbreviation	Fault size (mm)	Training samples	Test samples	Label
Normal	NOR	N/A	20	80	1
Inner race fault 1	IRF1	9 × 0.2 (1 defect)	20	80	2
Outer race fault 1	ORF1	9 × 0.2 (1 defect)	20	80	3
Ball fault 1	BF1	9 × 0.2 (1 defect)	20	80	4
Outer race & ball compound fault	OBF	9 × 0.2 (3 defects)	20	80	5
Inner race & ball compound fault	IBF	9 × 0.2 (3 defects)	20	80	6
Inner race fault 2	IRF2	9 × 0.2 (3 defects)	20	80	7
Outer race fault 2	ORF2	9 × 0.2 (3 defects)	20	80	8
Ball fault 2	BF2	9 × 0.2 (3 defects)	20	80	9

Table 4. Details and the recognition accuracy of different methods in Case 2.

Method	Max accuracy (%)	Min accuracy (%)	Mean accuracy (%)	SD (%)	Computing time (s)
ECMSE+HBA-KELM	100	99.17	99.82	0.18	42.719
CMSE+HBA-KELM	98.89	96.25	97.78	0.75	25.659
DBCMSE+HBA-KELM	99.58	98.19	99.11	0.32	16.576
CMWPE+HBA-KELM	91.53	87.36	89.40	0.99	45.626
RCMFE+HBA-KELM	98.75	96.25	97.97	0.54	169.540
RCMDE+HBA-KELM	97.22	93.89	95.90	0.73	23.486
HFDE+HBA-KELM	78.61	71.67	75.47	1.47	26.250
HWPE+HBA-KELM	54.58	46.53	50.13	1.76	22.995

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the authors and the preprint are cited in any reuse.
