1. Introduction
In recent years, the growing popularity of smart wearable devices has made real-time biometric signals conveniently and non-invasively accessible to more and more people. Wearable devices such as smartwatches, fitness trackers, and health monitors have democratized the monitoring of vital physiological parameters, enabling users to track key indicators such as the electrocardiogram (ECG), electroencephalogram (EEG), body temperature, and blood pressure. Among these indicators, the ECG serves as a fundamental non-invasive tool for monitoring cardiac activity, playing a critical role in the early detection and diagnosis of cardiovascular diseases such as arrhythmias and myocardial infarctions.
Cardiovascular diseases, predominantly heart disease, remain the leading global cause of mortality, accounting for approximately 18 million annual deaths as per the World Health Organization's 2025 report. Accurate prediction of ECG signals is crucial for early detection of heart conditions like arrhythmia. This enables timely medical interventions, preventing disease progression and significantly reducing mortality rates.
With the development of deep learning technology, deep learning models are increasingly used for time series prediction. As a typical deep learning model, the recurrent neural network (RNN) has been applied to time series prediction [1,2,3,4]. Compared with traditional time series prediction methods, RNN improves prediction accuracy, but it still suffers from vanishing and exploding gradients. Long short-term memory (LSTM) is an improved RNN that overcomes these shortcomings and is well suited to processing and predicting time series data with time delays. The literature shows that LSTM is indeed more effective than the traditional RNN model in the analysis and prediction of time series data [5,6]. At present, LSTM is widely used in natural language recognition [7,8,9], machine translation [10,11,12,13], and financial time series prediction [14,15,16]. LSTM has also been applied to time series prediction in other fields. To improve the accuracy of wind power forecasting, Sarkar et al. [17] proposed a novel deep learning model that combines an adaptive spectral block (ASB), an interactive convolution block (ICB), LSTM networks, and self-supervised learning (SSL). The experimental results show that the model improved forecasting accuracy, with average reductions in MAE and RMSE of 5-8% and 8-12%, respectively. Jin et al. [18] used four popular neural networks, namely CNN, RNN, LSTM, and GRU, to predict random sea waves and conducted a systematic comparative analysis. The results demonstrate that, compared with the other three networks, LSTM shows good accuracy in short-term prediction but is more time-consuming. To enhance earthquake prediction precision, Kaushal et al. [19] predicted earthquakes with four models, namely RNN, LSTM, AdaBoost, and a hybrid RNN-LSTM model. Among these four models, the RNN-LSTM hybrid achieves an accuracy of 98%, significantly surpassing the other models. Khan et al. [20] proposed a traffic flow forecasting scheme that uses the Tree-structured Parzen Estimator to tune the hyperparameters of an LSTM deep learning framework. Simulation results show that the proposed scheme exceeds the benchmark scheme in prediction accuracy. To improve the precision of solar irradiance prediction, Liu et al. [21] proposed a VMD-FE-LSTM-Transformer forecasting model. The model is optimized by Bayesian optimization and employs variational mode decomposition (VMD) and fuzzy entropy (FE) to decompose and reconstruct solar irradiance data. Compared with multiple forecasting models, the proposed model shows superior capability in single-step and multi-step forecasting.
This paper focuses on the prediction of ECG signals. Because an ECG signal is a nonlinear and non-stationary time series with inherent randomness, it is difficult to predict accurately. To our knowledge, few papers address ECG signal prediction, although some are explicitly related to it [22,23,24,25]. To eliminate redundant data transmission within the body area network and reduce energy consumption, Wang et al. [26] proposed an ECG data fusion algorithm with synchronous prediction, based on the wavelet transform and the least squares support vector machine (LSSVM). The experiment shows that the algorithm significantly improves the accuracy of ECG prediction, with an RMSE of 0.0529. We proposed two ECG signal prediction methods, one based on the autoregressive integrated moving average (ARIMA) model [27] and the other based on the radial basis function (RBF) neural network [28]; the latter is more accurate than the former. To improve the accuracy of ECG signal forecasting, Ratna Prakarsha et al. [29] proposed an artificial neural network method for predicting ECG signals and compared it with a conventional method using a least mean square (LMS) filter. The experiment demonstrates the superiority of the artificial neural network over the LMS filter. Dudukcu et al. [30] proposed a hybrid deep neural network model for chaotic time series prediction, formed by combining temporal convolutional neural network (TCN) layers with RNN layers. The simulation results show that the model has an average MAE of 0.0051 and an average RMSE of 0.0082 on an ECG arrhythmia dataset. Zacarias et al. [31] proposed a model based on an LSTM neural network with two hidden layers for ECG signal forecasting. The results show that the average RMSE of the model is 0.0070±0.0028 and the average MAE is 0.0522±0.0098.
Because the ECG signal prediction methods mentioned above generally have low prediction accuracy, we propose a novel deep learning prediction method for ECG signals using VMD, the Cao method, and an LSTM neural network. The rest of this paper is organized as follows:
Section 2 presents the materials and methods used to address ECG prediction;
Section 3 presents and discusses the results;
Section 4 concludes the paper.
2. Materials and Methods
2.1. VMD
Variational mode decomposition (VMD) decomposes an input signal into a series of discrete band-limited intrinsic mode functions (IMFs) around the center frequency. In time series prediction, the function of VMD is to reduce the non-stationary character of time series, which is helpful to improve the accuracy of prediction. Each IMF component is obtained through the following three steps:
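Step 1: Calculate the analytic signal of each modal function $u_k(t)$ by Hilbert transform:
$$\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)$$
Step 2: Multiply the analytic signal by the exponential term $e^{-j\omega_k t}$ of the estimated center frequency $\omega_k$, which moves its spectrum to the baseband:
$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t}$$
Step 3: Estimate the bandwidth of each mode by Gaussian smoothing of the demodulated signal, i.e., the squared $L^2$ norm of the gradient. The constrained variational model is
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = x(t)$$
where $x$ is the input signal and $\|\cdot\|_2$ is the Euclidean distance. To find the optimal solution of the above problem, the constrained variational model is turned into an unconstrained one by introducing the quadratic penalty factor $\alpha$ and the Lagrange multiplier $\lambda$. The augmented Lagrangian is
$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| x(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, x(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$
The saddle point of the augmented Lagrangian is found using the alternating direction method of multipliers (ADMM) [32], by alternately updating $u_k^{n+1}$, $\omega_k^{n+1}$, and $\lambda^{n+1}$. The specific implementation process of VMD is as follows: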
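Step 1: Initialize $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, and $\hat{\lambda}^1$, and set n = 0.
Step 2: Set n = n + 1 and update $\hat{u}_k$, $\omega_k$, and $\hat{\lambda}$. The formulas for this are
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{x}(\omega) - \sum_{i \neq k} \hat{u}_i^{n}(\omega) + \hat{\lambda}^{n}(\omega)/2}{1 + 2\alpha(\omega - \omega_k^{n})^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}{\int_0^{\infty} \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \beta \left[ \hat{x}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right]$$
where $\hat{u}_k(\omega)$, $\hat{x}(\omega)$, and $\hat{\lambda}(\omega)$ are the Fourier transforms of the signals $u_k(t)$, $x(t)$, and $\lambda(t)$, respectively, and $\beta$ is the step update coefficient.
Step 3: Repeat step 2 until the convergence condition is reached:
$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon$$
where $\varepsilon$ is a judgment threshold.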
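The Step 2 updates can be illustrated with a short Python sketch. It assumes the standard VMD form of the updates (a Wiener-filter-like mode update with denominator 1 + 2α(ω − ω_k)², a power-weighted mean for the center frequency, and a multiplier step scaled by β), with the integrals replaced by sums over a grid of non-negative frequencies. The function names `update_mode` and `update_multiplier` and the toy spectrum are hypothetical; this is not the implementation used in the experiments.

```python
def update_mode(x_hat, others_hat, lam_hat, omegas, omega_k, alpha):
    """Update one mode spectrum and its center frequency (standard VMD form, assumed).

    x_hat      : spectrum of the input signal sampled on the grid `omegas`
    others_hat : summed spectra of all other modes
    lam_hat    : spectrum of the Lagrange multiplier
    omega_k    : current center frequency of this mode
    alpha      : quadratic penalty factor
    """
    # Keep what the other modes do not explain, attenuated away from omega_k.
    u_hat = [
        (x - o + lam / 2) / (1 + 2 * alpha * (w - omega_k) ** 2)
        for x, o, lam, w in zip(x_hat, others_hat, lam_hat, omegas)
    ]
    # New center frequency: power-weighted mean frequency of the updated mode.
    power = [abs(u) ** 2 for u in u_hat]
    omega_new = sum(w * p for w, p in zip(omegas, power)) / sum(power)
    return u_hat, omega_new


def update_multiplier(lam_hat, x_hat, modes_sum_hat, beta):
    """Move the multiplier along the reconstruction error with step coefficient beta."""
    return [lam + beta * (x - s) for lam, x, s in zip(lam_hat, x_hat, modes_sum_hat)]


# Toy spectrum (illustrative only): a single spectral line at normalized frequency 0.1.
omegas = [i / 1000 for i in range(501)]
x_hat = [0j] * 501
x_hat[100] = 1 + 0j
zeros = [0j] * 501
u_hat, omega_new = update_mode(x_hat, zeros, zeros, omegas, omega_k=0.2, alpha=2000.0)
print(omega_new)
```

In this toy case the mode starts with its center frequency at 0.2, and one update pulls it to the only frequency that carries energy, which is the behavior the center-frequency formula is meant to produce.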
Before VMD, the number $K$ of IMFs needs to be predetermined. $K$ can be determined according to the ratio $R_{res}$ of residual energy to the original signal energy. $R_{res}$ is defined as follows:
$$R_{res} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{X(n) - \sum_{k=1}^{K} u_k(n)}{X(n)} \right|$$
where $X(n)$ is the original signal, $u_k(n)$ is the $k$th IMF, and $N$ is the sample number. When $R_{res}$ is less than 1% and there is no significant downward trend, the number $K$ can be determined [33]. For the No. 100 ECG data, the $R_{res}$ of VMD with different $K$ is shown in Table 1.
Table 1 shows that Rres has no obvious downward trend when K = 14. Therefore, we set K = 14 in the experiment.
The No. 100 ECG was decomposed into 14 IMFs by VMD, as shown in Figure 1. IMF1 is the residual, and IMF2-IMF14 are the components, sorted from low to high frequency.
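The selection of K can be sketched as follows. The sketch assumes that Rres is the mean absolute relative residual between the signal and the sum of its IMFs, and it turns "no significant downward trend" into a numeric tolerance `min_drop`, which is a hypothetical parameter not given in the paper. The function names and the example ratios are likewise invented for illustration and are not the values of Table 1.

```python
def residual_ratio(signal, imfs):
    """Mean absolute relative residual between a signal and the sum of its IMFs.

    Assumed form: R_res = (1/N) * sum_n |(X(n) - sum_k u_k(n)) / X(n)|.
    Samples with X(n) == 0 would divide by zero and need separate handling.
    """
    total = 0.0
    for n, x in enumerate(signal):
        reconstruction = sum(imf[n] for imf in imfs)
        total += abs((x - reconstruction) / x)
    return total / len(signal)


def select_k(ratios, threshold=0.01, min_drop=0.001):
    """Return the smallest K whose ratio is below `threshold` and whose
    improvement over K - 1 is smaller than `min_drop` (hypothetical tolerance)."""
    for k in sorted(ratios):
        if k - 1 not in ratios:
            continue  # no previous value to measure the trend against
        if ratios[k] < threshold and ratios[k - 1] - ratios[k] < min_drop:
            return k
    return None


# Invented ratios for illustration only; these are NOT the values of Table 1.
example_ratios = {12: 0.020, 13: 0.009, 14: 0.0085, 15: 0.0084}
print(select_k(example_ratios))
```

With these invented numbers, K = 13 is rejected because the ratio is still dropping quickly there, and K = 14 is the first value at which the curve has flattened.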
2.2. Cao Method
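The Cao method was proposed by Cao [34] to determine the minimum embedding dimension of a time series. In the Cao method, the time delay $\tau$ must be chosen before the minimum embedding dimension is determined. For a time series $x_1, x_2, \ldots, x_N$, the time-delay vectors can be reconstructed as follows:
$$y_i(m) = \left(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}\right), \quad i = 1, 2, \ldots, N-(m-1)\tau$$
The Cao method is described as follows:
$$a(i,m) = \frac{\left\| y_i(m+1) - y_{n(i,m)}(m+1) \right\|}{\left\| y_i(m) - y_{n(i,m)}(m) \right\|}$$
$$E(m) = \frac{1}{N - m\tau} \sum_{i=1}^{N-m\tau} a(i,m)$$
$$E1(m) = \frac{E(m+1)}{E(m)}$$
where $m$ is the embedding dimension, $\tau$ is the time delay, $y_i(m)$ is the $i$th reconstructed vector with embedding dimension $m$, $\|\cdot\|$ is the Euclidean distance, and $n(i,m)$ is an integer such that $y_{n(i,m)}(m)$ is the nearest neighbor of $y_i(m)$. If $E1(m)$ stops changing when $m$ is greater than some value $m_0$, then $m_0+1$ is the minimum embedding dimension.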
The embedding dimension m of IMF5 is determined by the Cao method, as shown in Figure 2.
According to Figure 2, we obtain the embedding dimension m = 7 for IMF5. In the experiment, we set τ = 1 and obtain the minimum embedding dimension m of each IMF, as shown in Table 2.
The minimum embedding dimension m is the input dimension of the LSTM neural network.
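A compact way to see how the Cao method works is to compute the curve E1(m) = E(m+1)/E(m) directly. The sketch below is a brute-force illustration under stated assumptions: it uses the Euclidean distance, searches nearest neighbors exhaustively, and falls back to the next nearest neighbor when two delay vectors coincide. The function names and the sine test series are hypothetical and are not the ECG data or code of the paper.

```python
import math


def delay_vectors(x, m, tau):
    """Time-delay vectors y_i(m) = (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    count = len(x) - (m - 1) * tau
    return [[x[i + j * tau] for j in range(m)] for i in range(count)]


def cao_e(x, m, tau):
    """E(m): mean ratio of neighbor distances in dimension m + 1 versus m."""
    count = len(x) - m * tau
    y_m = delay_vectors(x, m, tau)[:count]
    y_m1 = delay_vectors(x, m + 1, tau)
    ratios = []
    for i in range(count):
        best_j, best_d = None, math.inf
        for j in range(count):
            if j == i:
                continue
            d = math.dist(y_m[i], y_m[j])
            if 0.0 < d < best_d:  # ignore coincident vectors
                best_j, best_d = j, d
        if best_j is not None:
            ratios.append(math.dist(y_m1[i], y_m1[best_j]) / best_d)
    return sum(ratios) / len(ratios)


def cao_e1(x, max_m, tau=1):
    """E1(m) = E(m + 1) / E(m) for m = 1, ..., max_m."""
    e = [cao_e(x, m, tau) for m in range(1, max_m + 2)]
    return [e[m] / e[m - 1] for m in range(1, max_m + 1)]


# Hypothetical test series (a sine wave), not ECG data.
series = [math.sin(0.3 * t) for t in range(200)]
e1_curve = cao_e1(series, max_m=5, tau=1)
print(e1_curve)
```

The minimum embedding dimension is then read off as the point after which the printed curve stops changing, which is what Figure 2 does graphically. The exhaustive neighbor search costs O(N²) per dimension, so a spatial index would be preferable for long records.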
2.3. LSTM Neural Network
In 1997, Hochreiter and Schmidhuber proposed LSTM as an extension of the RNN [35]. As an improved RNN model, LSTM addresses the vanishing and exploding gradient problems that a plain RNN cannot overcome. With the rapid development and popularization of deep learning, LSTM is widely used in time series applications.
Each LSTM unit is composed of a memory cell and three gates: an input gate, a forget gate, and an output gate. The input gate decides which information should be input, the forget gate determines which information should be discarded, and the output gate decides which information should be output. The architecture of the LSTM unit is shown in Figure 3.
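The forget gate is expressed in the following formula:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
The input gate is expressed in the following formulas:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
The output gate is expressed in the following formulas:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \odot \tanh(C_t)$$
where $W$ and $b$ denote the weight matrices and bias vectors of the gates, respectively. In addition, $\sigma$ and tanh are the activation functions between different layers, $C_t$ is the current state of the cell, $\tilde{C}_t$ is the unit state of the current input, and $h_t$ is the current output of the cell. The expressions of $\sigma$ and tanh are as follows:
$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$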
By introducing the cell state and three gates (the forget gate, the input gate, and the output gate), LSTM can retain both long-term and short-term memory, which addresses the vanishing and exploding gradient problems of RNN. Compared with other neural networks, LSTM is more suitable for time series data prediction.
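The gate structure can be made concrete with a single forward step of one LSTM unit. The following is a minimal sketch of the standard LSTM cell in plain Python, assuming that each gate acts on the concatenation of the previous output and the current input; the parameter names (`W_f`, `b_f`, and so on), the helper `affine`, and the zero-weight example are hypothetical and do not come from the trained model of this paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Compute W . v + b for a list-of-rows matrix W."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of a standard LSTM cell; returns (h_t, c_t)."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]          # forget gate
    i = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]          # input gate
    c_tilde = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]  # candidate state
    o = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]          # output gate
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h_t = [ot * math.tanh(c) for ot, c in zip(o, c_t)]
    return h_t, c_t


# Hypothetical example: 2 hidden units, 1 input feature, all-zero parameters.
hidden, features = 2, 1
params = {}
for gate in ("f", "i", "c", "o"):
    params["W_" + gate] = [[0.0] * (hidden + features) for _ in range(hidden)]
    params["b_" + gate] = [0.0] * hidden
h_t, c_t = lstm_step([0.7], [0.0, 0.0], [1.0, 1.0], params)
print(h_t, c_t)
```

With all-zero parameters every gate outputs 0.5 and the candidate state is 0, so the cell simply halves its previous state. This makes it easy to check by hand how the forget gate scales the memory before any training takes place.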
2.4. Sliding Window Approach
ECG prediction uses historical ECG data to forecast future values of the signal. A typical network structure of LSTM for ECG prediction is shown in Figure 4.
In Figure 4, $\hat{x}_{t+1}$ is the predicted value of $x_{t+1}$. To obtain $\hat{x}_{t+1}$, the time series data $x_{t-m+1}, \ldots, x_t$ are required as inputs to the LSTM input layer. A sliding window approach is used to set the input size of the LSTM prediction model [29,31]. For example, when the sliding window size is 5, the sliding window method is used to predict the sequence $x_6, x_7, \ldots$ as shown in Table 3.
The size m of the sliding window affects the prediction accuracy. In this paper, the Cao method is used to determine the optimal sliding window size m, which is also the input size of the LSTM input layer.
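The sliding window construction can be sketched in a few lines of Python. The function name `make_windows` and the toy series are hypothetical; the sketch only assumes one-step-ahead prediction, where each target is the value that immediately follows its window.

```python
def make_windows(series, m):
    """Split a series into (input window, next value) pairs with window size m."""
    inputs = [series[i:i + m] for i in range(len(series) - m)]
    targets = [series[i + m] for i in range(len(series) - m)]
    return inputs, targets


# Toy series with window size 5: each row of `inputs` predicts the matching target.
inputs, targets = make_windows([1, 2, 3, 4, 5, 6, 7, 8], m=5)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

A series of length N yields N - m training pairs, so a larger window gives the network more context per sample but fewer samples to learn from.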
2.5. Proposed Method
Based on the study of ECG signal prediction, this paper proposes a deep learning prediction method for ECG signals. Its flowchart is shown in Figure 5.
The prediction steps of the proposed method are as follows:
Step 1: Denoise the ECG signal before prediction. In this paper, the ECG data from the MIT-BIH Arrhythmia Database have already been denoised, so no further denoising is needed.
Step 2: Decompose the ECG signal into K IMFs by VMD. In the experiment, we use K = 14 to get a better prediction result.
Step 3: Determine the embedding dimension of each IMF by the Cao method. The embedding dimension is the input size of the LSTM neural network.
Step 4: Train an LSTM neural network on the training set of each IMF and use it to predict the test set of that IMF.
Step 5: Add the prediction results of all LSTM neural networks to obtain the final ECG signal prediction result.
Step 6: Analyze the prediction error and compare it with other prediction methods.
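Steps 4 and 5 amount to predicting each IMF separately and summing the results, which the following Python sketch illustrates. It is a structural sketch only: the per-IMF LSTM is replaced by a trivial persistence predictor so that the example runs without a deep learning library, the two-thirds split follows the experiments, and each test value is assumed to be predicted from the m true values before it. All function names and the toy components are hypothetical.

```python
def split_two_thirds(series):
    """Use the first two-thirds of a series for training and the rest for testing."""
    cut = len(series) * 2 // 3
    return series[:cut], series[cut:]


def predict_component(component, m, fit_predict):
    """One-step-ahead predictions for the test part of one IMF."""
    train, _ = split_two_thirds(component)
    cut = len(train)
    # Each test target is predicted from the m true values preceding it.
    test_windows = [component[i - m:i] for i in range(cut, len(component))]
    return fit_predict(train, test_windows)


def predict_signal(imfs, dims, fit_predict):
    """Predict every IMF with its own window size, then sum sample by sample."""
    per_imf = [predict_component(imf, m, fit_predict) for imf, m in zip(imfs, dims)]
    return [sum(values) for values in zip(*per_imf)]


def persistence(train, test_windows):
    """Stand-in predictor (NOT an LSTM): repeat the last value of each window."""
    return [window[-1] for window in test_windows]


# Two toy components with different window sizes (illustrative values only).
toy_imfs = [[float(i) for i in range(9)], [10.0 * i for i in range(9)]]
prediction = predict_signal(toy_imfs, dims=[2, 3], fit_predict=persistence)
print(prediction)
```

Passing the predictor as a function keeps the decomposition logic independent of the model, so a trained LSTM wrapped in a function with the same signature could replace `persistence` without changing the rest of the pipeline.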
3. Results and Discussion
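The common performance measures of prediction methods are root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and R-square (R²), defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| x_i - \hat{x}_i \right|$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{x_i - \hat{x}_i}{x_i} \right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2}{\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2}$$
where $\hat{x}_i$ is the predicted value of $x_i$, $N$ is the length of $x$, and $\bar{x}$ is the mean of $x$.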
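For reference, the four measures can be computed as in the Python sketch below. It assumes the usual textbook definitions, with MAPE reported as a percentage; the function names and the four-point example are hypothetical and unrelated to the ECG results.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; undefined if any actual value is 0."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_square(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy example: only the last of four values is mispredicted, by 1.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(actual, predicted), mae(actual, predicted),
      mape(actual, predicted), r_square(actual, predicted))
```

Because MAPE divides by the true value, it becomes unstable for samples near zero, which matters for ECG baselines close to 0 mV; RMSE and MAE do not have this limitation.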
All ECG data in the simulation experiment are from the MIT-BIH Arrhythmia Database [36]. We selected the No. 100 ECG signal, which consists of 2768 data points, for the experiment. We used two-thirds of the No. 100 ECG signal as the training set (i.e., 1845 data points) and the remaining one-third as the test set (i.e., 923 data points). The original No. 100 ECG signal is shown in Figure 6.
All experiments were carried out in MATLAB and Python. The LSTM model was implemented with the Keras deep learning library on a Theano backend. The value of the epochs parameter of the LSTM neural network has a significant impact on the prediction results. The impact of different epochs values on prediction accuracy with batch_size=4 is shown in Table 4.
Table 4 shows that, with batch_size=4, good prediction performance is achieved at epochs=250, where RMSE=0.001326 and MAE=0.001044.
In the experiment, the LSTM parameters are epochs=250, batch_size=4, optimizer=Adam, loss=mean_squared_error, and activation=ReLU. The prediction method proposed in this paper was used to predict the test set of 923 ECG data points, as shown in Figure 7.
In Figure 7, "raw signal" represents the original ECG signal and "test prediction" represents the predicted ECG signal of the test set. Figure 7 shows that the predicted ECG waveform overlaps almost perfectly with the original ECG waveform. To show the degree of fit between the original and predicted waveforms more closely, a local magnification is shown in Figure 8.
In Figure 8, the local magnification range is [165,185]. Figure 8 shows that even after magnification, the predicted ECG waveform still overlaps well with the original ECG waveform. The prediction indexes of the test set are shown in Table 5.
Table 5 shows that the RMSE, MAE, and MAPE are very small and R² is close to 1. This indicates that the prediction method proposed in this paper has high prediction accuracy.
To evaluate the performance of the prediction method, the method proposed in this paper was compared with the prediction methods in the existing literature [23,24,25,27,28,30,31] on the same dataset (the No. 100 ECG data). The comparison results are shown in Table 6.
In Table 6, the values of 0.0470 and 0.0070 for Zacarias et al. [31] are approximate values read from the bar chart in that paper. The bar charts of MAE and RMSE in Table 6 are shown in Figure 9 and Figure 10, respectively. Figure 9 and Figure 10 visually demonstrate that the MAE and RMSE of the proposed method are significantly smaller than those of the existing methods [23,24,25,27,28,30,31]. This illustrates that the prediction accuracy of the proposed method is higher than that of the existing methods.
In addition, the prediction method proposed in this paper was compared with several popular hybrid prediction methods: the method based on wavelet transform (WT), phase space reconstruction (PSR), and RBF neural network (WT-PSR-RBF for short); the method based on empirical mode decomposition (EMD), PSR, and RBF neural network (EMD-PSR-RBF for short); the method based on VMD, PSR, and BP neural network (VMD-PSR-BP for short); and the method based on VMD, PSR, and generalized regression neural network (GRNN) (VMD-PSR-GRNN for short). The experimental data are the No. 100 ECG data, with 1845 data points in the training set and 923 data points in the test set. The experimental results are shown in Table 7.
Table 7 shows that the prediction performance of the proposed method is superior to that of WT-PSR-RBF, EMD-PSR-RBF, VMD-PSR-BP, and VMD-PSR-GRNN.
We also compared the prediction method proposed in this paper with single deep learning prediction methods. The comparison results are shown in Table 8 and Figure 11.
Table 8 shows that the RMSE and MAE of the proposed method are clearly smaller than those of LSTM, MLP, and CNN. In Figure 11, VMD_CAO_LSTIM represents the prediction method in this paper. Figure 11 shows that the prediction error of the proposed method is smaller than that of LSTM, CNN, and MLP. Together, Table 8 and Figure 11 show that the proposed method outperforms LSTM, CNN, and MLP.
Author Contributions
Conceptualization, T.Q. and F.H.; methodology, F.H.; software, L.W. and F.H.; validation, G.O., X.Z. and F.H.; formal analysis, X.Z.; investigation, D.W.; resources, T.Q.; data curation, F.H.; writing—original draft preparation, F.H.; writing—review and editing, L.W.; visualization, X.Z.; supervision, G.O. and D.W.; project administration, X.Z.; funding acquisition, T.Q. All authors have read and agreed to the published version of the manuscript.