Comparing Performance of Nonlinear Complexity Metrics for Assessing Quality of ECGs Collected in Real Time

We compared the performance of a novel encoding Lempel-Ziv complexity (ELZC) with that of approximate entropy (ApEn), sample entropy (SmpEn) and permutation entropy (PerEn) as nonlinear metrics for assessing ECG quality. First, to compare how well the methods discern randomness and inherent nonlinear properties within time series, we calculated the four nonlinear complexity values on several typical artificial time series, i.e., Gaussian noise, two kinds of noisy time series, two kinds of Logistic series and a periodic series. Then, to analyze the sensitivity of the four complexity methods to the level of different types of noise within ECG recordings, we investigated the trends of ELZC, ApEn, SmpEn and PerEn in several synthetic ECG recordings containing different types of noise (i.e., baseline wander, muscle artefacts, electrode motion, power line and mixed noise) at different signal-to-noise ratios (i.e., 15, 10, 5, 0, −5 and −10 dB). Finally, the four complexity methods were employed to classify the quality of real ECG recordings from the PhysioNet/Computing in Cardiology Challenge 2011 (CinC 2011) of the MIT databases, and receiver operating characteristic (ROC) curves and their corresponding areas under the curve (AUC) were obtained. The results showed that ELZC could not only distinguish randomness from chaos within time series but also reflect the noise level within time series, and the highest AUCs of PerEn, ELZC, SmpEn and ApEn were 0.850, 0.695, 0.474 and 0.461, respectively. The results demonstrated that PerEn and ELZC were more effective than ApEn and SmpEn for assessing ECG quality.


Introduction
In practice, a considerable number of ECG recordings collected via wearable devices and smartphones cannot be used in the clinic because their quality is so poor that the main waveform cannot be identified, so the quality of ECG recordings collected in real time has to be classified before diagnosis [1][2][3].
At present, most methods for assessing ECG quality employ time and frequency domain features extracted from ECG recordings. Langley et al. [4] relied only on time domain features, i.e., flat lines, saturation, baseline drift, high and low amplitude and steep slope, to assess the quality of ECG recordings, and the corresponding assessment accuracy was 85.7% on test data. Similarly, Kužílek et al. [5] summarized six waveform rules to assess ECG quality and obtained 83.6% on test data. In most cases waveform features are easy to identify and calculate; however, a quality assessment method has poor generalization ability when it relies on waveform features alone. Frequency domain features such as the power spectral density (PSD) reflect the energy distribution within a time series and thus help to extract detailed inherent information from physiological series, so PSD has often been used for assessing the quality of ECG recordings [6][7][8]. In [6], Zaunseder et al. proposed an assessment method based on thirty-five frequency domain features generated from the PSD; the method achieved 90.4%, higher than the other methods based only on time domain features. In fact, some ECG recordings can be classified as unacceptable by obvious waveform features, i.e., flat line and high amplitude, so assessment methods that combine waveform features with frequency features not only improve assessment accuracy but also improve calculation efficiency. Clifford et al. [7] and Zhang et al. [8] each proposed an assessment method based on a combination of time and frequency features. Compared with the previous methods, the multi-feature combination methods provide more comprehensive information about the ECG recordings. However, it is difficult to further improve the classification accuracy of these methods, and their generalization ability also needs to be improved.
It is therefore necessary to find new features that reflect the inherent nature of physiological series. The ECG signal is nonlinear in nature, so nonlinear complexity approaches can be used to extract the inherent nonlinear properties of ECG recordings and thereby assess the quality of ECG signals [8,9]. Some nonlinear complexity approaches, i.e., classical Lempel-Ziv complexity (CLZC), approximate entropy (ApEn), sample entropy (SmpEn) and permutation entropy (PerEn), have been extensively used for biomedical signal analysis [10][11][12][13]. Zhang et al. [14] evaluated the performance of CLZC on ECG quality assessment and concluded that CLZC was sensitive only to high frequency noise. In practice, however, ECG signals are usually contaminated by mixed noise including high frequency, low frequency and power line noise, so the performance of CLZC is not satisfactory when it is used as a quality assessment metric. In addition, ApEn, SmpEn and PerEn have not yet been reported for assessing ECG quality. Zhang et al. [15] proposed an encoding Lempel-Ziv complexity (ELZC) to analyze the nonlinear complexity of physiological signals, and the approach discerns random and chaotic character within time series better than other LZ complexity approaches. In this study, we compared ELZC with ApEn, SmpEn and PerEn as quality metrics for assessing ECG quality.
In this study, we first calculated ELZC, ApEn, SmpEn and PerEn of six typical artificial time series, i.e., Gaussian noise, two kinds of noisy series containing different levels of noise, two kinds of Logistic series (different μ) and a periodic series, so as to compare the performance of the four complexity methods in discerning randomness and the inherent nonlinear properties within time series. Then we calculated the four complexity values on synthetic ECG signals with different types of noise (i.e., baseline wander (BW), muscle artefacts (MA), electrode motion (EM), power line (PL) and mixed noise) and different signal-to-noise ratios (SNR), so as to analyze whether the four complexity methods can reflect different types of noise and to evaluate their sensitivity to the noise level of the synthetic ECG signals. Finally, we applied ELZC, ApEn, SmpEn and PerEn to real ECG recordings and evaluated quality classification by the receiver operating characteristic (ROC) curve and the corresponding area under the curve (AUC). The study is organized as follows: Section 2 describes the four complexity methods and the datasets, including artificial time series and real ECG signals. The experimental results and discussion are presented in Section 3. Section 4 concludes the study.

Artificial synthetic and real ECG signals
To present a comprehensive study, we used three types of datasets: six typical artificial time series, synthetic ECG recordings, and the real ECG recordings from the MIT database of the PhysioNet/Computing in Cardiology Challenge 2011 (CinC 2011). First, six typical artificial time series were employed. Gaussian (Gau) noise represented pure noise. MIX(0.4) means that 40 percent of the points of a periodic series of length N were randomly chosen and replaced with independent identically distributed random noise; it was used as a mixed noise series, and MIX(0.2) was defined similarly. Logistic (Logi) mapping series represent nonlinear series and are defined as $x_{n+1} = \mu x_n (1 - x_n)$, where 1<μ≤4. In this study, Logi(4.0) represented the Logistic series with μ set to 4.0, and Logi(3.8) was defined similarly. Logi(3.5) was defined in the same way and represented the periodic series.
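As an illustration of the artificial series described above, the following Python sketch generates Logistic and MIX(p) series. It is only a sketch under stated assumptions: the function names, the initial value `x0`, the burn-in length `discard`, the sinusoidal base signal with `period=12` and the uniform replacement noise are all hypothetical choices that the text does not specify.

```python
import numpy as np

def logistic_series(mu, n, x0=0.1, discard=1000):
    """Iterate x_{k+1} = mu * x_k * (1 - x_k); return n samples after a burn-in.

    x0 and discard are illustrative assumptions, not values from the study.
    """
    x = x0
    out = np.empty(n)
    for k in range(discard + n):
        x = mu * x * (1.0 - x)
        if k >= discard:
            out[k - discard] = x
    return out

def mix_series(p, n, period=12, rng=None):
    """Periodic series in which a fraction p of the points is replaced by i.i.d. noise.

    The sine base signal and the uniform noise are assumptions for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = np.sqrt(2.0) * np.sin(2.0 * np.pi * np.arange(n) / period)
    noise = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=n)
    # Choose round(p * n) distinct positions and overwrite them with noise.
    replaced = rng.choice(n, size=int(round(p * n)), replace=False)
    out = base.copy()
    out[replaced] = noise[replaced]
    return out

if __name__ == "__main__":
    logi40 = logistic_series(4.0, 500)   # chaotic regime
    logi35 = logistic_series(3.5, 500)   # settles onto a period-4 cycle
    mix04 = mix_series(0.4, 500, rng=np.random.default_rng(0))
    print(logi40[:3], logi35[:4], mix04[:3])
```

With μ = 3.5 the iteration settles onto a period-4 cycle, which is consistent with Logi(3.5) serving as the periodic series.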

Encoding Lempel-Ziv complexity (ELZC)
The classical LZ complexity consists of two steps. First, the original time series is transformed into a binary symbolic sequence by comparing each point with the mean or median of the original series; then the LZ value of the binary sequence is calculated. In this study, the original series was instead transformed into an 8-state symbolic (3-bit binary) sequence by an encoding scheme.
Each xi within the original signal X=x1, x2, …, xn is transformed into a 3-bit binary symbol b1(i)b2(i)b3(i). The process consists of three steps and is described as follows [15]:
Step 1, b1(i) is determined by comparing xi with the mean of signal X: b1(i) is set to 0 when xi is less than the mean, otherwise b1(i) is 1.
Step 2, b2(i) is 0 when the difference between xi and xi-1 is less than 0, otherwise b2(i) is set to 1. Initially, b2(1) is set to 0.
Step 3, the calculation of the third digit b3(i) is relatively complex. A variable Flag is first defined using dm, the mean distance between adjacent points within signal X, and b3(i) is subsequently calculated from it (b3 is initially set to 0); the defining expressions are given in [15].
After the symbolization, the LZ value of the new symbolic sequence B is calculated; the calculation is detailed in [15]. The LZ complexity counter c(n) is closely related to the new subsequences of consecutive characters within the symbolic sequence. Each time a new subsequence Q=s(r+1), …, s(r+i) is found and c(n) is increased by one, the two subsequences S and Q of B are updated to S=s(1), s(2), …, s(r+i) and Q=s(r+i+1). This procedure is repeated until Q is the last character of B. The LZ complexity counter c(n) may be normalized as

$$C(n) = \frac{c(n)}{n / \log_{\alpha} n}$$

where n is the length of signal X and α is the number of possible symbols contained in the new sequence.
Generally, the normalized complexity C(n) is considered instead of c(n) in practice.

ApEn and SmpEn
Pincus proposed ApEn as a metric to quantify the regularity of a time series [19]; it measures the probability that new patterns appear within the time series when the dimension increases from m to m+1. The process of calculating ApEn is described as follows. Let S=s1, …, sN be a time series of length N, from which vectors xi of embedding dimension m are reconstructed. The distance dij between two vectors xi and xj is the maximum absolute difference between their corresponding components:

$$d_{ij} = \max_{k=0,1,\ldots,m-1} \left| s_{i+k} - s_{j+k} \right|$$

For each vector xi, the number of distances dij within r×SD is counted, where SD is the standard deviation of the time series S, and the ratio of this number to the total number of vectors N-m+1 is denoted $C_i^m(r)$.
Then the average degree of similarity over all i is defined as

$$\phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$$

Similarly, when the embedding dimension is m+1, the corresponding $C_i^{m+1}(r)$ and $\phi^{m+1}(r)$ can be obtained. Then, the ApEn is described as follows:

$$\mathrm{ApEn}(m, r) = \lim_{N \to \infty} \left[ \phi^m(r) - \phi^{m+1}(r) \right]$$

In fact, the length N is not infinite, so the ApEn can be calculated by Eq. 12 when N is a finite number:

$$\mathrm{ApEn}(m, r, N) = \phi^m(r) - \phi^{m+1}(r) \qquad (12)$$

SmpEn is a modified complexity method based on ApEn. Compared with ApEn, SmpEn does not include self-matching patterns, and it is also largely independent of the data size. SmpEn can be given as

$$\mathrm{SmpEn}(m, r) = -\ln \frac{B^{m+1}(r)}{B^m(r)}$$

where $B^m(r)$ is the probability that two m-dimension vectors will match, $B^{m+1}(r)$ is the probability that two (m+1)-dimension vectors will match, and m is the embedding dimension [20].
In this study, both ApEn and SmpEn were calculated with r = 0.15×SD, where SD is the standard deviation of the data series S, and m = 2.
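The ApEn and SmpEn calculations described above can be sketched in Python as follows. This is a minimal illustration and not the implementation used in this study: the function names are hypothetical, the maximum-norm (Chebyshev) distance and the use of N-m templates for both dimensions in SmpEn follow the commonly used definitions and are assumptions here, and the pairwise distance matrix needs memory of order N², so the sketch suits only short series.

```python
import numpy as np

def _embed(s, m):
    """Stack the overlapping vectors x_i = (s_i, ..., s_{i+m-1})."""
    n = len(s) - m + 1
    return np.array([s[i:i + m] for i in range(n)])

def _chebyshev(x):
    """Pairwise maximum-norm distances d_ij between the rows of x."""
    return np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)

def apen(s, m=2, r_factor=0.15):
    """Approximate entropy with tolerance r = r_factor * SD (self-matches included)."""
    s = np.asarray(s, dtype=float)
    r = r_factor * np.std(s)

    def phi(mm):
        d = _chebyshev(_embed(s, mm))
        c = np.mean(d <= r, axis=1)      # C_i^mm(r); never zero because d_ii = 0
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

def sampen(s, m=2, r_factor=0.15):
    """Sample entropy with tolerance r = r_factor * SD (self-matches excluded)."""
    s = np.asarray(s, dtype=float)
    r = r_factor * np.std(s)
    n = len(s) - m                       # same number of templates for m and m+1

    def matches(mm):
        d = _chebyshev(_embed(s, mm)[:n])
        return np.sum(d <= r) - n        # remove the n self-matches on the diagonal

    b, a = matches(m), matches(m + 1)
    return -np.log(a / b)                # infinite when no (m+1)-matches exist

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(300)
    sine = np.sin(2.0 * np.pi * np.arange(300) / 20.0)
    print(apen(sine), apen(noise), sampen(sine), sampen(noise))
```

On such inputs both measures should come out clearly lower for the sinusoid than for the white noise, in line with their interpretation as regularity measures.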

PerEn
The PerEn algorithm is detailed in [21][22][23][24]. A time series {si, i = 1, 2, …, n} is first reconstructed to generate the following matrix in phase space:

$$\begin{bmatrix} s_1 & s_{1+\tau} & \cdots & s_{1+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ s_i & s_{i+\tau} & \cdots & s_{i+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ s_k & s_{k+\tau} & \cdots & s_{k+(m-1)\tau} \end{bmatrix}$$

where m is the embedding dimension, τ is the delay time, and k=n-(m-1)×τ. Each row of the matrix represents a reconstructed component, so there are k reconstructed components in total. The m real values of the ith reconstructed component Si can be arranged in increasing order as

$$s_{i+(j_1-1)\tau} \le s_{i+(j_2-1)\tau} \le \cdots \le s_{i+(j_m-1)\tau}$$

If two or more elements are equal, e.g. $s_{i+(j_1-1)\tau} = s_{i+(j_2-1)\tau}$, they are sorted according to the values of j1 and j2, that is, $s_{i+(j_1-1)\tau}$ is placed before $s_{i+(j_2-1)\tau}$ when j1<j2. Accordingly, each reconstructed component can be transformed into a symbol sequence

$$S(l) = (j_1, j_2, \ldots, j_m)$$

where l=1, 2, …, k and k≤m!, and m! is the largest number of distinct symbols. The symbol sequence S(l) is one of the possible arrangements of the m symbols (j1, j2, …, jm). If the probabilities of occurrence of the symbol sequences are p1, p2, …, pk, respectively, the PerEn of the k different symbol sequences of the time series si can be defined in terms of Shannon entropy as [24]:

$$H_p(m) = -\sum_{l=1}^{k} p_l \ln p_l$$

When all the symbol sequences have the same probability, namely pl=1/m!, Hp(m) reaches its maximum value ln(m!). Finally, Hp(m) can be normalized as [24]:

$$H_p = \frac{H_p(m)}{\ln(m!)}$$

The value of Hp represents the degree of randomness of the time series si and describes the local order structure of the time series. The smaller the value of Hp, the more regular the time series is. Changes in Hp can reflect and magnify minute details of the time series [21].
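A compact Python sketch of the PerEn calculation described above is given below. The function name and the default values `m=3` and `tau=1` are illustrative assumptions, since the text does not state the parameters used; ties are ordered by index through a stable sort, matching the tie rule above.

```python
import math
from collections import Counter

import numpy as np

def permutation_entropy(s, m=3, tau=1, normalize=True):
    """Permutation entropy of a 1-D series (m and tau defaults are assumptions)."""
    s = np.asarray(s, dtype=float)
    k = len(s) - (m - 1) * tau           # number of reconstructed components
    patterns = Counter()
    for i in range(k):
        window = s[i:i + (m - 1) * tau + 1:tau]
        # A stable argsort orders equal values by their index, i.e. j1 before j2.
        patterns[tuple(np.argsort(window, kind="stable"))] += 1
    p = np.array(list(patterns.values()), dtype=float) / k
    h = -np.sum(p * np.log(p))           # Shannon entropy of the ordinal patterns
    return h / math.log(math.factorial(m)) if normalize else h

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(permutation_entropy(np.arange(50)))              # single pattern
    print(permutation_entropy(rng.standard_normal(5000)))  # near the maximum
```

A strictly increasing series produces a single ordinal pattern and hence zero entropy, whereas white noise populates all m! patterns almost uniformly, so its normalized value approaches one.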

Testing performance of ELZC, ApEn, SmpEn and PerEn for discerning randomness and nonlinear properties within time series
A good quality metric should be able to discern randomness from the inherent nonlinear properties within physiological signals.
Sensitivity (Se) and specificity (Sp) are defined as

$$Se = \frac{TP}{TP + FN}, \qquad Sp = \frac{TN}{TN + FP}$$

where true positive (TP) is the number of recordings identified as unacceptable among the unacceptable ECG recordings, true negative (TN) is the number of recordings identified as acceptable among the acceptable ECG recordings, and FN and FP are the numbers of misclassified unacceptable and acceptable recordings, respectively. Normalization of the data was necessary in this study because the ranges of the values obtained by the aforementioned methods (i.e., ApEn, SmpEn, PerEn and ELZC) were different, so the Min-Max normalization approach was employed. The approach is described as follows:

$$x^{*} = \frac{x - \min(x)}{\max(x) - \min(x)}$$

where x and x* are the original series and the new series after normalization, respectively.
The complexity of pure PL noise is nearly zero because PL noise is a periodic signal, so the complexity of the clean ECG recordings plus PL noise should be higher than that of the periodic signal. Furthermore, in contrast to the complexity of the periodic signal, that of the clean ECG plus PL noise should increase with increasing SNR because the nonlinear properties within the signal become more prominent.
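The Se and Sp definitions and the Min-Max normalization, combined with the threshold sweep used in this study (thresholds from 0 to 1 in steps of 0.005, with a recording marked as unacceptable when its complexity value exceeds the threshold), can be sketched in Python as follows. The function names are hypothetical, the sketch assumes that both acceptable and unacceptable recordings are present, and the trapezoidal AUC with added end points (0, 0) and (1, 1) is an assumed convention rather than the procedure reported here.

```python
import numpy as np

def min_max_normalize(x):
    """Map x linearly onto [0, 1]; a constant series is mapped to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def roc_by_threshold(values, unacceptable, step=0.005):
    """Sweep thresholds over normalized complexity values and return Se and Sp.

    values: one complexity value per recording.
    unacceptable: boolean labels, True for an unacceptable recording.
    """
    v = min_max_normalize(values)
    y = np.asarray(unacceptable, dtype=bool)
    thresholds = np.arange(0.0, 1.0 + step / 2.0, step)
    se = np.empty(len(thresholds))
    sp = np.empty(len(thresholds))
    for k, t in enumerate(thresholds):
        flagged = v > t                      # marked as unacceptable
        tp = np.sum(flagged & y)
        fn = np.sum(~flagged & y)
        tn = np.sum(~flagged & ~y)
        fp = np.sum(flagged & ~y)
        se[k] = tp / (tp + fn)
        sp[k] = tn / (tn + fp)
    return thresholds, se, sp

def auc_from_se_sp(se, sp):
    """Trapezoidal area under the ROC curve (Se against 1 - Sp)."""
    # Thresholds increase, so 1 - Sp decreases; reverse to integrate left to right.
    fpr = np.concatenate(([0.0], (1.0 - sp)[::-1], [1.0]))
    tpr = np.concatenate(([0.0], se[::-1], [1.0]))
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

if __name__ == "__main__":
    values = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
    labels = [False, False, False, True, True, True]
    _, se, sp = roc_by_threshold(values, labels)
    print(auc_from_se_sp(se, sp))
```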

Conclusions
In this study, we compared a novel ELZC complexity method with ApEn, SmpEn and PerEn for ECG quality assessment.

Figure 1. The real ECG recordings and the artificial synthetic noisy ECG recordings with SNR from -10 to 15 dB: (a) the real and the clean plus BW noise; (b) the real and the clean plus EM; (c) the real and the clean plus MA; (d) the real and the clean plus PL; and (e) the real and the clean plus mixed noise.
To calculate the counter c(n), the symbolic sequence B is scanned from left to right, and c(n) is increased by one unit whenever a new subsequence is encountered. Let S and Q denote two subsequences of B, and let SQ be the concatenation of S and Q. SQπ is the subsequence obtained from SQ by deleting its last character (π is the operation of deleting the last character of a sequence), and v(SQπ) is the vocabulary of all subsequences of SQπ. Initially, c(n)=1, S=s(1) and Q=s(2), so SQπ=s(1). In general, S=s(1), s(2), s(3), …, s(r) and Q=s(r+1), so SQπ=S. If Q belongs to v(SQπ), it is a subsequence of SQπ rather than a new sequence; Q is then extended to s(r+1), s(r+2) and tested again against v(SQπ). This process is repeated until Q=s(r+1), s(r+2), …, s(r+i) is a new sequence rather than a subsequence of SQπ, at which point c(n)=c(n)+1.
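The scanning rule above can be illustrated with a short Python sketch. It is a minimal, unoptimized illustration rather than the implementation of [15]: the function names are hypothetical, counting a trailing unfinished Q as one more subsequence is an assumed convention, and the normalization by n / log_α(n) is the commonly used form, assumed here.

```python
import math

def _contains(hay, needle):
    """True when needle occurs as a run of consecutive symbols in hay."""
    k = len(needle)
    return any(hay[j:j + k] == needle for j in range(len(hay) - k + 1))

def lz_counter(symbols):
    """Count c(n) by scanning the symbol sequence with the S / Q rule."""
    seq = list(symbols)
    n = len(seq)
    if n == 0:
        return 0
    c = 1          # the first symbol always forms a new subsequence
    r = 1          # S = seq[:r]
    i = 1          # Q = seq[r:r + i]
    while r + i <= n:
        q = seq[r:r + i]
        sq_pi = seq[:r + i - 1]          # SQ with its last character deleted
        if _contains(sq_pi, q):
            i += 1                       # Q belongs to v(SQ pi): extend Q
        else:
            c += 1                       # Q is a new subsequence
            r += i
            i = 1
    if i > 1:
        c += 1                           # assumed: count the unfinished last Q
    return c

def lz_normalized(symbols, alpha):
    """Normalized complexity C(n) = c(n) * log_alpha(n) / n (assumed form)."""
    n = len(symbols)
    if n < 2:
        return 0.0
    return lz_counter(symbols) * math.log(n, alpha) / n

if __name__ == "__main__":
    print(lz_counter("0001101001000101"))        # parsed as 0.001.10.100.1000.101
    print(lz_normalized("0001101001000101", alpha=2))
```

For the 3-bit encoding used by ELZC, `alpha` would be 8 and `symbols` would hold the 8-state symbols instead of binary characters.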
From the series S=s1, …, sN, a vector xi of embedding dimension m is reconstructed as xi=si, si+1, si+2, …, si+m-1 for 1≤i≤N-m+1. The distance dij between two vectors xi and xj is calculated for 1≤i, j≤N-m+1.

Figure 2 shows the ApEn, SmpEn, PerEn and ELZC values of the six aforementioned artificial time series, i.e., Gau, MIX(0.4), MIX(0.2), Logi(4.0), Logi(3.8) and the periodic series, for three series lengths of 100, 500 and 2000 points. The PerEn and ELZC values decrease monotonically in the order Gau, MIX(0.4), MIX(0.2), Logi(4.0), Logi(3.8) and periodic series for all three lengths. The ApEn values fluctuate on MIX(0.2) and Logi(4.0) when the length is 100, and decrease monotonically in the aforementioned order for lengths of 500 and 2000. The SmpEn values decrease monotonically in the aforementioned order for a length of 100, and fluctuate on MIX(0.2) and Logi(4.0) when the lengths are 500 and 2000 points. These results indicate that the ELZC and PerEn approaches discern randomness and nonlinear properties within time series better than ApEn and SmpEn for all three lengths. Neither ELZC nor PerEn is computed directly on the original time series. The ELZC approach first employs the encoding coarse-graining method to transform the original series into a new symbolic sequence and then calculates the complexity of that sequence. Similarly, PerEn calculates the complexity of a symbolic sequence generated from a reconstructed matrix instead of the original series. For the ELZC and PerEn approaches, the symbolic sequence properly reflects the nonlinear properties within the time series rather than its randomness, because the symbolization preserves the inherent properties of the original series while discarding random components, i.e., noise and unexpected components. In contrast, ApEn and SmpEn are computed on the original time series, so random components within the original series affect their calculation accuracy.
Physiological signals are nonlinear time series rather than random series, so a quality metric should distinguish randomness from nonlinear properties rather than confuse them. ELZC has a satisfactory performance in discerning randomness and nonlinear properties within time series [15]. To compare the performance of the four aforementioned complexity methods, this test was designed to calculate ELZC, ApEn, SmpEn and PerEn on the six typical artificial time series mentioned in Section 2.1. In this test, 20 samples of each type of series were employed, and the length of each sample was 100, 500 or 2000 points.

Analyzing sensitivity of ELZC, ApEn, SmpEn and PerEn to different types noise and different SNR
In ECG quality assessment, a satisfactory quality metric should be able to reflect the different types and levels of noise within physiological series, meaning that the metric is sensitive to the SNR of the signal. To evaluate the sensitivity of ELZC to the noise levels of ECG signals, we compared the ELZC, ApEn, SmpEn and PerEn values of the synthetic noisy ECG signals described in Section 2.1, i.e., the real ECG signals plus BW, EM, MA, PL and hybrid noise, respectively. In this test, 50 samples of each type of synthetic ECG signal were used.

Using Receiver Operating Characteristic (ROC) Curve to verify classification performance of ELZC, ApEn, SmpEn and PerEn
In this test, we calculated the sensitivity (Se) and specificity (Sp) of the ApEn, SmpEn, PerEn and ELZC methods on all leads of the real ECG recordings to generate ROC curves and compare the classification performance of these nonlinear methods. To obtain Se and Sp, we calculated the ELZC of each lead of the real 12-lead ECG recordings from training set A of CinC 2011, and the thresholds were selected from 0 to 1 in steps of 0.005. An ECG recording was marked as unacceptable when its ELZC value was higher than the threshold; otherwise it was recognized as an acceptable recording. To compare assessment performance with ELZC, we also calculated the Sp and Se of ApEn, SmpEn and PerEn using the same method. The two indexes Se and Sp are commonly used for evaluating classification results [15].

Table 1 shows the ELZC, ApEn, SmpEn and PerEn values for five artificial signals, i.e., the clean ECG plus BW, EM, MA, PL and hybrid noise, at several SNRs from -10 to 15 dB in steps of 5 dB. The ELZC values of all synthetic ECG signals except the clean ECG plus PL noise decrease monotonically with increasing SNR. The ApEn values of the clean ECG plus MA noise decrease monotonically with increasing SNR, but those of the other synthetic ECG signals do not. The SmpEn and PerEn values of the clean ECG plus BW noise decrease monotonically with increasing SNR, but those of the other synthetic ECG signals do not. The clean ECG should be a nonlinear signal rather than a random signal, so the randomness within the clean ECG mixed with several types of noise, i.e., BW, EM, MA and the hybrid noise, should increase as SNR decreases. The ELZC values decrease with increasing SNR because the nonlinear properties within the ECG signals, rather than randomness, become more evident as SNR increases. Table 1 also shows that the ELZC value of the clean ECG plus PL noise is 0.0919 when SNR is -10 dB, and that the ELZC values increase with increasing SNR.

Table 1. The ELZC values for clean ECG plus BW, EM, MA and mixed noise on several SNR from -10 to 15 dB with steps of 5 dB.
Figure 3 shows the ROC curves and the corresponding areas under the ROC curve for each lead of the 12-lead ECG recordings. The ROC curves of ApEn and SmpEn on all leads largely overlap and lie below the ROC curves of PerEn and ELZC, so the corresponding AUCs of ApEn and SmpEn are also smaller than those of PerEn and ELZC. PerEn generates the highest ROC curves and obtains the largest AUCs on leads I, II, III, aVR, aVF, aVL, V1 and V2, namely 0.824, 0.744, 0.675, 0.850, 0.647, 0.736, 0.647 and 0.662, respectively. ELZC generates the second highest ROC curves on these 8 leads and obtains the second largest AUCs on them. ELZC obtains the largest AUCs of 0.609, 0.336, 0.695 and 0.532 on leads V3, V4, V5 and V6, respectively. PerEn and ELZC, especially PerEn, have relatively larger AUCs for classifying the quality of physiological series because they perform satisfactorily in distinguishing randomness from the inherent nonlinear properties within time series. Conversely, ApEn and SmpEn yield smaller AUCs and cannot effectively classify the quality of physiological series because they cannot accurately discern randomness and chaos within time series. In practice, many factors other than a low SNR cause poor quality of physiological series. For example, poor electrode contact can degrade a physiological time series so that its main waveform cannot be recognized; however, such a series is not contaminated by noise and therefore keeps a high SNR. In addition, some unacceptable time series have a low SNR because of noise yet retain a quasi-periodic property, so their complexity remains relatively low. This explains why the classification performance of ELZC is lower than that of PerEn.
The experimental results indicate that ELZC and PerEn perform satisfactorily in distinguishing randomness from the inherent nonlinear properties within time series, and that ELZC can efficiently reflect the types and levels of noise contained in physiological signals, because the ELZC values of the synthetic noisy signals decrease monotonically with increasing SNR except for the clean ECG plus PL noise. The experimental results on real ECG recordings indicate that ELZC and PerEn are satisfactory quality metrics for assessing the quality of physiological series. In practice, the ELZC and PerEn metrics have to be combined with other quality metrics, i.e., waveform and frequency metrics, for physiological signal quality assessment.