Machine Learning Techniques and Syntactic Pattern Recognition based Heart Disease Prediction for Smart Health

Cardiovascular disease (CVD) can cause sudden, unexpected loss of life. It affects the heart and the blood vessels of the body. Because CVD can be fatal, early detection is necessary to protect patients' lives. In this chapter, two distinct methods are proposed for the detection of heart disease. The first is a syntactic pattern recognition approach based on a formal grammar, and the second is a machine learning approach. In the syntactic pattern recognition approach, the ECG wave from different leads is first decomposed into pattern primitives based on diagnostic criteria. These primitives are used as the terminals of the proposed grammar and form the input to the grammar. The parsing table is created in tabular form and finally indicates whether the patient has one of the considered diseases or is normal. Five diseases besides the normal case are considered. Different Machine Learning (ML) approaches may be used for detecting patients with CVD and for assisting health care systems. They learn and utilize the patterns discovered in large databases: in a learning stage, a method is applied to a set of information in order to recognize its underlying relationship patterns, and unknown incoming patterns can then be tested against what was learned. Due to its self-adaptive structure, Deep Learning (DL) can process information with minimal processing time. DL is built on neural networks. A predictive model follows DL techniques for analyzing and assessing patients with heart disease. A hybrid approach based on a convolutional layer and a Gated Recurrent Unit (GRU) is used in the chapter for diagnosing heart disease.


Introduction
CVD is among the foremost causes of the high mortality rate across the globe. According to World Health Organization (WHO) statistics, nearly 17.7 million people die of it every year worldwide [1][2]. The human heart together with the blood vessels is known as the cardiovascular system [3].
Coronary artery disease (CAD), heart failure, cardiac arrest, and unexpected cardiac death are due to disorders of the cardiovascular system. These disorders affect people mostly because of uncontrolled habits in their daily life. Fatty deposits, or plaque, build up on the inner walls of the arteries of the heart. These are mainly cholesterol deposits, and the condition is known as atherosclerosis. The deposits may thicken and cause the coronary arteries to narrow. The narrowing prevents blood and oxygen from flowing easily to the heart muscle, so they reach the heart at a reduced rate. This effect tends to develop as a person ages.
Angina (pain, discomfort, or pressure in the chest) is caused by this reduced flow. If blood flow is completely blocked by plaque or by a blood clot that forms inside the narrowed coronary artery, a heart attack may occur. Symptoms of coronary artery disease may include weight gain, weakness, and fatigue [4].
Based on the above discussion, it can be inferred that CVD plays a significant role in human life. Early detection of this disease is necessary for saving patients' lives. CVD often depends on mental anxiety, daily lifestyle, and the working profile of people. Symptoms of anxiety, depression, and stress may often lead to CVD [5]. This chapter detects CVD using two heterogeneous approaches: a syntactic pattern recognition based approach and predictive modeling using a deep learning method.
The first approach uses a grammatical method for cardiac disease diagnosis [6]. In this method, a patient data matrix is constructed first [6] and is used for the classification of diseases. Pattern primitives are identified based on diagnostic criteria, which are obtained from the updated diagnostic criteria published by the American Heart Association and from the medical literature [7]. An input string is generated from the patient data matrix. A context-free grammar in Chomsky normal form is used to form the production rules for six classes, namely five diseases and Normal. The Cocke-Younger-Kasami (CYK) algorithm is used for parsing the input string [6]. At the end, the parsing table highlights the occurrence of disease. In the second approach, an automated predictive model is used for CVD detection. Early heart disease can be predicted by supervised machine learning approaches that take a patient's record as input. Classification methods are implemented to explore the problem of heart disease detection. A classifier associates input variables with target classes based on training data. The attributes comprise patient details such as serum cholesterol. These features can form a good feature space for recognizing patients with cardiac symptoms. The proposed models analyse the patients' past health records and predict their chances of developing cardiac trouble. This prediction in turn helps doctors make well-informed decisions and prescribe medicines and surgeries accordingly [8].
Heart disease detection by means of machine learning is one of the approaches this chapter focuses on. In order to diagnose CVD, it is necessary to extract knowledge from the patient's health history database and to identify the relationship between interfering factors and heart disease probability. The proposed methods capture the relevant health records of a patient and discover the tendency towards heart disease. Timely detection and screening play a leading role in the prevention of heart attacks. Deep learning (DL) [9] is implemented in this chapter for heart disease prediction by means of medical data. Two models are presented for this purpose. The first is a Recurrent Neural Network (RNN)-based model that assembles multiple Long Short-Term Memory (LSTM) [10] layers, where LSTM is a variant of RNN. This neural network classifier receives all interfering factors as features and identifies patients with heart disease. The second model consists of multiple GRU layers. A comparative study of the two models is carried out, and the better model for the CVD classification problem is selected on the basis of that study.

Related Works
CVDs are the principal cause of mortality worldwide, and the annual death toll may reach approximately 23.6 million by 2030 [11]. The largest contributor to CVDs is coronary heart disease (CHD), whose main cause is damage to the arterial wall. The most common indicator of CHD is Myocardial Infarction (MI). Angina pectoris is the first symptom of the pathology for 50% of patients [11]. Immediate diagnosis of CHD patients can save lives. Image processing techniques can help in the early detection of CHD.
For heart disease detection, an electrocardiogram (ECG) is performed first. Computer-based analysis of the ECG started in the late 1950s. Researchers diagnose disease using non-syntactic, syntactic, and hybrid methods [6,12]. The syntactic method is used for analyzing the ECG pattern. It is not widely used in pattern analysis, and only a few works have been reported to date, each looking at specific aspects of the problem. Peak recognition in ECGs using a context-free grammar is described in [12].
A pattern in the syntactic approach is considered to have a complex construction, which is decomposed into sub-patterns that are in turn decomposed into simpler sub-patterns, and so on. In cardiology, an ECG signal pattern is likewise treated as a linear structure consisting of separable substructures that describe the different phases of the heartbeat (e.g. P wave, T wave, ST segment, QRS complex). A set of such structures is perceived as a formal language. Words (structural patterns) are analyzed by formal automata, which are able not only to identify the proper categories (diseases) for patterns but also to characterize their structural features. Therefore, syntactic pattern recognition is convenient if the goal of ECG analysis is a descriptive structural characterization rather than only classification (i.e. assigning an ECG signal to one of the classes of heart dysfunction) [13]. The electrocardiogram (ECG) is a common but vital sign in the clinical environment, and its analysis often reveals many cardiac disorders.
The existing literature on automatic ECG classification is grouped into clusters for the review of the classification process. In an ECG-based computer-aided diagnosis system, unwanted information in the ECG waves must be removed, the parts of the ECG wave detected, and the heartbeats classified for a proper diagnosis of disease. Two approaches are discussed here. One is grammar-based classification of diseases and the other is a machine learning based hybrid approach for classification of diseases [14][15].
Machine Learning (ML), a specialized field of AI, can be used in healthcare to analyze numerous data points, recommend outcomes, provide timely risk scores, define resource allocation, and deliver many other applications. ML creates opportunities for improving clinical decision support. ML techniques are often related to data mining procedures. From the data mining point of view, an enormous amount of data is examined and a particular outcome is set based on the examined data; ML focuses on achieving that goal by using the harvested data to model an intelligent automated tool. By implementing data mining rules, data related to coronary illness is extracted from a large database. For this purpose, weighted association rule mining is implemented in [16]. Heart disease is predicted by applying rule mining algorithms to a patients' dataset. The prediction achieved 61% training accuracy and 53% testing accuracy.
Historical medical data is utilised to predict heart disease using ML techniques [17]. The 462 instances of the South African Heart Disease dataset are used for prediction. All the algorithms are validated using 10-fold cross validation. The probabilistic Naïve Bayes classifier performed better than the other classifiers [17].
Heart Failure (HF) is classified into categories such as HF with preserved ejection fraction (HFPEF) and HF with reduced ejection fraction (HFREF) [18]. Various methods are used for detecting patients with heart failure. For classification, these include classification trees, random forests, bagged classification trees, boosted classification trees, and SVMs; for prediction, they include logistic regression, regression trees, bagged regression trees, random forests, and boosted regression trees. These methods are utilised for detecting patients in the aforementioned categories of heart failure. Most of them are tree-based methods for predicting and classifying HF subtypes.
K. Gomathi et al. [19] predicted heart disease using the Naïve Bayes classifier and the J48 classifier. They concluded that the Naïve Bayes classifier reaches an accuracy of 79%, whereas the J48 classifier reaches 77%. P. Sai Chandrasekhar Reddy et al. used an ANN for predicting heart disease by considering relevant features such as heart rate and blood pressure [20]. Boshra Brahmi et al. [21] employed several classification techniques, such as J48, KNN, SMO, and Naïve Bayes, for diagnosing heart disease. In [22], instead of focusing on feature selection, emphasis is given to all relevant features for heart disease diagnosis and prediction; the prediction model is built by combining Random Forest with a linear model. Another study considered arrhythmia, an irregular change of the normal heart rhythm, as the prediction target [23]. Arrhythmia is predicted there by a CNN that accepts ECG signals as input.

Datasets
Datasets containing ECG waves are collected from hospitals in West Bengal. Middle-aged people in the age range of 40 to 70 are considered. Other recordings are taken from the American Heart Association [12].
This study implements deep learning for computer-aided classification. The UCI machine learning repository is used for predicting the cardiac disorder of a patient. The dataset contains various attributes [24], and the attribute "target" is used as the output class of the prediction. Figure 1 presents the overall histogram representation of the dataset. After collecting the dataset, pre-processing techniques are applied to obtain a balanced dataset: handling of NaN values, and scaling and transformation of attributes such as age and cholesterol level. This helps the classifier obtain better predictive results. The pre-processed data is divided in a 67:33 ratio into training and testing datasets. The training data is given as input to the classifier model for the learning process, after which the testing dataset is used for obtaining prediction results. The distribution of cardiac and non-cardiac patients in the dataset is shown in Figure 2 and Table 1.
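The pre-processing and splitting steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions: mean imputation for NaN values, min-max scaling, a seeded random shuffle, the function names, and the toy records are all hypothetical choices, since the chapter does not specify which techniques were actually applied.

```python
import math
import random

def fill_missing(column):
    """Replace NaN entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in column if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(v) else v for v in column]

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1] (assumed scaling method)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def split_rows(rows, test_fraction=0.33, seed=0):
    """Shuffle the rows and split them 67:33 into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy records (age, cholesterol, target); the values are made up for illustration.
ages = min_max_scale(fill_missing([63.0, 41.0, float("nan"), 57.0, 52.0, 70.0]))
chol = min_max_scale([233.0, 204.0, 250.0, 354.0, 199.0, 286.0])
target = [1, 1, 0, 1, 0, 0]

rows = list(zip(ages, chol, target))
train_rows, test_rows = split_rows(rows)
```

Fixing the seed keeps the split reproducible, which matters when two models are later compared on the same testing data.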

Proposed Method 1: Pattern Recognition with Syntactic Recognition based Approach
While detecting coronary artery disease, contaminated recordings create a major problem, so pre-processing is highly recommended before the ECG waves are analysed. A low-pass filter and a high-pass filter are used for removing noise, and a 50 Hz notch filter is used against power source interference. Figure 3 shows a normal ECG wave. The ECG wave is recorded through a 12-lead system, with six limb leads and six chest leads.
Figure 3: Normal ECG waveform and its feature patterns [25].
The P, Q, R, S, and T waves together are called the PQRST complex. The "R-R interval" corresponds to one cardiac cycle. The following parameters are also used for the diagnosis of heart disease from the ECG wave.

1. The PR interval, also specified as the PQ interval, runs from the start of the P wave to the beginning of the Q wave. The horizontal portion from the end of the P wave to the beginning of the Q wave is known as the PR segment. The depolarization wave is identified by this segment.

2. The ST segment runs from the end of the S wave to the start of the T wave.

3. The QT interval runs from the start of the Q wave to the end of the T wave [25].
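As an illustration of the power-line filtering step mentioned above, a 50 Hz notch can be sketched as a second-order IIR (biquad) filter. The sampling rate of 500 Hz, the quality factor of 30, the function names, and the synthetic sine inputs are assumptions made for this sketch; the chapter does not state the filter design that was actually used.

```python
import math

def notch_coefficients(f0, fs, q=30.0):
    """Biquad notch coefficients (b, a), normalised so that a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad_filter(b, a, signal):
    """Apply a second-order IIR filter in direct form I."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

FS = 500.0  # assumed sampling rate in Hz
b, a = notch_coefficients(50.0, FS)
t = [n / FS for n in range(2000)]  # four seconds of samples

# Synthetic test inputs: pure mains interference and a slow 5 Hz component.
mains = [math.sin(2.0 * math.pi * 50.0 * ti) for ti in t]
slow = [math.sin(2.0 * math.pi * 5.0 * ti) for ti in t]

mains_out = biquad_filter(b, a, mains)
slow_out = biquad_filter(b, a, slow)
```

After the initial transient, the 50 Hz component is suppressed while the 5 Hz component passes almost unchanged, which is the behaviour wanted for removing power source interference without distorting the slower ECG content.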
Diagnosing cardiac diseases with syntactic methods of pattern recognition is the main aim of the first approach [6]. The patient data matrix is first used to generate the grammar in Chomsky normal form. The grammar is context free since consecutive pattern primitives do not depend on each other. The pattern primitives are treated as the terminals of the grammar, and non-terminals are created on top of the terminals for the classification of heart diseases. The diagnostic criteria are taken from the updated criteria published by the American Heart Association and from a review of the medical literature [14][15]. The initial string is built from the pattern primitives and the patient data matrix, and this input string must be parsed by the production rules of the grammar. The production rules are generated from the terminals and non-terminals and are written in Chomsky normal form. They follow the definition of a context-free language and are developed from the diagnostic rules that describe five cardiac diseases besides the normal ECG. The Cocke-Younger-Kasami algorithm is used to parse the input string [6]. If the patient's ECG shows any abnormality, the first column of the top row of the parsing table shows the disease.
Right and left bundle branch block, left ventricular hypertrophy, left anterior hemiblock, and left atrial hypertrophy, hereafter abbreviated to RBBB, LBBB, LVH, LAH and LAHI, are chosen as the five diseases. The normal ECG wave is considered as the sixth class. Four of the diseases concern different areas of the ventricles, which are a vital part of the heart. Left atrial hypertrophy is considered since it is common. The aim is to diagnose whether a patient shows normal or abnormal findings. For simplicity, the five diseases are denoted as D1, D2, D3, D4 and D5.
The selection of primitives is both problem-oriented and pattern-dependent, and there is as yet no general solution to this problem [12]. The diseases and abnormal findings are identified as relationships on which decisions about the pattern are based, and these are then transformed into pattern primitives. The patient data matrix is checked against the diagnostic criteria and each check is transformed into a binary decision, i.e. a satisfied primitive or an unsatisfied primitive, depending on whether the particular condition is met. A set of ten primitives, expressed in terms of the notations A1, A2, B, C and D, has been selected for the five diseases.

D2:
(A) QRS duration (T_QRS) is greater than 0.12 s.
(B) A_R, A_S and A_R of each complex is greater than 6 mm in at least one of the leads I, aVL, V5 and V6, or a notched R wave (i.e. the duration of the R wave is greater than 044s) is present in at least one of the leads I, aVL, V5 or V6.

D3:
(A1) A_R is greater than 27 mm in lead V5 or V6.
(A2) Q wave amplitude A_Q or A_S in lead V1 plus A_R in lead V5 or V6 is greater than or equal to 35 mm.
(A3) A_R is greater than or equal to 13 mm in lead aVL.
(A4) A_R in lead I plus A_S in lead III is greater than or equal to 26 mm.
(B) Patient's age is above 30 years.

D4:
(A) Left-axis deviation (LAD) is between -45° and -60°.
(B) Q wave duration is less than or equal to 0.02 s in leads I and aVL.
(C) A, is less than 5 mm in leads I, II, III and aVF.
(D) Normal QRS duration (pure LAH can increase the QRS duration by no more than 0.02 s, thus a QRS duration of 0.11 s indicates the coexistence of RBBB or some other form of ventricular conduction abnormality).

D5:
(A1) Notched P wave amplitude (A_P) and P' wave amplitude (A_P') are greater than 1 mm in leads I, II or aVL.
(A2) A_P is greater than 3 mm in lead I or in lead aVL, or equal to 3.5 mm in lead II.
(B) Overall P wave duration (D_P) is greater than 0.11 s.

Considering the diagnostic criteria specified above, the pattern primitives are selected according to whether each diagnostic criterion is satisfied or not. This is shown in Table 2. An input string is generated from the diagnostic pattern primitives. This input string gives a complete characterization of the disease structure under consideration. The representation produces a disease pattern composed of the basic elements that relate to the disease present in the input ECG wave. The string generation algorithm is described below.
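To make the transformation of diagnostic criteria into binary decisions concrete, the D3 criteria listed above can be sketched as simple threshold checks. The dictionary layout, the function name, and the sample amplitudes are hypothetical, and reading "A_Q or A_S in lead V1" as the larger of the two amplitudes is an assumption of this sketch rather than something the chapter states.

```python
def d3_primitives(amp, age):
    """Evaluate the D3 criteria; True means 'satisfied', False 'unsatisfied'.

    amp maps (wave, lead) to an amplitude in mm.
    """
    r_v5_v6 = max(amp[("R", "V5")], amp[("R", "V6")])
    # Assumption: "A_Q or A_S in lead V1" is taken as the larger amplitude.
    q_or_s_v1 = max(amp[("Q", "V1")], amp[("S", "V1")])
    return {
        "A1": r_v5_v6 > 27,
        "A2": q_or_s_v1 + r_v5_v6 >= 35,
        "A3": amp[("R", "aVL")] >= 13,
        "A4": amp[("R", "I")] + amp[("S", "III")] >= 26,
        "B": age > 30,
    }

# Made-up measurements for illustration only (amplitudes in mm).
sample_amplitudes = {
    ("R", "V5"): 29.0, ("R", "V6"): 22.0,
    ("Q", "V1"): 0.0, ("S", "V1"): 10.0,
    ("R", "aVL"): 9.0, ("R", "I"): 12.0, ("S", "III"): 15.0,
}
flags = d3_primitives(sample_amplitudes, age=58)
```

Each flag corresponds to one entry of the patient data matrix for D3; how the flags are mapped to primitive symbols is defined by Table 2 of the chapter and is not reproduced in this sketch.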

Algorithm
Input: Patient data matrix.
Output: A string z of symbols formed over the alphabet set {a, b, c, e, h, j, k, I, m, n}. Each symbol indicates a pattern primitive in either the 'satisfied' or the 'unsatisfied' condition for the considered disease.
Step 3. If the condition q for the disease p is satisfied, then set z_p = satisfied primitive and go to Step 5.
Step 5. If q is greater than the total number of criteria in disease p, then go to Step 7.
Step 6. Set q = q + 1 and go to Step 3.
Step 7. Concatenate z_p with z_(p-1) to form the complete string z.
Step 8. If p is greater than the total number of considered diseases, then continue; otherwise set p = p + 1 and go to Step 2.
Step 9. Exit.

Assume that the sample electrocardiogram is obtained from a 58-year-old male patient. It has already been processed, which yields the patient data matrix presented in Table 3. To shorten the discussion, we denote the following partial information from Table 3.
After collecting the above information, the input string is obtained using the string generation algorithm described above. Once string generation is complete, the next task is to identify diseases by means of syntax analysis. The proficiency of a syntax analyser depends mainly on the grammar that generates the language and on the parser that evaluates the syntactic correctness of an input string. A context-free language in Chomsky normal form is used to describe the considered normal and disease patterns. The names of the diseases, together with other symbols, are taken as non-terminals. The names used for such non-terminals are the same as those in conventional ECG nomenclature so that they can be easily understood. The diagnosis grammar describes the normal pattern as well as the five disease patterns.
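The string generation steps listed earlier can be sketched as two nested loops, one over the diseases p and one over the criteria q of each disease. Several points are assumptions of this sketch: the steps not listed in the text are taken to initialise the counters and to emit the unsatisfied primitive when a condition fails, the per-disease strings are appended in disease order, and the symbol table below is hypothetical, because the real assignment of symbols to criteria is given in Table 2 and is not reproduced here. The helper `compress` writes runs with a repeat count, similar to the exponent notation used for input strings in this chapter.

```python
from itertools import groupby

# Hypothetical (satisfied, unsatisfied) symbols per disease, illustration only.
SYMBOLS = {
    "D1": ("a", "b"),
    "D2": ("c", "e"),
    "D3": ("h", "j"),
}

def generate_string(patient_matrix, symbols=SYMBOLS):
    """Build the string z from a patient data matrix.

    patient_matrix maps each disease to a list of booleans, one per criterion.
    """
    z = ""
    for disease, flags in patient_matrix.items():  # loop over diseases p
        satisfied_sym, unsatisfied_sym = symbols[disease]
        z_p = ""
        for satisfied in flags:  # loop over criteria q
            z_p += satisfied_sym if satisfied else unsatisfied_sym
        z += z_p  # concatenate the per-disease strings
    return z

def compress(z):
    """Write runs with a repeat count, e.g. 'jjjh' becomes 'j3 h'."""
    parts = []
    for symbol, run in groupby(z):
        count = len(list(run))
        parts.append(symbol if count == 1 else f"{symbol}{count}")
    return " ".join(parts)

# Made-up patient data matrix for three diseases.
matrix = {
    "D1": [False, False],
    "D2": [True, False],
    "D3": [False, False, False, True, False],
}
z = generate_string(matrix)
```

The resulting string has one symbol per diagnostic criterion, so its length equals the total number of criteria across the considered diseases.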
The production rules are written in the metalanguage BNF (Backus normal form). Many variants of BNF are in use; WSN (Wirth syntax notation) is used here for convenience [32]. It is important to know whether the string belongs to L(G_Diagnosis) or not.
The proposed method uses the Cocke-Younger-Kasami (CYK) bottom-up parsing algorithm, which produces a structured table. The tabular form is well suited for physicians to get a quick view of the patient's condition. The patient data matrix is validated against the diagnostic criteria and a string of primitives is formed. Suppose the input string is x = m^4 k c e^3 I e j^2 n^2 a, where "m" denotes a single occurrence, "m^2" denotes two consecutive occurrences of the primitive m, and so on. Table 4 shows the parsing table. It indicates the disease or normal as per the convention used in a conventional parsing table. The occurrence of diseases is investigated by inspecting the first column and one particular location of the parsing table, namely the third row at position (total number of diagnostic criteria present in the N diseases - the number of criteria present in LAHI) plus one. 'NORMAL' at that location indicates that left atrial hypertrophy is not present in the disease pattern. A disease may appear more than once in the first column, and the final diagnostic report is made by merging those occurrences into one. In this example the result is declared "NORMAL", since the top row of the first column of the parsing table contains none of the considered diseases.
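The table-filling idea behind CYK parsing can be sketched as follows for any grammar in Chomsky normal form. The toy grammar used here (generating strings of the form a^n b^n) and all names are hypothetical and serve only to show the mechanics; it is not the diagnosis grammar of this chapter, whose production rules are not reproduced in this sketch.

```python
def cyk_table(word, unary, binary):
    """Fill the CYK table for a grammar in Chomsky normal form.

    unary:  maps a terminal to the set of non-terminals A with a rule A -> terminal
    binary: maps a pair (B, C) to the set of non-terminals A with a rule A -> B C
    table[length - 1][start] holds the non-terminals that derive
    word[start:start + length].
    """
    n = len(word)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[0][i] = set(unary.get(symbol, ()))
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            for split in range(1, length):
                left = table[split - 1][start]
                right = table[length - split - 1][start + split]
                for b in left:
                    for c in right:
                        table[length - 1][start] |= binary.get((b, c), set())
    return table

# Toy grammar (illustration only): S -> A B | A T, T -> S B, A -> 'a', B -> 'b'
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

accepted = cyk_table("aabb", UNARY, BINARY)
rejected = cyk_table("abab", UNARY, BINARY)

# The cell spanning the whole string plays the role of the top row,
# first column of the parsing table.
top_accepted = accepted[len("aabb") - 1][0]
top_rejected = rejected[len("abab") - 1][0]
```

In the diagnostic setting, the non-terminals found in the cell that spans the whole input string would be the disease names or NORMAL, which is why that single cell summarises the result of the parse.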

Proposed Method 2: Deep Learning based Method
Deep Learning (DL) is a specialized area of Machine Learning (ML) that learns abstract information automatically from large databases without manual feature engineering. Deep neural networks are capable of computing complex functions by extracting features from the input data. These computations depend on the number of hidden layers and other parameters. Activation functions support these complex computations by mapping the input signal to an output signal within a certain range [26][33][34][35][36][37][38].
A long short-term memory (LSTM) neural network is a type of RNN that performs context-based prediction, which a traditional RNN does not handle well. LSTM is efficient in regulating gradient flow and better preserves long-range dependencies. Every LSTM cell comprises an input gate, a forget gate, and an output gate. The input gate determines when an input value is admitted into memory, the forget gate determines when the stored value is remembered or forgotten, and the output gate determines when the unit should output the value in its memory [10]. A similar concept is employed by the Gated Recurrent Unit (GRU), but compared to LSTM, a GRU has fewer parameters.
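The gate mechanism described above is commonly written as the following set of update equations. This is the standard textbook formulation, given here as a sketch; the notation is an assumption and is not taken from the chapter: x_t is the input at time step t, h_t the hidden state, c_t the cell state, W, U and b the weights and biases of each gate, \sigma the logistic sigmoid, and \odot the element-wise product.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

An LSTM cell therefore carries four weight blocks (three gates plus the candidate memory), whereas a GRU merges these into three, which is the source of its smaller parameter count.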
Over-fitting is a serious problem faced by most neural network based models. It occurs when a model learns the noise present in the training data, which in turn degrades the performance of the model on unseen data. The problem can be reduced by incorporating dropout layers. During each training iteration, a dropout layer randomly deactivates a fraction of the units or connections in a network [27]. Once the neural model is configured, it undergoes a training process. One full pass of the training process over the dataset is known as an epoch. Within an epoch, the dataset is partitioned into smaller sections: the batch size determines the subsections of the training dataset that are processed iteratively until the epoch is complete [28]. The training criterion is the binary cross entropy function, since a binary classification problem is addressed in this study. Binary cross entropy measures the difference between the true value (either 0 or 1) and the prediction for each class, and the class errors are then averaged to give the final loss [29].
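As a small worked example of the loss described above, binary cross entropy can be computed directly from true labels and predicted probabilities. The function name and the clipping constant `eps` are choices made for this sketch, and the sample values are made up.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident and correct predictions give a small loss ...
good_loss = binary_cross_entropy([1, 0], [0.9, 0.1])
# ... while confident but wrong predictions are penalised heavily.
bad_loss = binary_cross_entropy([1, 0], [0.1, 0.9])
```

The asymmetry between the two cases is what drives the classifier towards well-calibrated probabilities during training.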
Any machine learning model is assessed with predefined metrics such as accuracy, precision, recall, F1-score, MSE and the Cohen-Kappa statistic. These metrics help in identifying the best problem-solving approach. Accuracy [30] is the percentage of correct predictions over the total number of instances considered. However, accuracy alone may not be enough, since it does not show how the wrong predictions are distributed. Two more metrics, recall and precision, address this problem. Precision [30] is the fraction of correct positive results over the number of positive results predicted by the classifier. Recall [30] is the number of correct positive results divided by the number of all relevant samples. F1-Score or F-measure [30] is the harmonic mean of precision and recall. Mean Squared Error (MSE) [30] measures the difference between the predicted and the actual observations of the test samples. A model with higher accuracy and F1-Score and lower MSE indicates a better problem-solving technique. The Cohen-Kappa score [31] is a statistical measure of inter-rater agreement for qualitative items in classification.
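The metrics described above can all be derived from the four counts of a binary confusion matrix. The following sketch assumes hard 0/1 predictions; the function name and the sample labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, MSE and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(pairs)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    mse = sum((t - p) ** 2 for t, p in pairs) / n

    # Agreement expected by chance, from the marginal label frequencies.
    expected = (((tp + fp) / n) * ((tp + fn) / n)
                + ((tn + fn) / n) * ((tn + fp) / n))
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "mse": mse, "kappa": kappa}

# Made-up labels: 4 true positives, 4 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For hard 0/1 predictions the MSE equals one minus the accuracy, which appears to be in line with the accuracy of 93.22% and the MSE of 0.07 reported in this chapter.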
The objective of any classifier model is to map input variables to target variables on the basis of the training dataset. The proposed classifier employs deep learning techniques to recognize whether a patient has heart disease or not. The proposed method uses an LSTM-BRNN model for this prediction. A stacked LSTM-BRNN model is implemented as the second approach; it stacks four Bidirectional LSTM layers and four dense layers. The Bidirectional LSTM layers use 256, 128, 64 and 16 nodes respectively. To avoid over-fitting, each of these layers incorporates 20% dropout regularization. Next, four fully connected layers are stacked with 8, 4, 2 and 1 nodes respectively. The four LSTM layers and the final dense layers are activated using the sigmoid activation function. Finally, the model is assembled using the "adam" optimizer and the binary cross entropy loss function, and it is trained for 100 epochs with a batch size of 32. These hyperparameters were selected after trying a series of possible values. This fine-tuning supports attaining the best problem-solving approach. Once the model is constructed, it is fitted to the training data.
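The size of the architecture described above can be checked with a short parameter count. The sketch uses the standard formulas for LSTM, GRU and dense layers and assumes that the forward and backward outputs of each bidirectional layer are concatenated. The input dimension of 1 per time step is an inferred assumption: with it, the LSTM stack yields 1,367,993 parameters, which matches the trainable-parameter total reported for this model in the chapter. The GRU figure uses the classic three-block formulation and is not a number reported in the chapter; some library implementations add a second bias vector per GRU layer, which would raise it slightly.

```python
def lstm_params(units, input_dim):
    """One LSTM layer: four blocks of input weights, recurrent weights and biases."""
    return 4 * (units * (input_dim + units) + units)

def gru_params(units, input_dim):
    """One classic GRU layer: three blocks instead of four."""
    return 3 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """One fully connected layer: weights plus biases."""
    return units * input_dim + units

def stacked_total(cell_params, input_dim=1,
                  recurrent_units=(256, 128, 64, 16),
                  dense_units=(8, 4, 2, 1)):
    """Total parameters of the stacked bidirectional model (dropout adds none)."""
    total = 0
    for units in recurrent_units:
        total += 2 * cell_params(units, input_dim)  # forward and backward cells
        input_dim = 2 * units  # assumed: both directions are concatenated
    for units in dense_units:
        total += dense_params(units, input_dim)
        input_dim = units
    return total

lstm_total = stacked_total(lstm_params)  # 1367993
gru_total = stacked_total(gru_params)    # 1026073
```

The comparison illustrates the earlier remark that a GRU has fewer parameters than an LSTM of the same width: the recurrent part shrinks by one quarter while the dense layers stay the same.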
During the training phase, the presented neural network model has a total of 1,367,993 trainable parameters. An in-depth description of the layers, the layer types, the activation functions used, the output shape produced by each layer, and the number of parameters of each layer is summarised in Table 5. The proposed model consists of 12 layers in total, of which 4 are LSTM layers. The same configuration is used by the stacked bidirectional GRU model; Table 6 describes the detailed construction of this second model. The hyper-parameters of the Stacked Bidirectional LSTM model and of the Stacked Bidirectional GRU model are summarized in Table 5 and Table 6 respectively.
During the training of the Stacked Bidirectional LSTM model, accuracy and loss are calculated for each epoch, as depicted in Figure 4. As the number of epochs grows, the accuracy increases gradually and reaches a value of around 0.95, while the loss gradually decreases to its lowest value of around 0.12. Once training is done, i.e. after 100 epochs, accuracy, F1-score, Cohen-Kappa score and MSE are calculated for the unlabelled dataset. Table 7 provides the prediction efficiency of the presented model. Note that the proposed stacked bidirectional LSTM model has 4 LSTM layers. Table 8 shows how the model efficiency increases over 1, 2 and 3 LSTM layers. Going beyond 4 LSTM layers does not improve the efficiency substantially, so the model is restricted to 4 LSTM layers.
As shown in Figure 5, the Stacked Bidirectional GRU model is trained for 50 epochs. Increasing the number of epochs beyond 50 does not improve the efficiency of the model, so training is restricted to 50 epochs. The training loss declines rapidly within the first 10 epochs and then decreases gradually as the number of epochs increases. After the 50th epoch it approaches a loss of 0.394.
During training, the GRU model starts from a lower accuracy, which increases to 0.8542 after a number of epochs. Table 9 provides the prediction performance of the proposed GRU based model. A comparative study among 1, 2, 3 and 4 GRU layers is described in Table 10. Table 9 and Table 10 make it clear that the stacked bidirectional GRU model is not as efficient as the stacked bidirectional LSTM model. Hence, the stacked bidirectional LSTM model can be regarded as the better one for the CVD classification problem. Early prediction of heart disease, which may arise from anxiety and numerous other causes, may increase the life span of the heart patient. Considering the past health record of a patient, the proposed Stacked Bidirectional LSTM model can predict cardiac disease probabilities efficiently. This assists medical care units and doctors, so that countermeasures such as surgeries and medicines can be suggested. The proposed method reaches a promising result for heart disease prediction: the experiments show a prediction accuracy of 93.22%, an F1-score of 0.93, a kappa score of 0.87 and an MSE of 0.07.

It is important to note that the following types of noise are considered in the ECG wave, as shown in Figure 6. Table 11 lists types of cardiovascular diseases with their symptoms, causes and prevention methods [39].

Conclusions
Healthcare plays a key role in understanding the health-related aspects of people around the globe. This chapter focuses on identifying CVDs from two perspectives: syntactic pattern discovery from ECG reports, and construction of a predictive model using deep learning techniques. The pattern discovery approach is an interesting domain, yet it is challenging because of its dependency on formal language generation. The predictive modeling is based on deep neural networks, and multiple neural networks are utilized in this second approach. Neural networks merit this attention because they simulate tasks in a human-brain-like manner. Construction of an intelligent computerized tool is favoured in this study as it facilitates the CVD classification task. Separating out the CVD patients may help the medical care unit to devote more attention to their treatment. This will benefit clinicians by assisting them in taking informed decisions.