Cardiac Diagnostic Feature and Demographic Identification Models: A Futuristic Approach for Smart Healthcare using Machine Learning

Around the world, about 17 million people die each year from CardioVascular Diseases (CVD). As per clinical records, sufferers primarily exhibit myocardial infarctions and Heart Failures (HF). Creatinine is a musculoskeletal waste product; in a healthy body, the kidneys filter creatinine from the blood and excrete it through the urine. High creatinine levels can suggest renal problems, and elevated Serum Creatinine (SC) is well established in HF. Patients' electronic medical records can be used to quantify symptoms and related clinical laboratory test values, which can then direct biostatistical exploration to uncover patterns and associations that doctors would otherwise miss. The latest American Heart Association guideline of 1500 mg/d sodium tends to be sufficiently relevant for patients with stage A and B HF. In this article, we used a dataset, from the year 2015, of the records of 299 heart patients. The present paper used data-analytic and statistical tools to verify the significant differences between alive and dead patients' SC and Serum Sodium (SS). It also demonstrates the impact of significant features of abnormal SC and SS on the Survival-Status levels. The Age-Group feature (derived from the age attribute), Ejection Fraction (EF), anemia, platelets, Creatinine Phosphokinase (CPK), Blood Pressure (BP), gender, diabetes, and smoking status were used with a Cox regression model to determine the potential contributing features to mortality. The Kaplan-Meier plot was used to investigate the overall pattern of survival with respect to age-group. During pre-processing of the dataset, Age and SS were removed as multicollinear features before performing the machine learning experiments. This paper also predicted patients' survival, age-group, and gender using supervised machine learning classifiers.
Detection of significant features would help in making informed decisions to balance the lifestyle of heart patients. The study revealed that the patient's follow-up months, as well as SC, EF, CPK, and platelets, are significant factors in predicting the patient's age-group. Smoking habits, CPK, platelets, follow-up month, and SC of each patient were found to be significant predictors of patient gender. The hypothesis tests showed that SC and SS make substantial differences in the survival of patients (p < 0.05), and failed to reject that anemia, diabetes, and BP make no significant impact on the creatinine and sodium of each patient (p > 0.05). With χ2(1) = 8.565, the Kaplan-Meier plot revealed that mortality was high in the extremely elder age-group. The findings have possible effects on clinical practice and could become a new medical support system for predicting whether a patient can survive a heart attack or not. The doctor should primarily concentrate on follow-up month, SC, EF, CPK, and platelet count when the aim is to understand whether a patient survives after HF. This paper also found significant differences in SC between the Alive and Dead levels, with distinct mean ranks (µrank: 128.10 and 196.31); the significant p-value of SC suggests considering the differences significant (U = 5298, Z = -6.398, p < 0.05). In the case of SS, significant differences could be observed in the mean ranks (µrank: 162.40 and 123.78); SS also differed significantly across the Survival-Status levels (U = 7226.50, Z = -3.622, p < 0.05).


Introduction
HF is a condition in which the muscles in the heart wall deteriorate and swell, limiting the heart's ability to pump blood. The heart's ventricles can become rigid and stop filling correctly between beats. With time, the heart fails to meet the body's demand for blood, and the individual begins to have trouble breathing. Coronary heart disease, diabetes, high BP, and other illnesses such as HIV, alcoholism or drug abuse, thyroid disease, an excess of vitamin E in the body, and radiation or chemotherapy are the most common causes of HF [1,2]. Occasionally, the symptoms of CVD differ by gender: a male patient is more likely to experience chest pain, whereas a female patient is more likely to experience other symptoms alongside chest pain, such as nausea, excessive exhaustion, and shortness of breath [3]. Researchers have been experimenting with a range of strategies to predict Heart Disease (HD) and the age-groups and gender of HD patients. Still, prediction is challenging at any stage due to a variety of variables, including but not limited to difficulty, execution time, and approach precision [4]. Early detection can also help avoid HD, which can lead to death. Angiography is the most precise and effective tool for predicting cardiac artery disease [5], but it is costly, putting it out of reach for low-income families. Data mining plays a critical role in extracting valuable information from large amounts of data. It is used in nearly every field of life, including medicine, engineering, industry, and education, as a technique for examining data to retrieve important decision-making information hidden in past repositories. Several machine learning algorithms have been used to understand the complexity and non-linear interactions between variables and to improve the prediction of natural outcomes.
Due to the ever-increasing amount of medical data, we need machine learning algorithms to assist medical healthcare practitioners in analyzing data and making accurate and precise diagnostic decisions. Different classification algorithms are used in medical data mining to estimate CVD in patients and to predict deaths due to heart attacks [6]. In older people, age plays a vital role in the degradation of cardiovascular function, which leads to an increased risk of CVD [7,8]. The American Heart Association reported CVD in men and women in the United States aged 40 to 59 years, and found rates of 75% and 86% in men aged 60 to 79 and over 80 years, respectively [9].
In this study, we examine a dataset of medical records of patients with HF, published in July 2017 and 2020 by Ahmad T. et al. [14] and Chicco D. [31]. From the medical records of 299 Pakistani patients with HF, Ahmad T. used conventional biostatistical time-dependent models (Cox regression) to predict mortality and identify key features. Following that, Chicco D. used the same dataset to predict Survival-Status with various state-of-the-art machine learning algorithms. We hope to close this gap by employing a variety of data mining approaches to estimate each patient's gender, age-group, and Survival-Status, and by applying statistical methods to determine the impact of SC and SS on various health-related problems such as diabetes, anemia, and high BP, and to confirm their effect on the Survival-Status levels. Eight machine learning models (Table 5) were used: Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Gaussian Naïve Bayes (GNB), Gradient Boosting Machines (GBM), Support Vector Machine (SVM), k-Nearest Neighbour (k-NN), and Extreme Gradient Boosting (XGB). The HD literature motivates us to study the following points on the target dataset:
• Develop a decision-making method that accurately predicts the age-group (Adult / Very Old), gender (Woman / Man), and Survival-Status of cardiac patients.
• Use a 10-fold cross-validation technique to extract the best-performing predictors.
• Identify significant features from the dataset that influence the machine learning algorithms' output, in order to examine the critical risk factors.

Effect of Abnormal SC and SS on HF Patient
Healthcare professionals can assess the quantity of sodium in our blood with a sodium blood test (also known as a serum sodium test). SS is routinely measured in order to examine electrolyte, acid-base, and water balance, as well as renal function. If the patient is not in renal failure and does not have severe hyperglycemia, sodium accounts for approximately 95% of the osmotically active chemicals in the extracellular compartment. The ideal SS reference range is 135-147 mmol/L.
Given the widely accepted idea that increased sodium consumption contributes to increased fluid retention during cardiac failure, a low-sodium diet as proposed for the general public is expected to improve the outcomes of HF sufferers. More evidence links sodium intake with BP [10], the occurrence of hypertension [11], CVD [12], and other HF risk factors; even the latest American Heart Association guideline of 1500 mg/d sodium tends to be sufficiently relevant for patients with stage A and B HF. Yash Patel et al. [13] found in their review study that, due to reduced renal perfusion and sodium and water reabsorption from the renal tubules, HF is characterized by sympathetic system activation and renin-angiotensin-aldosterone system (RAAS) activation. The activation of the antidiuretic and anti-natriuretic systems has been linked to a sodium-restricted diet in HF patients. The latest Cochrane reviews of 185 clinical trials of low- vs. high-sodium diets found a statistically significant rise, relative to the high-intake groups, in renin, aldosterone, noradrenaline, adrenaline, cholesterol, and triglycerides. Tanvir Ahmad et al. [14] studied heart failure patients' survival using a Cox regression model. They found a 32% mortality rate due to CVD, with EF, age, creatinine (creatinine > 1.5 indicating renal dysfunction), sodium, anemia, and BP as significant predictors of mortality; smoking and diabetes were not found to be significant contributors to HF death. Sagar B. Dugani et al. [15] examined a coronary HD dataset with respect to gender to see whether risk profiles differed by age at onset; of more than 50 clinical and biomarker risk factors, diabetes and lipoprotein insulin resistance had the highest relative risk, and in women, diabetes had the highest aHR of any clinical factor. Mohammed W. Akhtar et al. [16] investigated renal insufficiency (abnormal SC) in HF patients and found that such patients had a five-fold increased risk of death after being discharged from the hospital. Several studies have identified and supported the close connection between hypertension and dietary sodium intake: reducing dietary sodium not only reduces BP and hypertension but also decreases cardiovascular risk. Regardless of sex or ethnic group, both hypertensive and normotensive people experience a significant drop in BP after a sustained slight reduction in salt intake, with larger decreases in systolic BP for larger reductions in dietary salt. Water accumulation, increased systemic peripheral resistance, changes in endothelial function, structural and functional changes of the large elastic arteries, changes in sympathetic activity, and autonomic neural regulation of the cardiovascular system are all linked to a high sodium intake and an increase in BP [17]. Also relevant, in its pathophysiological context and clinical implications, is the subject of salt sensitivity, which refers to individual susceptibility to BP variations following changes in dietary salt intake [18]. Blood sugar levels in people with diabetes can rise to dangerously high levels, causing health problems such as kidney disease [19].
The study by Abede Tamrat et al. [20] included 370 patients. Anemia was found to be prevalent in 41.90% of the study cohort, with most of the anemic participants being female (64.59%). Between anemic and non-anemic patients, there was a substantial difference in hemoglobin, creatinine, and sodium levels. Angiotensin-converting enzyme inhibitors were used less commonly by anemic patients with HF.
• Objective 1: To explore the impact of SC and SS on the Survival-Status level of the patient.
1. H01: There is no significant difference between Alive and Dead patients with respect to SC and SS.
• Objective 2: To explore the impact of SC and SS on the anemia, diabetes, and high BP levels of the patients.
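The comparison behind hypothesis H01 can be carried out with a Mann-Whitney U test, the non-parametric test this paper applies later. A minimal sketch with SciPy, assuming synthetic SC samples in place of the real patient records (the group sizes roughly mirror the dataset's alive/dead split, but the values are simulated):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
# Hypothetical serum-creatinine samples for the two Survival-Status levels;
# the distribution parameters are invented, not taken from the dataset.
sc_alive = rng.normal(1.2, 0.4, 203)  # surviving patients
sc_dead = rng.normal(1.8, 0.9, 96)    # deceased patients

u_stat, p_value = mannwhitneyu(sc_alive, sc_dead, alternative="two-sided")
# Reject H01 at the 5% level when p < 0.05
reject_h01 = p_value < 0.05
```

The same call, applied per group of a real SC column, reproduces the U and p values reported in the abstract.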

Smoking Habits among Gender-Specific HF Patients
Ambuj Roy et al. [21] researched atherosclerotic CVD and found it to be caused by frequent smoking habits. They found that the relation between various types of tobacco use and cardiovascular manifestations was strong and of significant magnitude. The risk seems to be higher in younger people who smoke more cigarettes a day, between women and men, and in some ethnic groups, such as South Asians. Huxley RR et al. [22] examined the sex disparity among smokers in the risk of coronary HD and underlined the uncertainty in gender-specific smoking habits towards HD.
• Objective 3: To explore the association between the gender and smoking habit of the patient.
1. H03: There is no significant association between gender and smoking habits.
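Hypothesis H03 concerns an association between two categorical variables, which is exactly what a χ2 test of independence checks. A sketch with scipy.stats on an illustrative 2x2 gender-by-smoking cross-tabulation (the counts here are made up, not the paper's):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = gender (woman, man),
# columns = smoking habit (non-smoker, smoker).
table = np.array([[101, 4],
                  [102, 92]])

chi2, p, dof, expected = chi2_contingency(table)
reject_h03 = p < 0.05  # H03: no association between gender and smoking
```

For a 2x2 table the test has one degree of freedom, matching the χ2(1) statistic style reported in this paper.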

HF Patients among Particular Age-Group
Majidur Rahman et al. [23] focused mainly on detecting a specific age-group based on cancer diagnosis and other related factors. They found the highest accuracy (59.09%) with an Artificial Neural Network (ANN) among the proposed classifiers (LR, SVM, ANN). The authors of [25] researched seven different racial/ethnic groups by testing their blood, saliva, and brain samples. The researchers examined the blood's intrinsic epigenetic aging rate (regardless of the number of blood cells) and its extrinsic epigenetic aging rate with respect to blood cell counts, tracking the age of the immune system. Their findings were that sex, race/ethnicity, and Cardio HD (CHD) risk factors were associated, to a lesser degree, with epigenetic aging rates, but not with incident CHD outcomes; Hispanics, older African-Americans, and women have lower mortality rates than predicted. Jennifer L. Rodgers et al. [26] reported findings on gender- and aging-related HF risks. They found that CVD was more common in the elderly and in those above the age of 65. In adults, age was an independent risk factor for CVD, but other factors such as frailty, obesity, and diabetes compound these risks. They also found that older females had a higher risk of CVD than age-matched men. The chances of CVD rise with age in both men and women, corresponding to a general decrease in sex hormones, especially estrogen and testosterone. Benjamin et al. [27] found that several variations in CVD risk factors were associated with sex in aged adults: although age was a risk factor for CVD in both men and women, older women were clearly more susceptible to some HD-related complications. Villa et al. [28] examined HD patients and found that women are generally protected from CVD before menopause, but their risk increases dramatically after menopause; in both men and women, the reduction in sex hormones has been shown to play an essential role in developing CVD with the onset of advanced age. Edward Korot et al. [29] used an AutoML model with retinal fundus images and predicted gender from the UK Biobank dataset with 88.8% accuracy.
• Objective 6: To predict the gender of HF patients based on significant features.

HF Patients' Survival-Status Identification
Jian Ping Li et al. [30] designed an HD identification method using supervised ML classification algorithms (Table 5).

Problem Statement
For a better understanding, the limitations and significance of the proposed HD diagnosis approaches have been summarised in Table 1. Many of these existing approaches used various techniques to detect HD and other related demographics early. However, all of these methods have low prediction accuracy, precision, and recall. According to Table 1, the prediction performance metrics of the HD and other mentioned identification methods need to be improved for more efficient and reliable early detection, leading to better care and recovery. The study also found that women are more vulnerable to CVD and that SC was the most significant feature in every mentioned classification. Therefore, it is imperative to study SC, SS, and gender-related features and to establish the significance of the impact of smoking, BP, and diabetes. The Cox regression method for survival analysis may be employed to examine the effect of various existing demographic dataset features on time-specific occurrences. New methods for accurate identification of CVD and other features from the given dataset are needed to address these issues. Prediction accuracy without using SMOTE, and further related performance enhancement, is a significant challenge and gap in the research.

Research Organization
The remainder of the paper is structured as follows: Section 5 elaborates the state-of-the-art research design and methodology. Section 6 explains the machine learning models with hyperparameter tuning. Section 7 focuses on the basics of the applied machine learning algorithms. Section 8 discusses the feature ranking and selection mechanisms of the machine learning algorithms. Section 9 covers the various performance evaluation metrics. Section 10 presents the results of seven experiments. Section 11 discusses the findings exhibited in Section 10 in light of the existing literature. Section 12 concludes with the real crux of the extant research and future directions.

Design and Contribution
A hybrid Python-based application has been designed and created to automate the inferential, differential, and predictive analysis of the patterns in heart disease data. The present model, called Cardiac Diagnostic Feature and Demographic Identification (CDF-DI), can be used as a cardiac diagnostic aid; Figure 1 depicts a pictorial view of CDF-DI. The predictive algorithms are implemented, and their results are compared in terms of several performance metrics. The model can be used to confirm the impact of survival status, anemia, diabetes, and BP levels on the SC and SS of the patient. The proposed CDF-DI can assist doctors in diagnosing patients efficiently, consequently enhancing the clinical decision-making process for heart disease. Early treatment can then reduce deaths resulting from late detection of cardiac disease. Our research contributions can be summarised as follows.
• Enhancing the diagnostic accuracy of the Survival-Status, age-group, and gender of heart disease patients. The present paper proposes a model, CDF-DI, which integrates the VIF technique for eliminating multicollinearity among attributes. Further, it adopts a 10-fold CV method to find a robust predictive model among eight machine learning classifiers based on MCC, F1-Score, and accuracy ranking performance metrics.
• Analysing the performance of the machine learning models and comparing them with state-of-the-art models. The presented CDF-DI is compared to the findings of previous studies and evaluated against other classification models. The present study also includes a statistical analysis to confirm the proposed model's significance compared to other models.
• Verifying the impact of patients' complication levels on SC and SS. To identify the effects and associations, the present article used the patients' Survival-Status levels, BP levels, anemic level, and diabetic level towards SC and SS with the Mann-Whitney U test (non-parametric). The association between gender and smoking level was also verified with a χ2 test (non-parametric). Survival analysis was also performed to ascertain the impact of age-group levels on Survival-Status levels using patients' follow-up months.
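The Kaplan-Meier survival analysis used for the age-group comparison can be sketched from first principles: the estimator is S(t) = prod over event times t_i <= t of (1 - d_i/n_i), where d_i is the number of deaths at t_i and n_i the number still at risk. A dependency-free sketch, using toy follow-up data rather than the actual patient records:

```python
import numpy as np

def kaplan_meier(durations, events):
    """Minimal Kaplan-Meier survival estimate.

    durations: follow-up time for each patient (e.g. days)
    events:    1 if death was observed, 0 if the record is censored
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(durations)
    durations, events = durations[order], events[order]

    times, surv = [], []
    s, n = 1.0, len(durations)  # running survival probability, risk-set size
    i = 0
    while i < len(durations):
        t = durations[i]
        d = 0        # deaths at time t
        removed = 0  # deaths + censored records leaving the risk set at t
        while i < len(durations) and durations[i] == t:
            d += events[i]
            removed += 1
            i += 1
        if d > 0:
            s *= 1.0 - d / n
            times.append(t)
            surv.append(s)
        n -= removed
    return np.array(times), np.array(surv)

# Toy data: (follow-up days, event flag); a real curve would use the 299 records,
# computed separately per age-group level and plotted as step functions.
t, s = kaplan_meier([5, 8, 12, 12, 20, 30], [1, 0, 1, 1, 0, 1])
```

Dedicated libraries (e.g. lifelines) provide the same estimate plus confidence bands and log-rank tests; the sketch above only shows the core computation.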

Dataset Description
The HF clinical records dataset used in this study was obtained from the UCI Machine Learning Repository [45,46]. Every patient profile has 13 clinical characteristics, and the dataset includes the medical reports of 299 patients with heart problems, collected during the follow-up months. There are 194 men and 105 women among the 299 records. The average follow-up period was 130 days, ranging from 4 to 285 days. A cardiac echo study or specialist notes were used to diagnose the disease. Renal dysfunction is indicated by an SC level higher than the normal level (1.5). There are no missing values in the dataset. Table 2 shows a high-level description of the dataset, including narrative, ranges, and units of measurement.
All six categorical variables are binary with values 0 and 1. For gender, 0 represents a woman and 1 a man. For smoking level, 0 represents a patient with no smoking habit and 1 an addiction to smoking. For diabetes level, anemic level, and high BP level, 0 represents patients with no complications of diabetes, anemia, or high BP, and 1 represents suffering from diabetes, anemia, or high BP. For Survival-Status, 0 represents that the patient is alive and 1 that the patient has died from cardiac arrest. In the dataset, patients' ages lie between 40 and 95, with a mean value of 60 (Figure 2(a)). Figure 2(b) displays the SC level with respect to anemic levels: the 170 non-anemic patients had a mean SC of 1.35±0.06, and the 129 anemic patients a mean SC of 1.46±0.11. SC ranged from 0.50 to 6.80 in non-anemic patients and from 0.60 to 9.40 in anemic patients. Figure 2(c) shows the SC level with respect to diabetic level: non-diabetic patients had a mean SC of 1.43±0.09 and diabetic patients a mean SC of 1.34±0.07. Non-diabetic patients' SC ranged from 0.50 to 9.40, while diabetic patients' SC ranged up to 6.20. Also, the 194 non-high-BP patients' mean SC level was 1.40±0.06, and the 105 high-BP patients' mean SC was 1.39±0.13. The non-high-BP patients' SC ranged from 0.60 to 6.80, and the high-BP patients' SC from 0.50 to 9.40, as depicted in Figure 2(d). In Figure 2(e), SS is displayed with respect to Survival-Status: alive patients had a mean SS of 137.22±0.28, and dead patients 135.38±0.51. Alive patients' SS ranged from 113 to 148, and dead patients' SS from 116 to 146. As depicted in Figure 2(f), the 170 non-anemic patients' mean SS was 136.46±0.34, and the 129 anemic patients' mean SS was 136.84±0.39. SS ranged from a minimum of 113 to a maximum of 148 in non-anemic patients.
For patients suffering from anemia, SS ranged from 116 to 145.
The whisker plot of SS with respect to diabetic level in Figure 2(g) shows the overall summary for non-diabetic and diabetic patients: non-diabetic patients had a mean SS of 139.96±0.29 and diabetic patients of 136.16±0.46. High-BP patients' mean SS was 136.85±0.40 and non-high-BP patients' 136.51±0.33, a very minute difference between them, as depicted in Figure 2(h). High-BP patients' SS ranged from 124 to 148, and non-high-BP patients' SS from 113 to 146. Figure 3 displays the distribution plots of the metric variables available in the dataset. Figure 3(a) shows that EF had a non-normal distribution (p < 0.05), with a mean of 38.08 and an SD of 11.83, and a 95% confidence interval from 36.74 (lower bound) to 39.43 (upper bound). A non-normal distribution was also found for platelets, with a mean of 263358.03, kurtosis of 6.21, skewness of 1.46, and SD of 97804.24 (Figure 3). The patients' follow-up months distribution plot is displayed in Figure 3(f): from day 4 to day 285, the mean follow-up was 130.26 days, again without normality, with an SD of 77.61.
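Per-group means and standard errors like those quoted above (mean ± SE of SC and SS per Survival-Status level) are straightforward to reproduce with pandas. A sketch on a tiny illustrative frame whose column names follow the UCI schema but whose values are invented:

```python
import pandas as pd

# Illustrative frame; in practice this would be the full 299-row dataset.
df = pd.DataFrame({
    "serum_creatinine": [1.1, 2.3, 0.9, 1.5, 1.0, 3.2],
    "serum_sodium":     [137, 130, 140, 134, 138, 129],
    "DEATH_EVENT":      [0, 1, 0, 1, 0, 1],
})

# Mean and standard error of the mean (SE) of SC/SS per Survival-Status level.
summary = (df.groupby("DEATH_EVENT")[["serum_creatinine", "serum_sodium"]]
             .agg(["mean", "sem"]))
```

The resulting table has one row per Survival-Status level and a (feature, statistic) column MultiIndex, directly matching the "mean±SE" style used in this section.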

Features Correlation
The correlation between attributes can influence a machine learning model's performance. Data correlation can be used as a measurement tool to evaluate the relationships among features using Pearson's correlation. These values range from -1 to +1, indicating a negative or positive relationship between the attributes. In Figure 4, a value close to zero indicates a low correlation between features. The light-blue color implies that the correlation is close to zero, while the dark blue and dark orange colors mean that the correlation is close to +1 and -1, respectively. Diabetes and Sex are observed to have near-zero values, indicating very low or no correlation with the target attribute (Survival-Status). Likewise, the Survival-Status and SC attributes have close-to-zero relationships with the Gender (Sex) target class. Thus, we could remove these features to improve the performance of our proposed model.
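A correlation matrix like the one in Figure 4 can be computed in one call with pandas. A sketch on synthetic stand-in columns (the names follow the UCI schema; the values are random, so the resulting correlations are not the paper's):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 299
df = pd.DataFrame({
    "age":               rng.uniform(40, 95, n),
    "serum_creatinine":  rng.gamma(2.0, 0.7, n),
    "ejection_fraction": rng.normal(38, 12, n),
    "DEATH_EVENT":       rng.integers(0, 2, n),
})

corr = df.corr(method="pearson")  # symmetric matrix with entries in [-1, +1]

# Features whose |r| with the target is near zero are weak linear predictors
# and candidates for removal, as discussed above (the 0.1 cut-off is arbitrary).
weak = corr["DEATH_EVENT"].drop("DEATH_EVENT").abs() < 0.1
```

Passing `corr` to a heatmap routine (e.g. seaborn's) reproduces the color-coded view described for Figure 4.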

Preprocessing
The Variance Inflation Factor (VIF) in Equation (1) is a measure of how multicollinear the dataset's features are with respect to a given target feature. The VIF value of an attribute a ∈ A is determined from a dataset D = (A, X) using a standard linear regression with a as the prediction target. Then, given R²_a, the coefficient of determination of that linear regression, we have VIF_a = 1 / (1 − R²_a) (Equation (1)). A variance inflation factor greater than 10 implies high multicollinearity of the attribute with the other dataset attributes; this arbitrary cut-off of 10 is used as a norm in several publications [48]. Furthermore, when a feature is excluded from the dataset, the VIF of the multicollinear features decreases. By calculating the VIF of each feature (considered as a target) in the dataset and comparing it to a new VIF calculation with a feature removed, we can detect groups of collinear attributes automatically. The paper also used backward elimination for feature selection among the aforementioned dataset features; correlation and multicollinearity among the features were tested using the VIF.
In this article, a new derived attribute, "Age-group", was created from the age attribute of the target HD dataset; this derived feature is used as a target feature. More patients fall in the "Very Old" (170) level of the derived Age-group attribute than in the "Adult" (129) level.
The model shows significant multicollinearity based on VIF (>10) when gender and age-group are used as dependent variables (Table 3).
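The VIF computation described above, VIF_a = 1/(1 − R²_a), can be sketched directly with NumPy least squares, regressing each column on the others. The data here is synthetic, with one pair of nearly collinear columns built in to show the >10 cut-off firing:

```python
import numpy as np

def vif(X):
    """VIF per column: regress each column on the others (with intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for a in range(p):
        y = X[:, a]
        others = np.delete(X, a, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out.append(1.0 / (1.0 - r2) if r2 < 1.0 else np.inf)
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.01 * rng.normal(size=200)  # nearly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
# x1 and x3 exceed the usual cut-off of 10; the independent x2 does not.
```

statsmodels' `variance_inflation_factor` implements the same quantity; the sketch makes the Equation (1) connection explicit.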

Models Hyperparameter Tuning
An ML model's efficiency on a given dataset can be increased by tuning hyperparameters. The method of hyperparameter selection is one of the most critical characteristics of ML models, and more time is generally required to adjust the hyperparameters for an effective result. The optimization of hyperparameters may be described as in Equation (2): x* = argmin_{x ∈ X} f(x). Here f(x) denotes the objective score that minimizes the validation-set error, x* is the hyperparameter collection with the minimum score, and x may assume any value in the domain X. The present article maximizes the MCC score by minimizing the validation-set errors using grid search with 10-fold cross-validation. The following parameters were utilized for the ML models. The RF works by fitting various decision tree classifiers to different sub-samples of the dataset and uses averaging to increase predictive precision and control overfitting. The max_samples parameter applies if bootstrap=True (the default); otherwise the entire dataset is used to build each tree [51]. For all proposed classifiers (Age-Group, Gender, Survival prediction), Gini was used as the criterion to measure the quality of a split, max_features was set to 7 for the best split, and min_samples_leaf was set to 2 as the least number of samples required at a leaf node; a split point at any depth is only considered if at least min_samples_leaf samples remain in each of the left and right branches. min_samples_split was set to 2 to split an internal node, and the number of trees in the forest was set to 50 via n_estimators. In the Decision Tree (DT) classifier, the criterion was likewise set to Gini, with max_features set to log2; the max depth and minimum sample split were set to 50. The SVM library is popular in machine learning because it handles high-dimensional spaces and uses a subset of the training samples in the decision function; it offers a flexible approach to solving problems thanks to the kernels that exist [52].
The SVM classifier is used in this paper for classification purposes. The C parameter (10) is the regularization parameter, must always be greater than zero, and is used here with a radial basis function kernel; the kernel coefficient gamma is set to 0.001 (Table 13). The Gradient Boosting classifier (GBM) builds an additive model in a forward stage-wise fashion and allows arbitrary differentiable loss functions to be optimized. At each stage, n_classes regression trees are fitted to the negative gradient of the binomial or multinomial deviance function; binary classification is a special case in which only a single tree is induced [53]. The learning rate (0.001) shrinks each tree's contribution, and a trade-off exists between the learning rate and n_estimators. The number of nodes in the tree is limited by the maximum depth; this parameter should be tuned for optimal performance since the interaction of the input variables determines the best value, which is set to 3 here, with the number of estimators (n_estimators) set to 1000 (Table 13). A linear SVM classifier with stochastic gradient descent (SGD) learning implements regularised linear models: the gradient of the loss is estimated one sample at a time, and the model is updated along the way with a decreasing strength schedule (aka learning rate) [54]. The alpha value of the SGD classifier, used as the multiplier of the regularization term, is set to 0.1; the loss used during training, which governs the weight updates, is set to log; and the penalty (regularization term) is set to none (Table 4). Tree boosting is a common and successful machine learning technique; XGBoost (XGB) is a scalable end-to-end tree boosting method commonly used by data scientists to achieve state-of-the-art results on a variety of machine learning challenges [55]. To avoid overfitting, the step-size shrinkage used in the update (eta, aka learning rate) was set to 0.1 and gamma to 0.
If max_delta_step is zero there is no constraint, but here it is set to 2 to make updates more conservative. The maximum depth of a tree is set to 6 (increasing it makes the resulting model more complex), with a minimum child weight of 4 and 200 trees in the forest (n_estimators). k-NN works on the principle of finding the nearest neighbours among the training samples closest to the new point and uses them to predict the label; the number of neighbours, set by n_neighbors, is 3, with the Manhattan metric and uniform weights. Despite its name, logistic regression is not a regression but a classification algorithm. It is used to estimate discrete values (0 or 1, yes/no, true/false) based on a given set of independent variables; it is also known as logit or MaxEnt [56,57]. newton-cg is set as the solver to handle the multiclass problem; newton-cg handles only the l2 penalty, with the regularization parameter C set to 1.0.
The experiment trained and tested all classifiers with 10-fold cross-validation using the grid search CV method.
• DT: a classification algorithm that works well on categorical and numerical forms of data. It is generally used to build tree-like structures, and medical data can be analysed easily with good accuracy [33].
• RF: a tree-based ensemble learning model that produces exact predictions by combining several weak learners. This model uses the bagging technique to train a range of decision trees with different bootstrap samples [34].
• LR: typically predictive analysis based on the concept of probability. A binary categorical variable is predicted from one or more independent variables using the sigmoid function [35].
• GBM: many weak classifiers work together to build a powerful learning model. It is usually a time-consuming process due to the creation of many independent trees, and it has the ability to deal with missing values [36].
• GNB: a Naïve Bayes variant that works with Gaussian distributions and is used for continuous data. The prior and posterior likelihoods of the class in the data are combined with a function over continuous values; all of the features are often assumed to follow a Gaussian (normal) distribution [37].
• SVM: a mathematical-model-based supervised learning technique used to solve regression and classification problems. It classifies data by creating high-dimensional hyperplanes, also known as decision planes, which separate one form of data from another [38].
• XGB: a popular ensemble learning algorithm that uses DT models in the background for computation. It is a highly effective, scalable machine learning algorithm that combines multiple weak learners to build a strong classifier [43].
• k-NN: when compared to a collection of known data, the k-NN method identifies unknown data by calculating the distance or similarity of an unknown datum. It assigns a class to the datum based on the number of nearest neighbours with the same class; k controls the number of neighbours used in the decision.
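Benchmarking several of these classifiers under the same 10-fold protocol takes only a few lines with scikit-learn. A sketch over a subset of the eight models, again on synthetic data rather than the real records:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed patient feature matrix.
X, y = make_classification(n_samples=299, n_features=10, random_state=0)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "GNB": GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=3),
}
# Mean 10-fold accuracy per model; swap scoring="matthews_corrcoef"
# or "f1" for the other metrics used in this paper.
scores = {name: cross_val_score(m, X, y, cv=10, scoring="accuracy").mean()
          for name, m in models.items()}
```

The remaining models (RF, GBM, SVM, XGB) slot into the same dictionary once instantiated with the hyperparameters listed above.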

Feature Ranking and Selection
A subset of features reflects the characteristics of the original set of features and helps predict the target feature. Feature importance helps reduce the computational cost by removing irrelevant features [58]. The Fisher-score feature selection method is a filter-based approach that favours features with a large distance between data points of different classes and a small distance between data points of the same class [59]. Another popular feature ranking and selection method is the chi-square (χ 2 ) test, which computes χ 2 statistics with respect to the class labels; the higher the χ 2 value, the more relevant the feature [60]. The χ 2 statistic and its related p-value are computed by the cross-tabulation method [61]. RF offers two feature ranking techniques: mean accuracy reduction and Gini impurity reduction. As we know, RF generates various DTs during training, each working on subsets of the data and features; it observes all the DT outcomes and selects its prediction by majority vote. The mean accuracy reduction technique compares the prediction accuracy after dropping a particular feature with the accuracy obtained using the remaining features, and ranks the feature according to the observed difference. It works on the principle that the larger the accuracy decrease, the greater the feature importance [62]. The second ranking method works on the same principle but uses the Gini impurity as the metric instead of accuracy [63]. For machine learning feature ranking, we used the RF and XGB built-in feature ranking algorithms ("Feature Ranking Results" Section) to extract the most common top features. RF and XGB were used to rank the features because they proved to be the best classifiers, with the highest accuracy among all classifiers used in the present work.
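The built-in RF ranking described above (Gini impurity reduction, exposed by scikit-learn as `feature_importances_`) can be sketched as follows. The data and feature names are hypothetical placeholders, not the study's actual columns or scores.

```python
# Illustrative sketch of RF's Gini-impurity-based feature ranking;
# the dataset is synthetic and the feature names are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=299, n_features=5, n_informative=3,
                           random_state=0)
names = ["SC", "EF", "CPK", "platelets", "anemia"]  # hypothetical mapping

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranking = sorted(zip(names, rf.feature_importances_),
                 key=lambda t: t[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The importances sum to 1, so the printed scores can be read directly as relative contributions.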

Performance Evaluation Metrics
Scientific research uses a variety of performance metrics to assess prediction accuracy [64]. Still, despite the many strong machine learning methods available, no broad consensus has been reached on a single preferred measure. For binary classification problems in machine learning, accuracy and the F1-Score derived from the Confusion Matrix (CM) have been (and continue to be) among the most widely used metrics for performance evaluation. With an imbalanced dataset, however, these statistical measures can dangerously present over-inflated results.
Specificity (the true negative rate) is the proportion of actual negatives that are predicted as negatives; the remaining true negatives are predicted as positives and thus become false positives. It is described in Equation (3).
Recall is a metric for how well the model detects True Positives. It tells us how many patients we accurately identified as having HD out of all those who actually have it, as described in Equation (4).
Precision, in its most basic sense, is the ratio of True Positives to all predicted Positives, as described in Equation (5).
Accuracy is the ratio of the number of correct predictions to the total number of predictions, as described in Equation (6).
The F1-Score is the harmonic mean of Precision and Recall, as described in Equation (7). It is needed when we want to strike the right balance between precision and recall.
The popular Matthews Correlation Coefficient (MCC) in Equation (8) is a more robust statistical measure that yields a high score only if the prediction performs well in all four CM groups (true positives, false negatives, true negatives, and false positives), even on an imbalanced dataset [49]. For binary classification, the MCC produces a high score only if the predictor correctly classifies the majority of both the positive and the negative data instances [50]. Its worst value is -1 and its best value is +1.
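The metrics above reduce to simple arithmetic on the four confusion-matrix counts. As a worked sketch, the counts below correspond to the RF confusion matrix reported later for Survival-Status prediction (TP = 89, TN = 198, 7 dead patients misclassified as alive, 5 alive misclassified as dead):

```python
# Computing Equations (3)-(8) from the four confusion-matrix counts.
import math

tp, fp, tn, fn = 89, 5, 198, 7  # counts from the RF Survival-Status CM

specificity = tn / (tn + fp)                                  # Eq. (3)
recall      = tp / (tp + fn)                                  # Eq. (4), TPR
precision   = tp / (tp + fp)                                  # Eq. (5)
accuracy    = (tp + tn) / (tp + fp + tn + fn)                 # Eq. (6)
f1          = 2 * precision * recall / (precision + recall)   # Eq. (7)
mcc = (tp * tn - fp * fn) / math.sqrt(                        # Eq. (8)
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(round(recall, 2), round(specificity, 2),
      round(accuracy, 2), round(mcc, 2))
```

These counts reproduce the TPR ≈ 0.93, TNR ≈ 0.98, accuracy ≈ 0.96, and MCC ≈ 0.91 reported for RF.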

Experiment-1
This experiment was conducted to measure the differences in SC and SS levels across the Survival-Status levels (0: Alive, 1: Dead) due to HF in 299 patients (H 01 ). The test was performed with the non-parametric Mann-Whitney U-test. Table 6 displays the U-test statistics for the differences between Alive and Dead patients in SC and SS. Across both populations, for both SC and SS, the present paper found significant p-values between the two (Alive and Dead) levels. This paper also found significant differences with distinct mean ranks in SC (µ rank : 128.10 & 196.31). The significant p-value of SC indicates significant differences (U = 5298, Z = -6.398, p < 0.05). In the case of SS, significant differences could be observed in the mean rank (µ rank : 162.40
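The group comparison above can be sketched with `scipy.stats.mannwhitneyu`. The sample values below are synthetic illustrations, not the study's patient records.

```python
# Illustrative Mann-Whitney U-test comparing serum creatinine between two
# groups; the values are made up for demonstration purposes.
from scipy.stats import mannwhitneyu

sc_alive = [0.9, 1.0, 1.1, 1.2, 0.8, 1.0, 1.3, 0.9, 1.1, 1.0]
sc_dead  = [1.8, 2.1, 1.5, 2.4, 1.9, 1.7, 2.0, 1.6, 2.2, 1.8]

u_stat, p_value = mannwhitneyu(sc_alive, sc_dead, alternative="two-sided")
print(u_stat, p_value)
if p_value < 0.05:
    print("Reject H0: the two groups differ significantly")
```

The test is rank-based, which is why it remains valid here despite the absence of normality in the clinical data.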

Experiment-2
The experiment was conducted to test three null hypotheses (H 02a -H 02c ), verifying the differences in SC and SS with respect to anemic, diabetic, and high-BP levels. All hypotheses were tested with the Mann-Whitney U test. As Figure 6(a) displays, non-anemic patients had higher SC than anemic patients (µ = 151.22 > µ = 148.40), but in the case of SS, non-anemic patients had lower SS than anemic patients (µ = 145.4 < µ = 156.06). The findings suggest that these differences are not significant. From Figure 6(b), no remarkable difference in SC was found between non-diabetic and diabetic patients (mean rank µ = 149.89 < µ = 150.20). In the case of SS, non-diabetic patients had a slightly higher SS level than diabetic patients (µ = 154.03 > µ = 144.38), but the results were not statistically significant. As Figure 6(c) shows, non-BP patients showed a larger difference in SC than BP patients (mean rank µ = 155.67 > µ = 139.52), while in SS, non-BP patients showed only a minor difference from BP patients (µ = 148.78 < µ = 152.25); however, the experiment was unable to establish significance for either result.

Experiment-3
This experiment was conducted to verify the association between the gender and smoking levels of CHD patients. The experiment used the non-parametric χ 2 test, which considers non-metric variables and takes a cross-tabulation as input, to explore the association of gender level with smoking level; both are nominal categorical variables. Table 8 shows the cross-tab of observed values of gender and smoking levels, and Table 9 shows the expected values with the χ 2 contribution of each cell. Residuals (Observed - Expected) are also marked in this table. A positive residual means the observed value is higher than the expected value; a negative residual (e.g., -29.7) means the observed count is lower than the expected count. Source: Own elaboration. From Table 9, the largest cell χ 2 value, 26.17, is found in the second cell: only 4 female smoker patients were observed, whereas 33.7 were expected. The second cell therefore contains far more expected cases than observed, meaning the number of observed female smokers was significantly lower than expected. The second-largest χ 2 value, 14.16, is located in the male-smoker cell; here the number of observed cases was significantly greater than expected (Observed = 92, Expected = 62.3), indicating substantially more male smokers than expected. The third-largest cell χ 2 value, 12.37, is located in the female non-smoker cell, with an observed value of 101 against an expected value of 71.30; observed female non-smokers were thus significantly more numerous than expected. The last χ 2 value, 6.70, belongs to the male non-smoker cell, in which the expected count (131.70) was considerably larger than the observed count (102).
Further, it is evident that the two groups were significantly associated, with χ 2 (1) = 59.45 (p < 0.05). Therefore, the findings suggest rejecting the null hypothesis H 03 that there is no significant association of gender with smoking level.
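The test above can be reproduced from the observed counts in Tables 8-9 with `scipy.stats.chi2_contingency`:

```python
# χ² test of independence on the gender × smoking cross-tab
# (observed counts as reported in the text).
from scipy.stats import chi2_contingency

#                smoker  non-smoker
observed = [[  4, 101],   # female
            [ 92, 102]]   # male

# correction=False disables the Yates continuity correction, matching
# the uncorrected χ²(1) ≈ 59.4 reported in the paper.
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 2), dof, p < 0.05)
```

The returned `expected` array also reproduces the expected cell counts (e.g., 33.7 female smokers) discussed above.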

Experiment-4
In this experiment, the research article checks the significance of the association between age-group and the Survival-Status levels using the follow-up months, with the help of the survival-analysis Cox regression model. A patient is censored either because the follow-up study period ended without the patient experiencing the event, or because the patient was lost to follow-up. Since Cox regression is a semi-parametric model, model fitting did not estimate an intercept (the baseline hazard). Diabetes, platelets, gender, and smoking were all shown to be non-significant factors in the Cox model in Table 11 below.
The negative age-group coefficient suggests that, in these data, the risk of death from CVD is lower for "Adults" than for the "Very Old". The hazard ratio (HR) of Age-Group, exp(-0.60) = 0.55, indicates roughly 55 deaths in the lower-risk age-group level for every 100 deaths caused by CHD at each observation point. According to the table, CPK, anemia, EF, and high BP were found significant towards the Survival-Status levels (p < 0.05). Figure 7 visualizes the survival analysis of patients' follow-up months for each Age-Group level using the Kaplan-Meier survival curve. It is evident that the survival rate for the "Very Old" age-group was lower than that of the "Adult" group, and the difference between the two levels was statistically significant, with a log-rank p-value of .003 (p < 0.05) and χ 2 = 8.565 at df = 1. Crosses on the curves indicate censored patients. Therefore, the experiment's findings reject the null hypothesis H 04 that there is no significant association between age-group and the Survival-Status levels.
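The Kaplan-Meier curve in Figure 7 is built from the product-limit formula: at each event time, the running survival probability is multiplied by (at-risk − deaths) / at-risk, with censored patients leaving the risk set without triggering a step. A minimal pure-Python sketch, on made-up follow-up data rather than the study's records:

```python
# Minimal Kaplan-Meier (product-limit) estimator; 1 = death, 0 = censored.
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = censored = 0
        while i < len(order) and times[order[i]] == t:
            if events[order[i]]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk   # product-limit step
            curve.append((t, surv))
        at_risk -= deaths + censored               # censored leave risk set
    return curve

months = [4, 6, 6, 8, 10, 12, 12, 15]   # hypothetical follow-up months
dead   = [1, 1, 0, 1,  0,  1,  0,  0]
print(kaplan_meier(months, dead))
```

In practice a library such as `lifelines` provides this estimator (together with the Cox model and log-rank test) ready-made; the sketch only shows the mechanics behind the curve.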

Experiment-5
This experiment predicted the gender (Woman, Man) of heart patients using various classifiers. In Figure 9, the TPR is plotted against the False Positive Rate (FPR) at various threshold values to construct ROC curves, an essential tool for diagnostic test evaluation. The AUC (area under the ROC curve) is another method for measuring a classifier's predictive power: the larger the AUC value, the better the classifier. The best mean AUC, for RF (Figure 9(b)), is 0.97 with an SD of ±0.07, followed by the AUC for SVM (Figure 9(h)) at 0.79 with SD ±0.05. In the 10-fold cross-validation test, RF outperformed the other classifiers with 96% accuracy, 97% sensitivity, and 95% specificity. Based on its greater AUC, the RF ROC appears to perform better than the other proposed models, as shown in the ROC chart. The worst mean AUC, for k-NN (Figure 9(d)), is 0.55 with an SD of 0.12, followed by the GNB AUC (Figure 9) with an SD of 0.05. According to the experimental results, the proposed RF approach outperformed prior approaches addressed in the literature in terms of cross-validation accuracy.
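The ROC construction described above can be sketched with scikit-learn. The labels and scores below are synthetic, not the paper's predictions:

```python
# Illustrative ROC curve and AUC for a binary classifier; the predicted
# probabilities and labels are made up for demonstration.
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]  # predicted P(class=1)

# Each threshold yields one (FPR, TPR) point on the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)   # area under the ROC curve
print(list(zip(fpr, tpr)), round(auc, 4))
```

An AUC of 1.0 means perfect ranking of positives above negatives, while 0.5 corresponds to random guessing, which is why the k-NN result of 0.55 reads as barely better than chance.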

Experiment-6
This experiment predicted the Age-Group (Adult, Very Old) of heart patients using various classifiers (Table 13). In terms of MCC (+0.92, SD 0.23), F1-Score (0.96, SD 0.11), and accuracy (0.96, SD 0.11), the RF classifier is the best-performing classifier. The SVM classifier performs worst of all the listed algorithms in terms of MCC (0.02, SD 0.17), while the k-NN classifier performs worst in the F1-Score ranking (0.62, SD 0.08) and the GNB classifier performs worst in accuracy (0.55, SD 0.05). Indeed, as our experiment shows, RF performed outstandingly in both recall (TP rate = 0.97, SD 0.07) and specificity (TN rate = 0.95, SD 0.16), together with the top MCC rating. A significant number of patients' age groups could not be accurately predicted by k-NN (TPR = 0.63, SD 0.14), and the Adult group could not be reliably predicted by the GNB classifier (TNR = 0.17, SD 0.15). Higher accuracy values again misled: a closer review of the findings showed that the SVM-SVC performed badly on the true negatives (TNR = 0), as did GNB (TNR = 0.17), with fewer patients correctly observed. In Figure 11(a), the TPR, here representing the Very-Old age-group (0.86), is high for the RF classifier. A substantial number of Adult patients (107) were misclassified as Very-Old by the GNB classifier (Figure 11(f)), and a considerable number of Very-Old patients (62) were misclassified as Adults by the k-NN classifier (Figure 11(b)). As previously stated, the RF classifier has an extremely low rate of misclassification. The AUC-ROC score represents the capability of a model to distinguish among classes. From Figure 12(b), it can clearly be observed that RF (AUC = 0.94, SD 0.16) is the best classifier, followed by GNB (AUC = 0.67) and LR (AUC = 0.64) in Figures 12(f) and (g).

Experiment-7
This experiment identified the survival (Survival-Status) of patients using various classifiers (Table 14). The RF classifier (MCC = +0.91, ±0.11 SD) is the best-performing classifier in the MCC ranking, the F1-Score ranking (0.94, ±0.07 SD), and the accuracy ranking (0.96, ±0.06 SD), followed by DT with MCC = +0.63 (±0.11 SD; the same MCC as XGB at +0.63 but with a higher ±0.12 SD), F1-Score 0.75 (±0.07 SD), and accuracy 0.83 (±0.05 SD). The k-NN classifier performs worst among all specified algorithms in MCC (+0.06, ±0.16 SD) and accuracy (0.61, ±0.06 SD), while in the F1-Score ranking SVM-SVC performs worst, with 0.31 (±0.11 SD). As previously mentioned, we concentrate on the MCC rating for binary classifications like this because this measure only produces a high score if the classifier correctly predicts the majority of both positive and negative data instances. Indeed, the top MCC-ranked classifier, RF, performed admirably on both recall (TP rate = 0.93) and specificity (TN rate = 0.98), with ±0.08 and ±0.07 SD, respectively. The F1-Score and accuracy rankings hide a fundamental weakness of the top classifiers there: k-NN could not predict a large percentage of patients correctly, whereas the MCC rating takes this into account. Accuracy values will deceive the researcher once more: a closer examination of the results reveals that the radial SVM performed poorly on the true positives (TP rate = 0.13), correctly observing fewer patients. As per Table 14, the lowest accuracy (61%) was achieved by the k-NN algorithm. Among all the proposed algorithms, the RF classifier's CM displays remarkable results. Here, TP represents the total number of correctly predicted Dead patients, and TN the total number of correctly predicted Alive patients. In Figure 14(c), RF achieved the greatest numbers of TP and TN, with 89 and 198, respectively.
Further, the second-highest TN count is achieved by the LR algorithm (Figure 14(e)), with a TN of 184, and the second-highest TP count by the DT algorithm (Figure 14(d)), with 77, both following the RF classifier's TN and TP counts. A substantial number of Alive patients (31) are misclassified as Dead by the k-NN classifier (Figure 14(b)), and a considerable number of Dead patients (67) are misclassified as Alive. As previously stated, the RF classifier has a very low rate of misclassification in Table 14, and this is also reflected in its CM, with only 7 Dead patients misclassified as Alive and only 5 Alive patients misclassified as Dead. As observed, the TPR and TNR of RF, at 93% and 97%, are the highest. Therefore, RF can be considered the best-performing classifier. In Figure 15, the ROC curve plots the TPR against the FPR at various threshold values. As we know, a classifier's superiority can also be measured by a larger AUC. RF's best mean AUC (Figure 15(b)) is 0.97 with an SD of ±0.08, followed by the AUC for LR (Figure 15(c)) at 0.96 with SD ±0.05. The worst mean AUC, for k-NN (Figure 15(d)), is 0.50 with an SD of 0.06, followed by the next-higher algorithm, DT (Figure 15). Both feature-ranking classifiers (RF and XGB) consider SC and EF significant features for predicting the Survival-Status, but XGB ranks gender 3rd while RF ranks it 5th.

Discussion
A significant p-value is critical for evaluating a hypothesis in statistical tests. To test the first two hypotheses, the Mann-Whitney U test is used in this study; it plays a vital role because the data lack normality. The present paper investigated the impact of SC and SS on the Survival-Status levels and verified whether complications such as anemia, diabetes, and high BP affect SC and SS. It also validated the association between gender and smoking habit among CVD patients, as well as the association between age-group and Survival-Status. In addition, a χ 2 test of association is performed to investigate the relationship between gender and smoking habits, and the association between age-group and Survival-Status is verified with a Cox regression model.
The first null hypothesis, "H 01 : No significant difference between Alive and Dead towards SC and SS," is rejected (p < 0.05): statistically significant differences are found in SC and SS across the Survival-Status levels. The second set of null hypotheses, "H 02a : No significant differences between non-anemic and anemic levels towards SC and SS", "H 02b : No significant differences between non-diabetic and diabetic levels towards SC and SS", and "H 02c : No significant differences between non-BP and BP levels towards SC and SS," are all found non-significant (p > 0.05). Furthermore, the gender of the patients and their smoking habits are found to be statistically significantly associated (p < 0.05); as a result, the third null hypothesis, "H 03 : No significant relationship between gender and smoking level," is rejected. During the study, the actual count of female smoker patients (4) was found to be significantly lower than expected (33.7), in contrast to female non-smoker patients. Actual male smoker patients (92) were substantially more numerous than expected (62.3), but the same did not hold for male non-smokers: actual males without the smoking habit (102) were significantly fewer than expected (131.70) during the follow-up months. The present study also uses a Cox regression model to explore the association between age-group and Survival-Status levels and demonstrates a statistically significant association between these two attributes. Therefore … [16]. For the complications (H 02a , H 02b , and H 02c ) of anemia, diabetes, and high blood pressure, the influence on SC and SS was not found significant (p > 0.05), contradicting [10,17] and [19] but supporting [14]. The gender-smoking-habits association finding (H 03 ) is significant (p < 0.05), supporting [21]. The age-group Survival-Status association finding (H 04 ) is significant (p < 0.05), rejecting [24] but supporting [15].
The reported CDF-DI was performed on a heart-disease dataset and showed promising results compared to previous models in improving prediction accuracy. For comparison, we used eight state-of-the-art MLAs (GNB, LR, GBM, SVM, DT, XGB, k-NN, and RF) throughout the study, all with an established track record for accuracy and efficiency in the research community. All models were subjected to 10-fold cross-validation, and six performance metrics were collected: accuracy, precision, TPR, F1-measure, MCC, and TNR. The RF classifier was found superior in all mentioned performance rankings (MCC, F1-Score, and accuracy) across all machine learning prediction goals.
With the RF machine learning model, the proposed model outperformed the others in predicting patients' gender, obtaining accuracy of up to 94% and 95% in the remaining performance metrics, i.e., precision, TPR, and F1-Score. The proposed CDF-DI model has the highest MCC values, up to 0.87, proving its superiority over the other models. Furthermore, the proposed model had the lowest FPR and the highest TNR, at 9% and 91%, respectively. The suggested model's low FPR and high TNR values demonstrate the CDF-DI model's capacity to reduce miss rates and improve prediction accuracy for both negative and positive subjects. Table 12 displays the comprehensive performance findings for predicting patients' gender. GNB was the worst performer in the MCC, F1, and accuracy rankings, with 0.06, 71%, and 59%, respectively.
Further, predicting the age group of patients displayed encouraging results with the RF model. The proposed model performed best with RF in all key performance criteria, with precision, TPR, F1-Score, and accuracy of up to 95%, 97%, 96%, and 96%, respectively. RF's most significant MCC values, up to 0.92, are found in the CDF-DI model, demonstrating its superiority over the remaining proposed MLAs. Furthermore, the proposed model exhibited the lowest FPR and the greatest TNR, at 5% and 95%, respectively. The GNB model was the worst-performing model in the MCC and accuracy rankings, with scores of 0.02 and 55%, respectively, while the k-NN model was worst in F1-Score during age-group prediction, with a 62% score. The full performance findings are shown in Table 15. Furthermore, the RF model has also shown promising results in predicting the Survival-Status of patients. All six significant performance measures, precision, TPR, F1-Score, accuracy, MCC, and TNR, are better with RF, at up to 95%, 94%, 96%, 0.91, and 97%, respectively, in the CDF-DI model. The k-NN model proved weakest in the MCC and accuracy rankings, and in the F1-Score ranking the SVM model was found weak. Table 14 summarizes all of the performance findings.
This is incredibly encouraging for hospital settings: even if several laboratory test values and health conditions were absent from a patient's electronic health record, doctors could still predict patient survival by evaluating the EF, SC, and CPK values alone. The present research also yielded several intriguing outcomes that differed from the findings of studies on the same dataset [37]. Davide Chicco et al. identified EF, SC, age, CPK, and gender as the top five features for predicting Survival-Status, while Tanvir Ahmad et al. [16] identified age, SC, high BP, EF, and anemia as the top essential features. This study found SC, EF, platelets, CPK, and gender to be the important features playing an essential role in predicting Survival-Status with the RF classifier, as depicted in Figure 16. We found EF in 2nd position and also found platelets to be an essential feature that the previous studies had not identified. The present paper also improves the accuracy, F1-Score, and MCC by 0.22, 0.39, and 0.53, respectively, with the RF classifier over the other models, as depicted in Table 15.
The experimental results show that the supervised machine learning models predicted the age group and gender of heart failure patients very efficiently. Tree-based algorithms performed well on the imbalanced dataset using the 10-fold cross-validation method. As displayed in Figure 13, RF identified CPK, SC, follow-up month, platelets, and EF as significant features when predicting the age group (Adult, Very Old) of patients; the RF classifier also identified smoking, CPK, platelets, follow-up month, and SC. These methods become beneficial in patient care because doctors can predict a patient's age group from only five significant characteristics. RF and XGB both extracted SC and EF as crucial features for predicting the age-group target variable, while smoking, CPK, and platelets were found by both RF and XGB to be vital features for predicting the gender of patients.
The top five input features playing a vital role in predicting Survival-Status can be read off directly from Figure 16. The top five features selected by the RF and XGB feature-selection techniques are follow-up month, SC, EF, CPK, and anemia; the RF feature selector and the XGB feature selector have a lot in common. The follow-up month has the highest ranking (0.49), whereas anemia has the lowest score (0.01), as extracted by the RF feature selector. XGB likewise treats the follow-up month as the most important feature (0.40) and gives the lowest rank to anemia (0.02). The findings suggest that patients' SC, EF, platelets, and CPK need special attention for their survival during the follow-up months.

Conclusion
The present research confirmed that our traditional biostatistical analysis signifies the importance of an appropriate sodium level and a normal creatinine level in the human body. The results revealed the significance of SC and SS towards the Survival-Status (Alive/Dead) of CVD patients. They also showed that health complications such as anemia level, high BP level, and diabetic level have no significant effect on the SC and SS levels. This paper also found a significant association between patients' smoking habits and gender. On the one hand, the authors found that the actual count of female smokers is lower than the expected count; on the other hand, the actual counts of male smokers and female non-smokers were found to be significantly higher than the expected counts. The study also observed that elderly patients (the Very-Old age-group) are more susceptible to HD mortality; Figure 7 confirmed that patients in the Very-Old age-group are more mortality-prone than Adult patients. Further, the authors applied eight machine learning algorithms to identify the gender, age-group, and Survival-Status of the patients with improved accuracy compared to the previous study. Smoking, platelets, CPK, SC, and EF were found to be the most prominent predictive features. In addition to the earlier studies' features, EF and SC [14,31], the authors recommend three more features, platelets, CPK, and gender, to identify the Survival-Status of the patient.
The modest size of the dataset (299 patients) is a constraint of this study; a more extensive dataset would have allowed us to achieve more reliable results. Other information on the patients' physical characteristics (height, weight, BMI, hormones, etc.) and their work history might have helped detect additional risk factors for CVD.
Future work may include principal component analysis [65] to transform the existing features and enhance classification accuracy, and the application of statistical algorithms to prove the comparative strength of the results [66]. Moreover, a real-time implementation of the present research would support medical decision-support systems and help doctors examine cardiac patients. A novel diagnostic system can also be designed and developed for IoT-enabled CDF-DI models.