Submitted:
10 November 2024
Posted:
11 November 2024
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
3.1. Datasets
- Public Dataset 1: A heart disease classification dataset with 70,000 records, widely used for predictive modeling in cardiovascular health.
- Public Dataset 2: A smaller dataset containing 1,190 records, which includes various features related to heart disease.
- Locally Collected Dataset: A dataset with 600 records collected from local hospitals, containing demographic and clinical data related to cardiovascular conditions.
3.2. Proposed Model
- Deep Learning Models:
  - Convolutional Neural Networks (CNN): CNNs are employed for feature extraction from the input data, learning patterns from raw ECG signals or other features.
  - Long Short-Term Memory (LSTM): LSTMs are used to capture temporal dependencies in the data, making them suitable for sequential data like ECG signals.
- Machine Learning Models:
  - K-Nearest Neighbors (KNN): KNN is a non-parametric method that classifies each record according to the labels of its nearest neighbors in feature space.
  - XGBoost (XGB): XGBoost is a gradient-boosting ensemble algorithm that builds a classifier by combining the predictions of multiple decision trees.
- Majority Voting Ensemble: The final prediction is made through majority voting, where each classifier casts a vote on the predicted class, and the class with the most votes is selected as the final output.
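
The voting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy predictions, and the tie-breaking rule are assumptions. With four voters and two classes a 2-2 tie is possible, and the source does not say how ties are resolved; the sketch assumes they fall back to the positive class.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def majority_vote(votes: Sequence[Hashable], tie_break: Hashable = 1) -> Hashable:
    """Return the label with the most votes; fall back to tie_break on a tie."""
    counts = Counter(votes).most_common()
    # A tie exists when the two most common labels have the same count.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break  # assumption: ties resolve to the positive class
    return counts[0][0]


def ensemble_predict(per_model_predictions: Dict[str, List[int]]) -> List[int]:
    """Combine per-model label lists (one label per sample) by hard voting."""
    columns = zip(*per_model_predictions.values())
    return [majority_vote(sample_votes) for sample_votes in columns]


# Hypothetical predictions for five patients (1 = disease, 0 = no disease).
predictions = {
    "cnn":  [1, 0, 1, 0, 1],
    "lstm": [1, 0, 0, 0, 1],
    "knn":  [0, 0, 1, 1, 0],
    "xgb":  [1, 1, 1, 0, 0],
}
final = ensemble_predict(predictions)
print(final)
```

In this toy input the last patient receives two votes for each class, so the result for that patient depends entirely on the assumed tie-breaking rule.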
3.3. Model Training and Evaluation
3.4. Preprocessing Techniques
- Data Cleaning: Missing values were handled using imputation methods, and categorical variables were encoded into numeric representations.
- Normalization: Features were scaled to a consistent range to prevent model bias towards variables with larger magnitudes.
- Data Augmentation: For limited datasets, synthetic data generation techniques were applied to enhance model training.
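
The cleaning and normalization steps above can be sketched as follows. The source does not name the imputation, encoding, or scaling methods, so mean imputation, integer encoding, and min-max scaling are assumptions chosen for illustration, and all function names and values are hypothetical. Data augmentation is omitted because the synthetic data technique is not specified.

```python
from statistics import mean
from typing import List, Optional


def impute_mean(column: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def encode_categories(column: List[str]) -> List[int]:
    """Map each category to an integer code, in sorted order of the labels."""
    codes = {label: i for i, label in enumerate(sorted(set(column)))}
    return [codes[v] for v in column]


def min_max_scale(column: List[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical toy records, not taken from any of the three datasets.
age = [50.0, None, 70.0, 60.0]
sex = ["F", "M", "M", "F"]

age_scaled = min_max_scale(impute_mean(age))
sex_encoded = encode_categories(sex)
print(age_scaled, sex_encoded)
```

In a real pipeline the imputation mean and the scaling bounds would normally be computed on the training split only and then reused on the test split, so that no information leaks from the evaluation data.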
4. Equations and Algorithm
4.1. Majority Voting
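
Based on the description of the ensemble in Section 3.2, hard majority voting over the four base classifiers can be written as below. The notation is assumed here rather than taken from the original: $h_m$ denotes the $m$-th base classifier (CNN, LSTM, KNN, XGBoost, so $M = 4$), $\mathcal{C}$ the set of class labels, and $\mathbb{1}[\cdot]$ the indicator function.

```latex
\hat{y}(x) = \arg\max_{c \in \mathcal{C}} \sum_{m=1}^{M} \mathbb{1}\left[ h_m(x) = c \right]
```

For binary labels and $M = 4$, the sum can reach 2 for both classes, in which case the $\arg\max$ is not unique and a tie-breaking rule is needed.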
4.2. Algorithm for Model Prediction
Algorithm 1: Hybrid Model for Cardiovascular Disease Prediction
5. Results
| Dataset | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| Dataset 1 | 0.92 | 0.91 | 0.93 | 0.92 |
| Dataset 2 | 0.88 | 0.87 | 0.90 | 0.88 |
| Dataset 3 | 0.94 | 0.93 | 0.95 | 0.94 |
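
As a consistency check on the table above, the F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean, F1 = 2PR / (P + R). The sketch below uses only the values from the table; the function and variable names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table.
reported = {
    "Dataset 1": (0.91, 0.93, 0.92),
    "Dataset 2": (0.87, 0.90, 0.88),
    "Dataset 3": (0.93, 0.95, 0.94),
}

# Recompute F1 and round to the two decimals used in the table.
recomputed = {
    name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()
}
for name, (_, _, f1) in reported.items():
    print(name, "recomputed:", recomputed[name], "reported:", f1)
```

For all three datasets the recomputed value matches the reported F1-score at two decimal places.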
6. Conclusion
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).