Preprint
Article

This version is not peer-reviewed.

Designing Algorithms for Wearable Sensors: Insights from the Framingham Heart Study Dataset for Enhanced Cardiovascular Health Monitoring

A peer-reviewed article of this preprint also exists.

Submitted: 25 February 2025

Posted: 26 February 2025

Abstract
Wearable sensors hold promise for advancing cardiovascular health monitoring by enabling continuous, real-time risk assessment. This study used the Framingham Heart Study dataset to develop and evaluate machine learning (ML) algorithms for predicting mortality risk from key cardiovascular parameters. Five ML models were implemented: XGBoost, Random Forest, Logistic Regression, Ensemble Learning, and Ensemble Stacking. Among these, XGBoost and Ensemble Stacking demonstrated the highest predictive performance, with an area under the curve (AUC) of 0.83. Feature importance analysis identified coronary heart disease (PREVCHD), glucose levels (GLUCOSE), and diastolic blood pressure (DIABP) as significant risk factors for mortality. Given the advancements in wearable sensor technology for measuring and estimating glucose levels and blood pressure, these findings underscore the potential of wearable devices for effective cardiovascular risk prediction. This study highlights the feasibility of integrating machine learning algorithms with wearable sensors to enhance cardiovascular health monitoring and facilitate early intervention.

1. Introduction

Identifying predictors of cardiovascular disease (CVD) is essential for effective prevention and management. Numerous studies have explored the risk factors for CVD, and it is well established that smoking, hypertension, diabetes, and dyslipidemia contribute to its development [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. Yuda et al. (2021) investigated the redundancy among such predictors and showed that multiple indicators often overlap in their predictive ability [22]. Research is also progressing on heart rate indicators and CVD risk [23,24,25,26]. However, these studies primarily identified risk factors from clinical measurements obtained during hospital visits rather than from data collected in free-living conditions using wearable devices. As a result, the practical application of these findings to continuous, real-time cardiovascular risk monitoring in daily life has not yet been fully realized. Recent advances in wearable sensors have enabled non-invasive, continuous monitoring of physiological parameters such as heart rate, blood pressure, and blood glucose levels, providing an opportunity to shift risk prediction from clinical settings to everyday environments.
The Framingham Heart Study (FHS) provides a valuable resource for investigating cardiovascular disease risk factors. This study, which began in 1948 in Framingham, Massachusetts, USA, initially enrolled more than 5,000 participants and has since expanded to include multiple generations [27,28,29,30,31,32,33,34,35,36]. Its primary objective is to identify common risk factors for cardiovascular disease by following participants over an extended period and collecting comprehensive health data, including lifestyle factors, clinical measurements, and laboratory results. This dataset includes parameters such as blood pressure, cholesterol levels, blood glucose levels, and educational background, along with long-term follow-up data on cardiovascular events and mortality. As a result, the Framingham dataset is one of the most influential resources in cardiovascular epidemiology.
Several studies have already used the Framingham dataset to evaluate models for predicting cardiovascular-related mortality. Prior research includes comparative studies of machine learning algorithms, Bayesian analyses that account for time-dependent treatment and heteroskedasticity, evaluations of methods for imputing missing data, and multi-model data mining approaches for predicting heart failure [37,38,39,40]. Many of these studies focused on improving the accuracy of risk prediction by addressing missing data with techniques such as multiple imputation and deep learning-based approaches. While these studies have advanced cardiovascular risk stratification, none specifically aimed to translate its findings into wearable sensor-based applications.
Therefore, the objective of this study is to analyze the Framingham Heart Study dataset using various machine learning approaches and identify cardiovascular risk factors that can be effectively monitored and estimated using wearable devices. By focusing on parameters measurable with current wearable technology, this study aims to bridge the gap between traditional epidemiological research and continuous cardiovascular risk monitoring in real-world settings.

2. Materials and Methods

2.1. Study Design and Population

This study utilized the Framingham Heart Study (FHS) dataset, a well-established longitudinal cohort study designed to identify risk factors for cardiovascular disease (CVD). The dataset includes comprehensive health data collected from multiple generations of participants, with long-term follow-up on cardiovascular events and mortality.

2.2. Data Collection

The dataset included 41 variables encompassing demographic information, clinical measurements, lifestyle factors, and cardiovascular outcomes. The variables are summarized in Table 1.
Among these variables, DEATH was used as the primary outcome variable, representing all-cause mortality. Independent variables included age, sex, blood pressure (SYSBP and DIABP), cholesterol levels (TOTCHOL, HDLC, and LDLC), glucose levels (GLUCOSE), smoking status (CURSMOKE and CIGPDAY), body mass index (BMI), diabetes status (DIABETES), and medication use (BPMEDS). Data preprocessing involved handling missing values through multiple imputation techniques, standardizing continuous variables, and encoding categorical variables as appropriate. The dataset was then split into training and testing sets using a stratified approach to maintain the proportion of outcomes across both sets.
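The preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It uses synthetic values in place of the Kaggle file, it substitutes single median imputation (scikit-learn's `SimpleImputer`) for the multiple imputation the study reports, and it assumes an 80/20 split, which the text does not state. Only the column names are taken from Table 1.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for a few Table 1 columns (all values are made up).
df = pd.DataFrame({
    "AGE": rng.integers(32, 81, n).astype(float),
    "SYSBP": rng.normal(135, 20, n),
    "DIABP": rng.normal(83, 12, n),
    "GLUCOSE": rng.normal(85, 25, n),
    "BMI": rng.normal(26, 4, n),
    "CURSMOKE": rng.integers(0, 2, n).astype(float),
    "DIABETES": rng.integers(0, 2, n).astype(float),
    "DEATH": rng.integers(0, 2, n),
})
# Blank out roughly 10% of GLUCOSE to mimic missing laboratory values.
df.loc[rng.random(n) < 0.10, "GLUCOSE"] = np.nan

continuous = ["AGE", "SYSBP", "DIABP", "GLUCOSE", "BMI"]
binary = ["CURSMOKE", "DIABETES"]
X = df[continuous + binary]
y = df["DEATH"]

# Stratified split: the DEATH proportion is kept (almost) equal in both sets.
# The 80/20 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Impute, then standardize continuous variables; binary flags are only imputed.
preprocess = ColumnTransformer([
    ("cont", Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # stand-in for multiple imputation
        ("scale", StandardScaler()),
    ]), continuous),
    ("bin", SimpleImputer(strategy="most_frequent"), binary),
])

# Fit on the training set only, so no test-set statistics leak into training.
X_train_p = preprocess.fit_transform(X_train)
X_test_p = preprocess.transform(X_test)
```

Fitting the imputer and scaler on the training portion alone, and only applying them to the test portion, is what keeps the later test-set evaluation independent of the training data.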

2.3. Machine Learning Analysis

To identify key risk factors associated with mortality and to evaluate predictive performance, five machine learning algorithms were applied:
  • XGBoost: a gradient boosting algorithm known for its high predictive accuracy and its ability to handle complex interactions between variables.
  • Random Forest: an ensemble learning method that constructs multiple decision trees and outputs the mode of the classes (classification) or the mean prediction (regression).
  • Logistic Regression: a traditional statistical model used as a baseline for binary classification tasks.
  • Ensemble Learning: a voting-based approach that combines the predictions of multiple models to improve overall accuracy.
  • Ensemble Stacking: a meta-learning technique in which the predictions of multiple base models are combined by a higher-level model to further enhance prediction accuracy.
Each model was trained using the training dataset and evaluated on the test dataset. Model performance was assessed using the area under the receiver operating characteristic curve (AUC) as the primary metric. Feature importance was analyzed to identify the most significant predictors of mortality.
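As a rough sketch of this comparison, the five model types can be trained and scored by test-set AUC as below. Several things here are assumptions rather than the study's setup. The data are synthetic (`make_classification`), scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so that the example depends on one library only, the voting ensemble uses soft voting, and all hyperparameters are illustrative defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-outcome data standing in for 15 predictors and DEATH.
X, y = make_classification(
    n_samples=600, n_features=15, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def base_learners():
    """Return fresh, unfitted base models (hyperparameters are illustrative)."""
    return [
        ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ]


models = dict(base_learners())
# Voting ensemble: averages the predicted probabilities of the base models.
models["voting"] = VotingClassifier(estimators=base_learners(), voting="soft")
# Stacking ensemble: a logistic regression learns how to combine the
# cross-validated predictions of the base models.
models["stacking"] = StackingClassifier(
    estimators=base_learners(),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # AUC is computed from the predicted probability of the positive class.
    auc[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

for name, value in auc.items():
    print(f"{name:>8s}  AUC = {value:.3f}")
```

The AUC values printed by this sketch describe the synthetic data only and are not comparable to the 0.83 reported in the Results.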
Hyperparameter tuning was conducted using grid search and cross-validation to optimize each model's performance. The final evaluation was based on the test set, ensuring an unbiased assessment of predictive accuracy.
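Grid search with cross-validation, as described above, might look like the following for one of the models. The parameter grid, the five-fold setting, and the synthetic data are all hypothetical choices made for illustration; the study does not report which values were searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic data; the real analysis would use the preprocessed FHS variables.
X, y = make_classification(
    n_samples=400, n_features=15, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search space: every combination is tried.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    scoring="roc_auc",  # same metric as the final evaluation
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
# Cross-validation uses the training set only; the test set stays untouched.
search.fit(X_train, y_train)

# By default GridSearchCV refits the best configuration on the whole training
# set, so it can be scored once on the held-out test set.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print("best parameters:", search.best_params_)
print(f"cross-validated AUC = {search.best_score_:.3f}, test AUC = {test_auc:.3f}")
```

Because model selection only ever sees the training folds, the single test-set score at the end is not biased by the search itself.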

3. Results

Among the five machine learning models used in the analysis (XGBoost, Random Forest, Logistic Regression, Ensemble Learning, and Ensemble Stacking), XGBoost and Ensemble Stacking demonstrated the highest performance in predicting mortality, with an area under the curve (AUC) of 0.83 (Figure 1).
Among the 15 parameters extracted from the Framingham dataset, coronary heart disease (PREVCHD), blood glucose (GLUCOSE), and diastolic blood pressure (DIABP) were identified as significant risk factors associated with increased mortality (Figures 2, 3 and 4).

4. Discussion

The results of this study confirmed that glucose levels and blood pressure are significant factors associated with mortality risk. Recent advancements in wearable sensor technology have made it possible to continuously monitor these parameters in daily life. Notably, minimally invasive blood glucose monitoring has progressed considerably, with technologies now available to estimate long-term glucose levels from interstitial fluid using shallow needles [41]. Similarly, non-invasive methods for estimating blood pressure via optical sensors embedded in smartwatches and activity trackers are becoming increasingly practical [42,43,44,45,46,47,48]. These advancements enable real-time health monitoring without relying solely on spot measurements taken at medical facilities. The strong association between glucose levels, diastolic blood pressure, and mortality risk highlighted by this study underscores the importance of using wearable sensors for continuous monitoring. Elevated blood glucose levels are known to increase the risk of heart disease through diabetes and metabolic syndrome. By tracking glucose levels daily, early intervention and lifestyle modifications can be more effectively implemented. Likewise, sustained increases in diastolic blood pressure place chronic stress on the cardiovascular system, increasing the risk of arteriosclerosis and heart failure, making real-time monitoring highly valuable.
Wearable sensors are especially beneficial for older adults and individuals living in remote areas who may have difficulty accessing healthcare facilities regularly. These devices facilitate early medical intervention by detecting health risks before symptoms become apparent. Additionally, self-reported information, such as the presence of coronary artery disease, can be integrated with real-time data to provide personalized risk assessments and health management. Long-term data collection via wearable sensors offers several advantages over conventional cross-sectional studies, including risk assessment that accounts for temporal fluctuations. This approach not only enables tailored lifestyle guidance and timely treatment but also has the potential to identify new predictive markers.
However, this study has several limitations, primarily related to the dataset used for analysis. The findings were based on the Framingham dataset, which, while demonstrating the utility of machine learning models for heart disease risk prediction, has inherent biases. The Framingham Heart Study began in 1948, primarily involving middle-class white individuals in Framingham, Massachusetts. Consequently, the participants' socioeconomic background, lifestyle, and genetic factors were relatively homogeneous, limiting the generalizability of the results to other ethnic groups and diverse social environments. Addressing health disparities in low-income countries and among underrepresented populations will require analysis using more diverse datasets.
Another limitation concerns gender bias and the impact of sex differences. Although the Framingham dataset includes substantial data on women, the male-to-female ratio is uneven, potentially influencing risk factor analysis. Cardiovascular risk is known to increase in postmenopausal women due to hormonal changes, a factor not fully captured by the dataset [49,50,51,52,53,54]. Our reanalysis of gender and mortality showed a minimal effect, but previous studies have reported that women face higher heart disease risks from diabetes and hypertension compared to men.
Additionally, the 15 parameters used in this study were derived from the Framingham dataset, excluding other potential risk factors such as inflammatory markers, mental stress, and genetic predispositions. The lack of detailed lifestyle data (such as diet, exercise, and sleep) further limits the comprehensiveness of the risk assessment. Integrating these factors through continuous wearable monitoring could significantly enhance prediction accuracy. Given that health conditions change over time, wearable sensors address the limitations of static data by enabling dynamic risk estimation that accounts for temporal variations. Achieving personalized risk assessment tailored to an individual's living environment and behavior will require a comprehensive approach, integrating a broader range of biological signals and contextual data. Furthermore, optimizing machine learning models to effectively process and interpret the vast datasets generated by wearable sensors will be essential for advancing precision health monitoring.

5. Conclusions

In this study, we applied five types of machine learning models (XGBoost, Random Forest, Logistic Regression, Ensemble Learning, and Ensemble Stacking) to the Framingham dataset and compared their accuracy in predicting mortality (DEATH). XGBoost and Ensemble Stacking had the highest prediction performance (AUC = 0.83). Furthermore, among the 15 parameters extracted from the dataset, coronary heart disease (PREVCHD), glucose level (GLUCOSE), and diastolic blood pressure (DIABP) were important factors strongly associated with the risk of death. These results show that parameters that can be measured or estimated by wearable sensors, such as glucose levels and blood pressure, play an important role in predicting mortality risk, suggesting that wearable technology may be useful for managing cardiovascular risk in the future.

Supplementary Materials

The Framingham dataset used in this study is publicly available on Kaggle: Kaggle Framingham Heart Study Dataset (https://www.kaggle.com/datasets/aasheesh200/framingham-heart-study-dataset). For further details on the dataset, including variable definitions and additional descriptive statistics, please refer to the accompanying Kaggle page.

Author Contributions

Conceptualization, E.Y.; methodology, E.Y. and I.K.; software, D.H.; validation, D.H.; formal analysis, D.H.; investigation, Y.E.; resources, D.H. and I.K.; data curation, E.Y.; writing—original draft preparation, E.Y.; writing—review and editing, E.Y.; visualization, D.H.; supervision, E.Y.; project administration, E.Y.; funding acquisition, E.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study involves the analysis of open data. The open data does not contain personally identifiable information (such as addresses or names), and since it does not involve human subjects, ethical review board approval is not required.

Informed Consent Statement

In this study, informed consent is not required as the research involves the analysis of open data that does not contain personally identifiable information. The data used in this study is anonymized and does not pertain to human subjects directly, thus making the informed consent process unnecessary.

Data Availability Statement

The data used in this study is publicly available from Kaggle's Framingham Heart Study dataset. The dataset can be accessed at the following URL: https://www.kaggle.com/datasets/. The data is provided under the terms of the relevant license and can be freely accessed for research purposes.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The abbreviations used in this paper are explained in Section 2, Methods.

References

  1. Teo, K.K.; Rafiq, T. Cardiovascular Risk Factors and Prevention: A Perspective from Developing Countries. Can. J. Cardiol. 2021, 37, 733–743.
  2. Pirzada, A.; Cai, J.; Cordero, C.; Gallo, L.C.; Isasi, C.R.; Kunz, J.; Thyagaragan, B.; Wassertheil-Smoller, S.; Daviglus, M.L. Risk Factors for Cardiovascular Disease: Knowledge Gained from the Hispanic Community Health Study/Study of Latinos. Curr. Atheroscler. Rep. 2023, 25, 785–793.
  3. Godijk, N.G.; Vos, A.G.; Jongen, V.W.; Moraba, R.; Tempelman, H.; Grobbee, D.E.; Coutinho, R.A.; Devillé, W.; Klipstein-Grobusch, K. Heart Rate Variability, HIV and the Risk of Cardiovascular Diseases in Rural South Africa. Glob. Heart 2020, 15, 17.
  4. Rosenthal, T.; Touyz, R.M.; Oparil, S. Migrating Populations and Health: Risk Factors for Cardiovascular Disease and Metabolic Syndrome. Curr. Hypertens. Rep. 2022, 24, 325–340.
  5. Lopez-Jaramillo, P.; Lopez-Lopez, J.P.; Tole, M.C.; Cohen, D.D. Muscular Strength in Risk Factors for Cardiovascular Disease and Mortality: A Narrative Review. Anatol. J. Cardiol. 2022, 26, 598–607.
  6. Artola Arita, V.; Beigrezaei, S.; Franco, O.H. Risk Factors for Cardiovascular Disease: The Known Unknown. Eur. J. Prev. Cardiol. 2024, 31, e106–e107.
  7. Quesada, O. Reproductive Risk Factors for Cardiovascular Disease in Women. Menopause 2023, 30, 1058–1060.
  8. Miller, D.V.; Watson, K.E.; Wang, H.; Fyfe-Kirschner, B.; Heide, R.S.V. Racially Related Risk Factors for Cardiovascular Disease: Society for Cardiovascular Pathology Symposium 2022. Cardiovasc. Pathol. 2022, 61, 107470.
  9. Saba, P.S.; Parodi, G.; Ganau, A. From Risk Factors to Clinical Disease: New Opportunities and Challenges for Cardiovascular Risk Prediction. J. Am. Coll. Cardiol. 2021, 77, 1436–1438.
  10. Whelton, S.P.; Post, W.S. Importance of Traditional Cardiovascular Risk Factors for Identifying High-Risk Persons in Early Adulthood. Eur. Heart J. 2022, 43, 2901–2903.
  11. Hutchesson, M.; Campbell, L.; Leonard, A.; Vincze, L.; Shrewsbury, V.; Collins, C.; Taylor, R. Disorders of Pregnancy and Cardiovascular Health Outcomes? A Systematic Review of Observational Studies. Pregnancy Hypertens. 2022, 27, 138–147.
  12. Freak-Poli, R.; Phyo, A.Z.Z.; Hu, J.; Barker, S.F. Are Social Isolation, Lack of Social Support or Loneliness Risk Factors for Cardiovascular Disease in Australia and New Zealand? A Systematic Review and Meta-Analysis. Health Promot. J. Austr. 2022, 33, 278–315.
  13. Bergami, M.; Scarpone, M.; Bugiardini, R.; Cenko, E.; Manfrini, O. Sex Beyond Cardiovascular Risk Factors and Clinical Biomarkers of Cardiovascular Disease. Rev. Cardiovasc. Med. 2022, 23, 19.
  14. Kato, M. Diet- and Sleep-Based Approach for Cardiovascular Risk/Diseases. Nutrients 2023, 15, 3668.
  15. Hauer, R.N.W. The Fractionated QRS Complex for Cardiovascular Risk Assessment. Eur. Heart J. 2022, 43, 4192–4194.
  16. Thayer, J.F.; Yamamoto, S.S.; Brosschot, J.F. The Relationship of Autonomic Imbalance, Heart Rate Variability and Cardiovascular Disease Risk Factors. Int. J. Cardiol. 2010, 141, 122–131.
  17. Greiser, K.H.; et al. Cardiovascular Disease, Risk Factors and Heart Rate Variability in the Elderly General Population: Design and Objectives of the CARLA Study. BMC Cardiovasc. Disord. 2005, 5, 36.
  18. Wekenborg, M.K.; Künzel, R.G.; Rothe, N.; Penz, M.; Walther, A.; Kirschbaum, C.; Thayer, J.F.; Hill, L.K. Exhaustion and Cardiovascular Risk Factors: The Role of Vagally-Mediated Heart Rate Variability. Ann. Epidemiol. 2023, 87, S1047–2797.
  19. Nakayama, N.; Miyachi, M.; Tamakoshi, K.; Morikawa, S.; Negi, K.; Watanabe, K.; Moriwaki, Y.; Hirai, M. Increased Afternoon Step Count Increases Heart Rate Variability in Patients with Cardiovascular Risk Factors. J. Clin. Nurs. 2022, 31, 1636–1642.
  20. Malik, M. Heart Rate Variability. Curr. Opin. Cardiol. 1998, 13, 36–44.
  21. Møller, A.L.; Andersson, C. Importance of Smoking Cessation for Cardiovascular Risk Reduction. Eur. Heart J. 2021, 42, 4154–4156.
  22. Yuda, E.; Ueda, N.; Kisohara, M.; Hayano, J. Redundancy among risk predictors derived from heart rate variability and dynamics: ALLSTAR big data analysis. Ann. Noninvasive Electrocardiol. 2021, 26, e12790.
  23. Carney, R.M.; Blumenthal, J.A.; Freedland, K.E.; Stein, P.K.; Howells, W.B.; Berkman, L.F.; Watkins, L.L.; Czajkowski, S.M.; Hayano, J.; Domitrovich, P.P.; Jaffe, A.S. Low heart rate variability and the effect of depression on post-myocardial infarction mortality. Arch. Intern. Med. 2005, 165, 1486–1491.
  24. Blumenthal, J.A.; Sherwood, A.; Babyak, M.A.; Watkins, L.L.; Waugh, R.; Georgiades, A.; Bacon, S.L.; Hayano, J.; Coleman, R.E.; Hinderliter, A. Effects of exercise and stress management training on markers of cardiovascular risk in patients with ischemic heart disease: a randomized controlled trial. JAMA 2005, 293, 1626–1634.
  25. Kiyono, K.; Hayano, J.; Watanabe, E.; Struzik, Z.R.; Yamamoto, Y. Non-Gaussian heart rate as an independent predictor of mortality in patients with chronic heart failure. Heart Rhythm 2008, 5, 261–268.
  26. Kojima, M.; Hayano, J.; Fukuta, H.; Sakata, S.; Mukai, S.; Ohte, N.; Seno, H.; Toriyama, T.; Kawahara, H.; Furukawa, T.A.; Tokudome, S. Loss of fractal heart rate dynamics in depressive hemodialysis patients. Psychosom. Med. 2008, 70, 177–185.
  27. Mahmood, S.S.; Levy, D.; Vasan, R.S.; Wang, T.J. The Framingham Heart Study and the Epidemiology of Cardiovascular Disease: A Historical Perspective. Lancet 2014, 383, 999–1008.
  28. D'Agostino, R.B. Sr.; Vasan, R.S.; Pencina, M.J.; Wolf, P.A.; Cobain, M.; Massaro, J.M.; Kannel, W.B. General Cardiovascular Risk Profile for Use in Primary Care: The Framingham Heart Study. Circulation 2008, 117, 743–753.
  29. Andersson, C.; Nayor, M.; Tsao, C.W.; Levy, D.; Vasan, R.S. Framingham Heart Study: JACC Focus Seminar, 1/8. J. Am. Coll. Cardiol. 2021, 77, 2680–2692.
  30. Andersson, C.; Johnson, A.D.; Benjamin, E.J.; Levy, D.; Vasan, R.S. 70-Year Legacy of the Framingham Heart Study. Nat. Rev. Cardiol. 2019, 16, 687–698.
  31. Rempakos, A.; Prescott, B.; Mitchell, G.F.; Vasan, R.S.; Xanthakis, V. Association of Life's Essential 8 with Cardiovascular Disease and Mortality: The Framingham Heart Study. J. Am. Heart Assoc. 2023, 12, e030764.
  32. Cybulska, B.; Kłosiewicz-Latoszek, L. Landmark Studies in Coronary Heart Disease Epidemiology: The Framingham Heart Study after 70 Years and the Seven Countries Study after 60 Years. Kardiol. Pol. 2019, 77, 173–180.
  33. Cooper, L.L.; Mitchell, G.F. Incorporation of Novel Vascular Measures into Clinical Management: Recent Insights from the Framingham Heart Study. Curr. Hypertens. Rep. 2019, 21, 19.
  34. Graf, G.H.J.; Aiello, A.E.; Caspi, A.; Kothari, M.; Liu, H.; Moffitt, T.E.; Muennig, P.A.; Ryan, C.P.; Sugden, K.; Belsky, D.W. Educational Mobility, Pace of Aging, and Lifespan Among Participants in the Framingham Heart Study. JAMA Netw. Open 2024, 7, e240655.
  35. Ding, H.; Mandapati, A.; Hamel, A.P.; Karjadi, C.; Ang, T.F.A.; Xia, W.; Au, R.; Lin, H. Multimodal Machine Learning for 10-Year Dementia Risk Prediction: The Framingham Heart Study. J. Alzheimers Dis. 2023, 96, 277–286.
  36. Murabito, J.M. Women and Cardiovascular Disease: Contributions from the Framingham Heart Study. J. Am. Med. Womens Assoc. (1972) 1995, 50, 35–39, 55.
  37. Kahouadji, N. Comparison of Machine Learning Classification Algorithms and Application to the Framingham Heart Study. arXiv 2024, arXiv:2402.15005. Available online: https://arxiv.org/abs/2402.15005 (accessed on 23 February 2025).
  38. Keizer, S.; Zhan, Z.; Ramachandran, V.S.; van den Heuvel, E.R. Joint Modeling with Time-Dependent Treatment and Heteroskedasticity: Bayesian Analysis with Application to the Framingham Heart Study. arXiv 2019, arXiv:1912.06398. Available online: https://arxiv.org/abs/1912.06398 (accessed on 23 February 2025).
  39. Psychogyios, K.; Ilias, L.; Askounis, D. Comparison of Missing Data Imputation Methods Using the Framingham Heart Study Dataset. In Proceedings of the IEEE Conference, Location, Date. IEEE, 2022. Available online: https://ieeexplore.ieee.org/document/9926882 (accessed on 23 February 2025).
  40. Priyanka, H.U.; Vivek, R. Multi Model Data Mining Approach for Heart Failure Prediction. Int. J. Data Min. Knowl. Manag. Process 2016, 6, 31–39. Available online: https://www.aircconline.com/ijdkp/V6N5/6516ijdkp03.pdf (accessed on 23 February 2025).
  41. Hayano, J.; Yamada, A.; Yoshida, Y.; Ueda, N.; Yuda, E. Spectral Structure and Nonlinear Dynamics Properties of Long-Term Interstitial Fluid Glucose. Int. J. Biosci. Biochem. Bioinform. 2020, 10, 137–143. Available online: https://www.ijbbb.org/vol10/545-K2034.pdf (accessed on 23 February 2025).
  42. Schutte, A.E.; Kollias, A.; Stergiou, G.S. Blood Pressure and Its Variability: Classic and Novel Measurement Techniques. Nat. Rev. Cardiol. 2022, 19, 643–654.
  43. Bradley, C.K.; Shimbo, D.; Colburn, D.A.; Pugliese, D.N.; Padwal, R.; Sia, S.K.; Anstey, D.E. Cuffless Blood Pressure Devices. Am. J. Hypertens. 2022, 35, 380–387.
  44. Sagirova, Z.; Kuznetsova, N.; Gogiberidze, N.; Gognieva, D.; Suvorov, A.; Chomakhidze, P.; Omboni, S.; Saner, H.; Kopylov, P. Cuffless Blood Pressure Measurement Using a Smartphone-Case Based ECG Monitor with Photoplethysmography in Hypertensive Patients. Sensors 2021, 21, 3525.
  45. Tamura, T.; Huang, M. Cuffless Blood Pressure Monitor for Home and Hospital Use. Sensors 2025, 25, 640.
  46. Tamura, T.; Shimizu, S.; Nishimura, N.; Takeuchi, M. Long-Term Stability of Over-the-Counter Cuffless Blood Pressure Monitors: A Proposal. Health Technol. 2023, 13, 53–63.
  47. Pandit, J.A.; Lores, E.; Batlle, D. Cuffless Blood Pressure Monitoring: Promises and Challenges. Clin. J. Am. Soc. Nephrol. 2020, 15, 1531–1538.
  48. Gogiberidze, N.; Suvorov, A.; Sultygova, E.; Sagirova, Z.; Kuznetsova, N.; Gognieva, D.; Chomakhidze, P.; Frolov, V.; Bykova, A.; Mesitskaya, D.; Novikova, A.; Kondakov, D.; Volovchenko, A.; Omboni, S.; Kopylov, P. Practical Application of a New Cuffless Blood Pressure Measurement Method. Pathophysiology 2023, 30, 586–598.
  49. Rajendran, A.; Minhas, A.S.; Kazzi, B.; Varma, B.; Choi, E.; Thakkar, A.; Michos, E.D. Sex-Specific Differences in Cardiovascular Risk Factors and Implications for Cardiovascular Disease Prevention in Women. Atherosclerosis 2023, 384, 117269.
  50. Faulkner, J.L. Obesity-Associated Cardiovascular Risk in Women: Hypertension and Heart Failure. Clin. Sci. (Lond.) 2021, 135, 1523–1544.
  51. Mehta, L.S.; Velarde, G.P.; Lewey, J.; Sharma, G.; Bond, R.M.; Navas-Acien, A.; Fretts, A.M.; Magwood, G.S.; Yang, E.; Blumenthal, R.S.; Brown, R.M.; Mieres, J.H.; American Heart Association Cardiovascular Disease and Stroke in Women and Underrepresented Populations Committee of the Council on Clinical Cardiology; Council on Cardiovascular and Stroke Nursing; Council on Hypertension; Council on Lifelong Congenital Heart Disease and Heart Health in the Young; Council on Lifestyle and Cardiometabolic Health; Council on Peripheral Vascular Disease; Stroke Council. Cardiovascular Disease Risk Factors in Women: The Impact of Race and Ethnicity: A Scientific Statement from the American Heart Association. Circulation 2023, 147, 1471–1487.
  52. Kim, C. Management of Cardiovascular Risk in Perimenopausal Women with Diabetes. Diabetes Metab. J. 2021, 45, 492–501.
  53. Brown, R.M.; Tamazi, S.; Weinberg, C.R.; Dwivedi, A.; Mieres, J.H. Racial Disparities in Cardiovascular Risk and Cardiovascular Care in Women. Curr. Cardiol. Rep. 2022, 24, 1197–1208.
  54. Rodriguez de Morales, Y.A.; Abramson, B.L. Cardiovascular and Physiological Risk Factors in Women at Mid-Life and Beyond. Can. J. Physiol. Pharmacol. 2024, 102, 442–451.
Figure 1. Receiver operating characteristic (ROC) curves. The true positive rate (TPR, vertical axis) and false positive rate (FPR, horizontal axis) are calculated for each cutoff point used to distinguish survivors from non-survivors; the resulting points are plotted and connected by a line. The closer the curve is to the top left corner, which indicates a low false positive rate and a high true positive rate, the better the model's performance. Similarly, a larger area under the curve (AUC) reflects a more effective discriminative marker.
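The construction described in this caption can be made concrete with a small pure-Python sketch. The six labels and risk scores below are invented for illustration and are unrelated to the study data. The function computes one (FPR, TPR) point per cutoff, and the area under the connected points is obtained with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct cutoff, starting at (0, 0).

    labels: 1 = non-survivor (positive class), 0 = survivor.
    scores: predicted risk; a case is called positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the cutoff step by step so the curve runs from (0, 0) to (1, 1).
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example (invented values): three non-survivors, three survivors.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = auc_trapezoid(points)
print(points)
print(f"AUC = {auc:.3f}")
```

In this toy case eight of the nine possible (non-survivor, survivor) pairs are ranked correctly, and the trapezoidal area equals that fraction, which is the usual interpretation of the AUC as a ranking probability.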
Figure 2. Refined correlation plot: the correlation matrix, cleaned using principal component analysis (PCA), for 15 factors in the dataset, such as sex and age. Various risk factors have been strongly associated with the onset of CVD and increased mortality. In the Framingham Heart Study (FHS), the variable "educ" represents years of education: 1 to 11 for elementary to high school (including dropouts), 12 for high school graduates, 13 to 15 for some college (including associate degrees), 16 for university graduates (bachelor's degree), and 17 or more for graduate or professional education. This variable is used to analyze the relationship between educational level and health or cardiovascular disease risk.
Figure 3. Plots of age, glucose, BMI, and heart rate for survivors (blue) and non-survivors (orange). Age is an important factor in the Framingham dataset, but it is closely associated with other physiological factors.
Figure 4. Predicting the risk of mortality for each factor. (a) Explanation of the XGBoost (XGB) model using SHAP (SHapley Additive exPlanations); (b) bar plot; (c) summary of the effects of each feature.
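To illustrate the idea behind SHAP-style explanations, the sketch below computes exact Shapley values for a toy three-feature risk score by enumerating every feature subset. It is not the SHAP library and not the study's XGBoost model. The linear score, its weights, the baseline values, and the participant values are all invented, and only the feature names come from Table 1.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all subsets.

    value_fn maps a frozenset of feature indices (the features whose actual
    values are "known") to the model output for that coalition.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical linear risk score (all numbers are made up).
names = ["PREVCHD", "GLUCOSE", "DIABP"]
weights = [0.8, 0.01, 0.02]
baseline = [0.0, 80.0, 80.0]   # reference values used for "unknown" features
x = [1.0, 130.0, 95.0]         # one invented participant


def score(values):
    return sum(w * v for w, v in zip(weights, values))


def value_fn(subset):
    # Features in the subset take the participant's value, others the baseline.
    mixed = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return score(mixed)


phi = shapley_values(value_fn, len(x))
for name, contribution in zip(names, phi):
    print(f"{name:8s} contribution = {contribution:+.3f}")
```

For a linear score like this one, each attribution reduces to weight × (value − baseline), and the attributions sum to the difference between the participant's score and the baseline score. Tree-based explainers compute the same quantity for models such as XGBoost without enumerating every subset.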
Table 1. Categories, descriptions, and notations for each variable in FHS.
| Category | Variable Name | Description |
|---|---|---|
| Basic Information | RANDID | Random ID for individual identification |
| | SEX | Sex (1 = Male, 2 = Female) |
| | AGE | Age (years) |
| Health Status & Risk Factors | TOTCHOL | Total cholesterol (mg/dL) |
| | SYSBP | Systolic blood pressure (mmHg) |
| | DIABP | Diastolic blood pressure (mmHg) |
| | CURSMOKE | Current smoking status (1 = Yes, 0 = No) |
| | CIGPDAY | Cigarettes per day |
| | BMI | Body mass index (kg/m²) |
| | DIABETES | Diabetes (1 = Yes, 0 = No) |
| | BPMEDS | Antihypertensive medication (1 = Yes, 0 = No) |
| | HEARTRTE | Heart rate (bpm) |
| | GLUCOSE | Glucose level (mg/dL) |
| | HDLC | High-density lipoprotein cholesterol (mg/dL) |
| | LDLC | Low-density lipoprotein cholesterol (mg/dL) |
| Medical History | educ | Education level |
| | PREVCHD | Previous coronary heart disease (1 = Yes, 0 = No) |
| | PREVAP | Previous angina pectoris (1 = Yes, 0 = No) |
| | PREVMI | Previous myocardial infarction (1 = Yes, 0 = No) |
| | PREVSTRK | Previous stroke (1 = Yes, 0 = No) |
| | PREVHYP | Previous hypertension (1 = Yes, 0 = No) |
| Event Occurrence | DEATH | Death (1 = Yes, 0 = No) |
| | ANGINA | Angina occurrence (1 = Yes, 0 = No) |
| | HOSPMI | Hospitalization for myocardial infarction (1 = Yes, 0 = No) |
| | MI_FCHD | Myocardial infarction or coronary heart disease occurrence (1 = Yes, 0 = No) |
| | ANYCHD | Any coronary heart disease occurrence (1 = Yes, 0 = No) |
| | STROKE | Stroke occurrence (1 = Yes, 0 = No) |
| | CVD | Cardiovascular disease occurrence (1 = Yes, 0 = No) |
| | HYPERTEN | Hypertension occurrence (1 = Yes, 0 = No) |
| Follow-Up Period | TIME | Follow-up period (months or years) |
| | PERIOD | Study period or phase |
| | TIMEAP | Time to angina occurrence |
| | TIMEMI | Time to myocardial infarction occurrence |
| | TIMEMIFC | Time to myocardial infarction or coronary heart disease occurrence |
| | TIMECHD | Time to coronary heart disease occurrence |
| | TIMESTRK | Time to stroke occurrence |
| | TIMECVD | Time to cardiovascular disease occurrence |
| | TIMEDTH | Time to death |
| | TIMEHYP | Time to hypertension occurrence |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.