Preprint | Article | Version 1 | Preserved in Portico | This version is not peer-reviewed

Sleep Quality, Nutrients Intake and Social Development Index predict Metabolic Syndrome in the Tlalpan 2020 Cohort: A Machine Learning and Synthetic Data study

Version 1 : Received: 10 January 2024 / Approved: 10 January 2024 / Online: 11 January 2024 (11:23:53 CET)

A peer-reviewed article of this Preprint also exists.

Gutiérrez-Esparza, G.; Martinez-Garcia, M.; Ramírez-delReal, T.; Groves-Miralrio, L.E.; Marquez, M.F.; Pulido, T.; Amezcua-Guerra, L.M.; Hernández-Lemus, E. Sleep Quality, Nutrient Intake, and Social Development Index Predict Metabolic Syndrome in the Tlalpan 2020 Cohort: A Machine Learning and Synthetic Data Study. Nutrients 2024, 16, 612.

Abstract

Metabolic Syndrome (MetS) is a serious condition that significantly increases the risk of cardiovascular diseases and the severity of type 2 diabetes, and it also affects the development and progression of other chronic diseases. Predicting MetS is a complex task because the condition is multifactorial: it involves a combination of risk factors such as abdominal obesity, insulin resistance, dyslipidemia, and hypertension, and the interplay among them makes the syndrome difficult to predict. Both genetic predisposition and environmental factors also contribute to its development. MetS affects diverse populations with different ethnicities, lifestyles, and socioeconomic backgrounds, so prediction models need to account for population heterogeneity and for variations in risk factors across groups. The present study analyzed data from participants in a cohort from Mexico City to identify key risk factors in men and women while addressing the presence of unbalanced data. To tackle the issues posed by data imbalance, SMOTE and ADASYN were applied to assess significant differences in the selection of risk factors for MetS prediction. Random Forest and RPART models using ADASYN and SMOTE demonstrated better performance, achieving a balanced accuracy of approximately 87%. In women, these models highlighted sleep quality, anxiety factors, tobacco consumption, and nutritional components. In men, stronger associations were identified with the social development index and factors related to gout in parents.
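
For readers unfamiliar with the balancing step and the reported metric, the following is a minimal sketch and not the authors' pipeline. It assumes binary labels (1 = MetS, 0 = no MetS) and uses toy numbers invented purely for illustration. The helper names `smote_like_oversample` and `balanced_accuracy` are hypothetical. The first function shows only the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); ADASYN follows a similar interpolation but concentrates generation on minority samples that are harder to learn, which is not shown here. The second function computes balanced accuracy as the mean of sensitivity and specificity.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (core SMOTE idea).
    Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy minority-class samples (two made-up features per participant).
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote_like_oversample(minority, n_new=6)

# A classifier that always predicts "no MetS" on a 90/10 split is right
# 90% of the time, yet its balanced accuracy is only 0.5.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
```

The toy comparison illustrates why balanced accuracy, rather than plain accuracy, is the more informative figure when the classes are unbalanced.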

Keywords

Poor quality sleep; Social Development Index; Nutrients; Machine learning; Feature selection; Balancing methods; Mexico City; Tlalpan 2020 cohort

Subject

Medicine and Pharmacology, Cardiac and Cardiovascular Systems
