Version 1
Received: 13 June 2022 / Approved: 14 June 2022 / Online: 14 June 2022 (09:54:46 CEST)
How to cite:
Massaro, A.; Magaletti, N.; Cosoli, G.; Giardinelli, V. O.; Leogrande, A. The Prediction of Diabetes. Preprints 2022, 2022060202. https://doi.org/10.20944/preprints202206.0202.v1
APA Style
Massaro, A., Magaletti, N., Cosoli, G., Giardinelli, V. O., & Leogrande, A. (2022). The Prediction of Diabetes. Preprints. https://doi.org/10.20944/preprints202206.0202.v1
Chicago/Turabian Style
Massaro, A., Vito O.M. Giardinelli and Angelo Leogrande. 2022. "The Prediction of Diabetes." Preprints. https://doi.org/10.20944/preprints202206.0202.v1
Abstract
This article analyzes the determinants of diabetes using a dataset of survey records for 2000 patients from the Frankfurt Hospital in Germany. The data were analyzed with the following models: Tobit, Probit, Logit, Multinomial Logit, OLS, and WLS with heteroskedasticity. The results show that the presence of diabetes is positively associated with "Pregnancies", "Glucose", "BMI", "Diabetes Pedigree Function", and "Age", and negatively associated with "Blood Pressure". A cluster analysis was performed using the fuzzy c-Means algorithm, with the number of clusters chosen by the Elbow method; three clusters were found. Finally, eight machine learning algorithms are compared to select the one that best predicts the probability that a patient will develop diabetes.
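The clustering step named in the abstract (fuzzy c-Means with the number of clusters chosen by the Elbow method) can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce their results. The data are synthetic two-dimensional points rather than the Frankfurt dataset, and the function name fuzzy_c_means, the fuzzifier m = 2, the tolerance, and the range of cluster counts are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Plain NumPy fuzzy c-means (illustrative, hypothetical helper).

    Returns (centers, memberships, objective), where memberships has one
    row per sample and each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Centers are membership-weighted means of the samples.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Fuzzy within-cluster sum of squares, used as the elbow criterion.
    objective = float(((U ** m) * d2).sum())
    return centers, U, objective


# Synthetic stand-in data: three Gaussian blobs (NOT the paper's dataset).
rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.6, size=(100, 2)) for c in blob_centers])

# Elbow method: record the objective for each candidate number of clusters
# and look for the point where further clusters stop giving large drops.
objectives = {}
for k in range(1, 7):
    _, _, objectives[k] = fuzzy_c_means(X, k)
    print(f"k={k}: objective={objectives[k]:.1f}")

# Fit with three clusters and derive hard labels from the soft memberships.
centers3, U3, _ = fuzzy_c_means(X, 3)
hard_labels = U3.argmax(axis=1)
```

Unlike k-Means, each observation receives a degree of membership in every cluster, so the argmax step is only needed when a single label per patient is wanted. On real data the features would normally be standardized first, because the distance computation is sensitive to scale.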
Keywords
Machine Learning; Clusterization; Elbow Method; Prediction; Correlation Matrix; Principal Component Analysis; Binary and non-Binary regression models
Subject
Business, Economics and Management, Economics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.