Preprint Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Synthetic Health data Generation for Enhancement of Non-Invasive Diabetes AI-Based Prediction

Version 1 : Received: 17 August 2023 / Approved: 21 August 2023 / Online: 21 August 2023 (11:39:55 CEST)

How to cite: Cruz Castañeda, W.A.; Bertemes Filho, P. Synthetic Health data Generation for Enhancement of Non-Invasive Diabetes AI-Based Prediction. Preprints 2023, 2023081464. https://doi.org/10.20944/preprints202308.1464.v1

Abstract

Continuous glucose monitoring devices support the management of diabetes. However, when only limited data are available, one option is to enlarge the dataset by generating synthetic samples. A real dataset with 18 instances and 53 attributes was created from a homemade wearable prototype; the attributes capture capillary and venous blood glucose, oxygen concentration, pulse rate, skin temperature, and 24 moduli and 24 phases of bio-impedance. The objective of this article is to generate synthetic datasets and to investigate the ideal feature subset and the optimal model for non-invasive diabetes prediction. Gaussian copula (GC), conditional generative adversarial network (CG), variational autoencoder, and Copula-GAN techniques were used to generate five synthetic datasets. Experiments show that the GC1 and GC2 datasets stay within the min/max boundaries of the original data and are not copies of it. The multilayer perceptron regressor performed best, with (train, test) scores of 2.17, 2.51 in MAE; 9.29, 13.59 in MSE; 3.05, 3.69 in RMSE; and 0.95, 0.92 in R2 on GC1, and 2.64, 3.02 in MAE; 11.43, 15.11 in MSE; 3.38, 3.89 in RMSE; and 0.94, 0.92 in R2 on GC2, with eight features. Future work is needed to explore autoencoder and generative architectures, datasets with diverse characteristics, and the effect of the number of features.
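To make the Gaussian copula idea concrete, the following is a minimal sketch of how such a sampler could work and how the two checks mentioned in the abstract (min/max boundaries, no copies of real rows) could be run. It is not the authors' implementation: the function name `gaussian_copula_sample`, the use of empirical marginals, and the random toy data (18 rows, 3 made-up columns) are all assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats


def gaussian_copula_sample(real, n_samples, seed=0):
    """Hypothetical helper: draw synthetic rows from a Gaussian copula
    fitted to `real`, a numeric array of shape (n_rows, n_features)."""
    rng = np.random.default_rng(seed)
    n_rows, n_features = real.shape

    # 1. Map each column to normal scores through its empirical ranks.
    ranks = stats.rankdata(real, axis=0)        # ranks in 1..n_rows
    z = stats.norm.ppf(ranks / (n_rows + 1))    # keeps arguments inside (0, 1)

    # 2. Estimate the copula correlation in normal-score space.
    corr = np.corrcoef(z, rowvar=False)

    # 3. Sample correlated normals and convert them to uniforms.
    z_new = rng.multivariate_normal(np.zeros(n_features), corr, size=n_samples)
    u_new = stats.norm.cdf(z_new)

    # 4. Invert each empirical marginal by interpolating between sorted
    #    observed values, so outputs stay within the observed min/max.
    synthetic = np.empty((n_samples, n_features))
    for j in range(n_features):
        synthetic[:, j] = np.quantile(real[:, j], u_new[:, j])
    return synthetic


# Toy stand-in for a small real dataset (random numbers, NOT the paper's data).
toy_rng = np.random.default_rng(42)
toy_real = toy_rng.normal(loc=[100.0, 97.0, 33.0], scale=[15.0, 1.0, 0.5], size=(18, 3))
toy_synth = gaussian_copula_sample(toy_real, n_samples=200)

# Check 1: every synthetic value lies within the real per-column min/max.
within_bounds = bool(
    np.all(toy_synth >= toy_real.min(axis=0))
    and np.all(toy_synth <= toy_real.max(axis=0))
)

# Check 2: count synthetic rows that exactly duplicate a real row.
matches = (toy_synth[:, None, :] == toy_real[None, :, :]).all(axis=2)
n_copies = int(matches.any(axis=1).sum())
```

In this sketch the boundary property follows from step 4, because interpolating between sorted observed values cannot leave their range. A parametric marginal (for example a fitted normal) would not give that guarantee without explicit clipping.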

Keywords

synthetic generation; wearables health data; non-invasive diabetes prediction

Subject

Engineering, Electrical and Electronic Engineering
