Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Accuracy Improved Classification and Regression Tree (CART) Model: Diabetes Prediction Using Minority Over-Sampling and Particle Swarm Optimization Techniques

Version 1 : Received: 4 June 2023 / Approved: 5 June 2023 / Online: 5 June 2023 (10:10:36 CEST)

How to cite: Sukestiyarno, Y.L.; Rofif, M.Z. Accuracy Improved Classification and Regression Tree (CART) Model: Diabetes Prediction Using Minority Over-Sampling and Particle Swarm Optimization Techniques. Preprints 2023, 2023060300. https://doi.org/10.20944/preprints202306.0300.v1 Sukestiyarno, Y.L.; Rofif, M.Z. Accuracy Improved Classification and Regression Tree (CART) Model: Diabetes Prediction Using Minority Over-Sampling and Particle Swarm Optimization Techniques. Preprints 2023, 2023060300. https://doi.org/10.20944/preprints202306.0300.v1

Abstract

Diabetes is a serious health problem throughout the world, including in Indonesia. The International Diabetes Federation (IDF) reports that the number of adults with diabetes is increasing every year. The Behavioral Risk Factor Surveillance System (BRFSS) is a survey conducted by the Centers for Disease Control and Prevention (CDC) in the United States. Classification methods in data mining techniques are used to classify diabetics and non-diabetics. The data mining process is carried out by preprocessing, feature selection, and dataset classification stages. In the preprocessing stage, data cleaning, data formatting, and data oversampling are carried out using the Synthetic Minority Over-sampling Technique (SMOTE). Next, the feature selection stage is carried out using the Particle Swarm Optimization (PSO) algorithm to find the best attributes. The dataset classification stage is carried out using the CART Model Decision Tree algorithm. The results of the performance evaluation of the CART algorithm are calculated using the confusion matrix and the MAE value, the results obtained for the CART algorithm without SMOTE and PSO obtained the best accuracy of 75.34% and the MAE value of 0.2466, while the CART algorithm using SMOTE and PSO can increase accuracy by 10 .94% to 86.28% and an MAE value of 0.1372.

Keywords

CART algorithm; Accuracy; Synthetic Minority Over-sampling Technique; Particle Swarm Optimization

Subject

Computer Science and Mathematics, Applied Mathematics

Comments (0)

We encourage comments and feedback from a broad range of readers. See criteria for comments and our Diversity statement.

Leave a public comment
Send a private comment to the author(s)
* All users must log in before leaving a comment
Views 0
Downloads 0
Comments 0
Metrics 0


×
Alerts
Notify me about updates to this article or when a peer-reviewed version is published.
We use cookies on our website to ensure you get the best experience.
Read more about our cookies here.