Preprint | Article | Version 1 | Preserved in Portico | This version is not peer-reviewed

Complement Class Fine-Tuning of Naïve Bayes for Severely Imbalanced Datasets

Version 1: Received: 18 November 2022 / Approved: 23 November 2022 / Online: 23 November 2022 (07:00:22 CET)

How to cite: Alenazi, F.S.; Hindi, K.E.; AsSadhan, B. Complement Class Fine-Tuning of Naïve Bayes for Severely Imbalanced Datasets. Preprints 2022, 2022110435. https://doi.org/10.20944/preprints202211.0435.v1

Abstract

In this work, we adapt the fine-tuning algorithm of the Naïve Bayes classifier (FTNB) to make it more suitable for imbalanced datasets. In particular, we boost the probability terms of misclassified instances by an amount that is inversely related to the harmonic mean of the probability terms of the actual and predicted classes. The intuition is that, when an instance is misclassified, a discriminative attribute tends to have a small probability term in the actual class because of data scarcity, and a small term in the predicted class because of weak correlation. Conversely, if both values are relatively high, the attribute has good data coverage (support) and should not be a cause of the misclassification. Since the harmonic mean is dominated by the smaller value and the dataset is imbalanced, we apply a large update if either or both of the probability terms of the actual and predicted classes are small. We used 60 benchmark datasets, both balanced and imbalanced, to determine whether the poor performance of the NB classifier is due to data scarcity, and compared the proposed algorithm with NB, the original FTNB, and other relatively recent state-of-the-art (SOTA) ensemble classifiers for imbalanced data. Our empirical results show that the proposed algorithm significantly outperforms all other classifiers.
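
To make the role of the harmonic mean concrete, the following minimal Python sketch shows how it is dominated by the smaller of two probability terms. It is not the update rule from the paper: the function `illustrative_step`, its `learning_rate` parameter, and the linear form `learning_rate * (1 - harmonic mean)` are assumptions chosen only to show a step size that grows as the harmonic mean shrinks.

```python
def harmonic_mean(p_actual: float, p_predicted: float) -> float:
    """Harmonic mean of two probability terms; returns 0.0 if both are zero."""
    total = p_actual + p_predicted
    if total == 0.0:
        return 0.0
    return 2.0 * p_actual * p_predicted / total


def illustrative_step(p_actual: float, p_predicted: float,
                      learning_rate: float = 0.1) -> float:
    """Hypothetical step size, NOT the paper's formula.

    ASSUMPTION: the step decreases linearly as the harmonic mean of the
    actual-class and predicted-class probability terms increases.
    """
    return learning_rate * (1.0 - harmonic_mean(p_actual, p_predicted))


if __name__ == "__main__":
    # (actual-class term, predicted-class term) for one attribute value
    pairs = [(0.01, 0.90), (0.01, 0.02), (0.80, 0.90)]
    for p_a, p_p in pairs:
        hm = harmonic_mean(p_a, p_p)
        step = illustrative_step(p_a, p_p)
        print(f"actual={p_a:.2f} predicted={p_p:.2f} "
              f"harmonic_mean={hm:.4f} step={step:.4f}")
```

With these inputs, the pair (0.01, 0.90) has a harmonic mean of roughly 0.02, close to the pair (0.01, 0.02), so one small term is enough to trigger a large step, whereas the well-supported pair (0.80, 0.90) receives a much smaller one.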

Keywords

imbalanced datasets; intrusion detection; wireless sensor networks (WSNs); ransomware attack classification; machine learning

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
