ARTICLE | doi:10.20944/preprints202307.0014.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; unbalanced dataset; augmentation; multiclass classification; metrics boosting method; sota algorithm; visual transformer; ResNet; Xception; Inception
Online: 3 July 2023 (08:25:13 CEST)
One of the critical problems in multiclass classification tasks is dataset imbalance. This is especially true for contemporary pre-trained neural networks, where in practice only the last layers of the network are retrained. Large datasets with highly unbalanced classes are therefore poorly suited to model training: using such a dataset leads to overfitting and, accordingly, to poor metrics on the test and validation datasets. In this paper, the sensitivity to dataset imbalance of Xception, ViT-384, ViT-224, VGG19, ResNet34, ResNet50, ResNet101, Inception_v3, DenseNet201, DenseNet161, and DeiT was studied using a highly imbalanced dataset of 20,971 images sorted into 7 classes. It is shown that the best metrics were obtained with a cropped dataset in which the missing images in under-represented classes were supplied by augmentation, up to 15% of the initial number. With this approach, the metrics can be increased by 2-6% compared with those of the models trained on the initial unbalanced dataset. The classification metrics for the rare classes also improved significantly: the true-positive value can increase by 0.3 or more. As a result, the best approach to training the considered networks on an initially unbalanced dataset was formulated.
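The crop-then-augment idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `balance_plan`, the `cap` parameter, and the reading of "15%" as a per-class limit on augmented images relative to that class's initial size are all assumptions made for illustration.

```python
from collections import Counter

def balance_plan(labels, cap, max_aug_percent=15):
    """Return {class: (images_to_keep, images_to_augment)}.

    Assumption (not from the paper): each class is cropped to at most
    `cap` images, and a class below the cap may receive augmented
    images up to `max_aug_percent` percent of its initial size.
    """
    counts = Counter(labels)
    plan = {}
    for cls, n in counts.items():
        keep = min(n, cap)                    # crop over-represented classes
        missing = cap - keep                  # images short of the cap
        allowed = n * max_aug_percent // 100  # assumed augmentation limit
        plan[cls] = (keep, min(missing, allowed))
    return plan

# Toy label list: one frequent class and one rare class.
labels = ["frequent"] * 100 + ["rare"] * 20
print(balance_plan(labels, cap=50))
```

In this toy case the frequent class is cropped from 100 to 50 images, while the rare class keeps its 20 images and may receive at most 3 augmented ones. The resulting plan would then drive an actual cropping and augmentation step, which is outside the scope of this sketch.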
ARTICLE | doi:10.20944/preprints202208.0495.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: chronic venous disease; deep learning; data mining; Resnet50; DeiT; automatic classification; automatic CEAP classification
Online: 29 August 2022 (12:46:56 CEST)
Chronic venous disease (CVD) occurs in a substantial proportion of the world's population. Although CVD initially looks like a cosmetic defect, over time it can develop into serious problems that require surgical intervention. The aim of this work is to use deep learning (DL) methods to classify the stage of CVD automatically from an image of the patient's legs, so that patients can perform self-diagnosis. The leg images with CVD required by the DL algorithms were obtained by Internet data mining. For image preprocessing, the binary classification problem "legs - no legs" was solved using Resnet50 with an accuracy of 0.998. Applying this filter made it possible to collect a dataset of 11,118 good-quality leg images showing various stages of CVD. To classify the stages of CVD according to the CEAP classification, a multiclass classification problem was posed and solved using two neural networks with completely different architectures, Resnet50 and DeiT. The model based on DeiT, without any tuning, shows better results than the model based on Resnet50 (precision = 0.770 for DeiT versus 0.615 for Resnet50). To demonstrate the results of the work, a Telegram bot was developed in which fully functioning DL algorithms are implemented. This bot allows the condition of a patient's legs to be evaluated with fairly good accuracy for CVD classification.
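The two-stage flow described in this abstract, a "legs - no legs" filter followed by stage classification, can be sketched as follows. This is an illustrative outline only: `classify_stages`, `is_leg`, `predict_stage`, and the 0.5 threshold are hypothetical names and values, and the two callables stand in for trained models such as the Resnet50 filter and the DeiT classifier.

```python
from typing import Callable, Iterable, List, Tuple

def classify_stages(
    images: Iterable[str],
    is_leg: Callable[[str], float],      # hypothetical: returns P(image shows legs)
    predict_stage: Callable[[str], str], # hypothetical: returns a CEAP stage label
    threshold: float = 0.5,              # assumed cut-off, not from the paper
) -> List[Tuple[str, str]]:
    """Keep only images accepted by the binary filter, then classify them."""
    results = []
    for img in images:
        if is_leg(img) < threshold:
            continue                     # discard images rejected by the filter
        results.append((img, predict_stage(img)))
    return results

# Stand-in models for demonstration; real ones would run neural networks.
demo_filter = lambda name: 0.9 if name.startswith("leg") else 0.1
demo_classifier = lambda name: "C2"
print(classify_stages(["leg1.jpg", "cat.jpg", "leg2.jpg"], demo_filter, demo_classifier))
```

Keeping the filter and the classifier as separate callables mirrors the separation in the abstract, where the binary filter is used to clean the mined images before the multiclass model sees them.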