Automatic Classification Approach for Detecting COVID-19 using Deep Convolutional Neural Networks

The COVID-19 pandemic has made it even more difficult for medical specialists to identify and screen COVID-19 patients quickly. An automated diagnosis method for detecting COVID-19 cases is therefore needed to help control the spread of the virus. This paper proposes a Deep Convolutional Neural Network-based multi-classification approach (COV-MCNet) using eight pre-trained architectures (VGG16, VGG19, ResNet50V2, DenseNet201, InceptionV3, MobileNet, InceptionResNetV2, and Xception), which are trained and tested on chest X-ray images of COVID-19, Normal, Viral Pneumonia, and Bacterial Pneumonia cases. For the 3-class task (Normal vs. COVID-19 vs. Viral Pneumonia), the ResNet50V2 model provided the highest classification performance (accuracy: 95.83%, precision: 96.12%, recall: 96.11%, F1-score: 96.11%, specificity: 97.84%). For the 4-class task (Normal vs. COVID-19 vs. Viral Pneumonia vs. Bacterial Pneumonia), the pre-trained DenseNet201 model provided the highest classification performance (accuracy: 92.54%, precision: 93.05%, recall: 92.81%, F1-score: 92.83%, specificity: 97.47%). Notably, ResNet50V2 (3-class) and DenseNet201 (4-class) showed higher accuracy than the remaining models in the proposed COV-MCNet framework. This indicates that the designed system can produce promising results for detecting COVID-19 cases as more data become available. The proposed multi-classification network (COV-MCNet) can speed up the existing radiology-based workflow, which will help the medical community and clinical specialists diagnose COVID-19 cases early during this pandemic.


Introduction
The COVID-19 pandemic has triggered large-scale emergencies across the globe that constitute a public health crisis as well as humanitarian and development crises worldwide. COVID-19, the name given by the World Health Organization (WHO) to the disease caused by the 2019 novel coronavirus, is an infectious respiratory illness. COVID-19 was first identified in Wuhan city, China, at the end of December 2019. Since then, COVID-19 cases have spread across the globe, affecting almost 216 countries, infecting 30,905,162 people, and killing 958,703 (as of 21 September 2020; Worldometer, 2020). The WHO declared COVID-19 a pandemic on March 11, 2020 [1], taking into account the exponential increase in the number of daily cases and the severe impact on people's lives, health systems, and economies all over the world. Other diseases caused by coronaviruses include Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) [2]. Since COVID-19 is caused by a newly discovered member of this large family of viruses, the global community of medical experts is actively working on strategies to control the virus and find a suitable remedy for it [3]. COVID-19 is more transmissible than other viruses and, like the other coronaviruses, produces common symptoms that target the respiratory system of the infected. Figure 1 shows the rising number of COVID-19 cases worldwide as of September 21, 2020. The primary symptoms of COVID-19 patients are similar to those of seasonal flu (e.g., fever, dry cough, and shortness of breath). However, COVID-19 is far more lethal and infectious than seasonal flu; in severe cases, the infection may lead to pneumonia, kidney disorder, multi-organ failure, difficulty in breathing, and death [4]. Moreover, because this newly identified virus has no prior history of circulation in humans, people are highly vulnerable to infection.
One of the major challenges in controlling and stopping the spread of the virus is the provision of an effective testing system. Experts are therefore under even greater pressure to find a cost-effective testing system to cope with the virus.
Identifying and isolating patients infected with COVID-19 is an important step in managing this global pandemic. To control the spread, health care professionals need an appropriate screening system to determine who has come into contact with infected people. This method is called contact tracing. Real-time Reverse Transcription Polymerase Chain Reaction (RT-PCR) is currently one of the most widely used procedures for detecting COVID-19 [5]. Depending on the RT-PCR test, it may take from a few hours to 2 days to obtain the results. At the peak of the COVID-19 outbreak, many countries are facing a shortage of ventilators and RT-PCR test kits. Among several alternatives to RT-PCR screening for diagnosing COVID-19, healthcare experts and researchers are exploring the use of chest radiography imaging tests. Many researchers have reported that chest radiological imaging such as computed tomography (CT) and X-ray might be useful in the primary diagnosis of this disease [6]. In patients with COVID-19, an abnormality can show on either a chest X-ray or a CT scan, but the absence of an abnormality on either does not necessarily exclude COVID-19. Nevertheless, most experts and medical societies believe that chest radiology imaging tests can be an effective tool in detecting COVID-19.
PCR tests and CT scans are both comparatively expensive [7] and are sometimes reserved for critical patients who need more selective tests. X-ray imaging is relatively cost-effective, is commonly used for lung infection detection or segmentation, and has proven convenient for COVID-19 detection as well [8]. Several studies have proposed methods for detecting COVID-19 using chest X-ray and CT images [9][10][11][12][13][14]. Since COVID-19 attacks epithelial cells in the lungs, medical specialists can use X-ray images, which are already used to diagnose pneumonia, lung inflammation, abscesses, and other lung diseases. Almost all hospitals have X-ray imaging machines, although X-ray imaging may be difficult to access in some rural areas. Where dedicated test kits are unavailable, X-rays could be used to screen for COVID-19. A downside is that an X-ray examination needs a professional radiologist and takes a considerable amount of time, which is valuable while infected cases are still rising significantly around the world. Thus, it is essential to develop an automated method of analysis to save valuable time for medical specialists.

Previous studies have explored this direction. Fan et al. [15] proposed a novel COVID-19 Lung Infection Segmentation Deep Network (Inf-Net) for identifying infected regions in chest CT images. Waheed et al. [16] developed an Auxiliary Classifier Generative Adversarial Network (ACGAN) based model called CovidGAN to generate synthetic chest X-ray (CXR) images. Hasan et al. [17] extracted features from CT images using deep learning and a Q-deformed entropy algorithm to classify COVID-19, pneumonia, and normal cases; the features were then classified using a long short-term memory (LSTM) neural network classifier, which achieved 99.68% accuracy. Wang et al. [18] presented artificial-intelligence-based deep learning methods to extract COVID-19-specific graphical features. Panwar et al. [19] proposed nCOVnet, a DCNN-based method for detecting COVID-19 by examining patients' X-rays, and achieved a training accuracy of up to 97% using different numbers of images. Makris et al. [20] proposed a transfer learning approach based on pre-trained models for classifying COVID-19; their best two pre-trained models achieved an accuracy of 95%. Narin et al. [21] carried out a similar experiment with three different CNN models (ResNet50, InceptionV3, and InceptionResNetV2); among these, the pre-trained ResNet50 model obtained 98% accuracy for 2-class classification. So far, the results of applying deep learning to COVID-19 diagnosis have been very promising.

Dataset
This study used a total of 1140 images (240 COVID-19, 300 Normal, 300 Viral Pneumonia, and 300 Bacterial Pneumonia) to develop the multi-classification network (COV-MCNet). The COVID-19 X-ray images were sourced from the GitHub repository [22], and the remaining three datasets (normal, viral pneumonia, and bacterial pneumonia) were obtained from the Kaggle repository [23]. These datasets were used for feature extraction with the different deep learning architectures. Details of the dataset are shown in Table 1. Since this study focused primarily on the detection of COVID-19 infected cases, MERS, SARS, and ARDS images were not considered. The two datasets are examined separately with the proposed COV-MCNet models. Figure 2 shows several chest X-ray images of normal, COVID-19, viral pneumonia, and bacterial pneumonia patients.

Proposed COV-MCNet
Deep learning methods are widely used in medical data analysis for tasks such as image classification, segmentation, and skin disease detection [24,25]. This study proposes a state-of-the-art deep learning image classifier, namely COV-MCNet (multi-classification network), based on a deep convolutional neural network (CNN). COV-MCNet uses eight different pre-trained models, which are applied to 3-class and 4-class problems to classify COVID-19, normal, viral pneumonia, and bacterial pneumonia cases. The entire methodology is divided into three steps: input and pre-processing, pre-trained models, and finally the training and classification process.

The models are pre-trained on ImageNet, an image database with over 14 million images belonging to over 20 thousand categories that was created for image recognition competitions [26]. VGG16 and VGG19 [27] are convolutional neural networks that use small (3x3) convolution filters to build a deeper and more complex network. The two models differ in depth (16 and 19 weight layers, respectively). ResNet50V2 [28] is an upgraded version of ResNet50. It belongs to the family of deep residual networks, which have been built up to eight times deeper than the VGG nets. This architecture is based on skip connections, which take the activation from one layer and feed it to a later layer. InceptionV3 [29] aims to use the additional computation as efficiently as possible through suitably factorized convolutions and aggressive regularization. The model is 48 layers deep, including pooling and fully connected layers. Inception-ResNet-v2 [30] combines the Inception architecture with residual connections. This architecture is 164 layers deep. As a result, the network has learned rich feature representations for a wide range of images.
DenseNet201 (Densely Connected Convolutional Networks) [31] is a 201-layer network trained on the ImageNet dataset. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. MobileNet [32] is an efficient model for mobile and embedded vision applications. This model uses depthwise separable convolutions in a streamlined architecture to build lightweight deep neural networks. Xception [33] is a 71-layer deep convolutional neural network architecture inspired by Inception, in which the Inception modules have been replaced with depthwise separable convolutions. The network was trained on more than a million images from the ImageNet database. A schematic representation of the proposed network is shown in Figure 3.
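To illustrate why the depthwise separable convolutions used by MobileNet and Xception give lighter networks, the following minimal sketch compares weight counts for a single layer. The function names and the example layer size (3x3 kernel, 128 input channels, 256 output channels) are assumptions chosen for illustration and are not taken from the paper; biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer size, used only to show the order of magnitude of the saving.
    print(standard_conv_params(3, 128, 256))        # 294912
    print(depthwise_separable_params(3, 128, 256))  # 33920
```

For this hypothetical layer the separable form needs roughly one ninth of the weights, which is the kind of saving that makes such models attractive for mobile and embedded use.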

Input and Pre-processing steps
Since the width and height of the chest X-ray images of normal, COVID-19, viral pneumonia, and bacterial pneumonia cases vary, all images were resized to a fixed size of 224 x 224 pixels. Following that, 80% of the data were used as the training dataset and the remaining 20% were used to evaluate the trained model. Finally, to obtain decimal values between 0 and 1, we normalized the data by dividing the pixel values by 255.
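The three pre-processing steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the paper does not specify the interpolation method, whether the split was shuffled or stratified, or a random seed, so nearest-neighbour resizing, a seeded shuffle, and all function names here are hypothetical choices.

```python
import random


def resize_nearest(pixels, size=224):
    """Resize a 2-D grayscale image (list of rows) to size x size by nearest-neighbour sampling."""
    height, width = len(pixels), len(pixels[0])
    return [
        [pixels[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def normalize(pixels):
    """Scale 8-bit intensities (0..255) to floats in the range 0..1."""
    return [[value / 255.0 for value in row] for row in pixels]


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle the items (assumed) and hold out test_fraction of them for evaluation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    image = [[0, 255], [255, 0]]             # toy 2x2 image, not real X-ray data
    prepared = normalize(resize_nearest(image))
    print(len(prepared), len(prepared[0]))   # 224 224
    train, test = train_test_split(range(1140))
    print(len(train), len(test))             # 912 228
```

With the 1140 images of the 4-class dataset, an 80/20 split gives 912 training and 228 evaluation images.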

Pre-trained Models
Pre-trained models have already been trained on a large benchmark dataset and serve as a starting point for solving different problems. In this study, eight different pre-trained models (VGG16, VGG19, ResNet50V2, InceptionV3, InceptionResNetV2, DenseNet201, MobileNet, and Xception) were used for multi-classification (3-class and 4-class). The models have different convolution and pooling layers, which extract features from the images, and a classifier that categorizes the images based on the extracted features.

Training and classification process
In the final step, we fine-tuned the pre-trained models as deep learning image classifiers for detecting COVID-19, normal, viral pneumonia, and bacterial pneumonia cases. In the training and classification process, an AveragePooling2D layer with pool size (2, 2) was used in all models to compute the average of each patch of the feature map. Afterward, we flattened the activations to create a vectorized feature map and added two fully connected layers; the first contained 128 nodes, and the second contained 3 or 4 nodes for 3-class and 4-class classification, respectively. Subsequently, the activations from the second fully connected layer were fed into a softmax layer, which provided the probability of each class (normal, COVID-19, viral pneumonia, and bacterial pneumonia).
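The operations in this classification head can be sketched in plain Python. The sketch below is not the Keras code used in the study; it only mirrors the arithmetic of 2x2 average pooling, flattening, and softmax, and the helper names are hypothetical. The 7 x 7 x 512 feature-map size in the usage example is an assumption (it is what a VGG16 convolutional base produces for a 224 x 224 input); the other backbones produce different sizes.

```python
import math


def average_pool_2x2(feature_map):
    """Average each non-overlapping 2x2 patch of a 2-D feature map (odd edges are dropped)."""
    rows, cols = len(feature_map) // 2, len(feature_map[0]) // 2
    return [
        [
            (feature_map[2 * r][2 * c] + feature_map[2 * r][2 * c + 1]
             + feature_map[2 * r + 1][2 * c] + feature_map[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def flatten(feature_map):
    """Turn a 2-D feature map into a 1-D feature vector."""
    return [value for row in feature_map for value in row]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracted for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_sizes(height, width, channels, n_classes, hidden=128):
    """Layer sizes of the head: pooling -> flatten -> Dense(hidden) -> Dense(n_classes)."""
    pooled = (height // 2, width // 2, channels)
    return {
        "pooled": pooled,
        "flattened": pooled[0] * pooled[1] * pooled[2],
        "hidden": hidden,
        "output": n_classes,
    }


if __name__ == "__main__":
    print(average_pool_2x2([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]))
    print(head_sizes(7, 7, 512, n_classes=3))  # assumed VGG16-sized feature map
    print(softmax([2.0, 1.0, 0.1]))
```

Under the assumed 7 x 7 x 512 input, pooling leaves a 3 x 3 x 512 map, so the flattened vector fed to the 128-node layer has 4608 entries.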

Experimental Setup
The Python programming language was used to train the proposed COV-MCNet framework, with Jupyter Notebook as the editor for executing the code. The running environment was built using the deep learning framework TensorFlow (1.14) and the Keras package [34]. All experiments were carried out on an Intel Core i7 9700K CPU (32 GB/2 TB HDD/128 GB SSD/Windows 10 Home/4 GB Graphics) equipped with an NVIDIA GeForce RTX 2080Ti GPU. The COV-MCNet framework was trained with random initialization weights using the SGD (Stochastic Gradient Descent) optimizer. The batch size and learning rate were experimentally set to 10 and 0.0001, respectively, and the number of epochs was set to 20 for all experiments to avoid overfitting.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 23 September 2020 | doi:10.20944/preprints202009.0524.v1
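As a rough illustration of what these hyperparameters imply, the sketch below counts the weight updates per run and shows a single plain SGD step. It assumes the 80/20 split described earlier, that the 3-class task uses all 840 images of its three classes (240 + 300 + 300), and SGD without momentum; none of these details is confirmed by the paper, and the function names are hypothetical.

```python
import math

BATCH_SIZE = 10
LEARNING_RATE = 0.0001
EPOCHS = 20
TRAIN_FRACTION = 0.8  # assumed from the 80/20 split


def training_schedule(n_images):
    """Number of training images, batches per epoch, and total weight updates."""
    n_train = round(n_images * TRAIN_FRACTION)
    steps_per_epoch = math.ceil(n_train / BATCH_SIZE)
    return {
        "train_images": n_train,
        "steps_per_epoch": steps_per_epoch,
        "total_updates": steps_per_epoch * EPOCHS,
    }


def sgd_step(weights, gradients, learning_rate=LEARNING_RATE):
    """One plain SGD update (no momentum assumed): w <- w - lr * grad."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


if __name__ == "__main__":
    print(training_schedule(840))   # 3-class: 672 images, 68 steps, 1360 updates
    print(training_schedule(1140))  # 4-class: 912 images, 92 steps, 1840 updates
    print(sgd_step([1.0, -2.0], [10.0, -10.0]))
```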

Performance Metrics
To test the classification performance of pre-trained models in the COV-MCNet, the following metrics have been implemented in this study to show the classified or misclassified cases. The performance metrics are calculated based on True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) values.

Accuracy
It measures the ratio of correctly classified cases to the total number of cases in the dataset. A higher accuracy means that the model performs better. It is represented as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision
It measures the percentage of cases correctly classified as positive out of all cases predicted as positive. It is defined as follows:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall
The recall is computed as the number of positives that were correctly predicted (true positives) divided by the number of actual positives. It is calculated as follows:

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1 Score
The F1 score is calculated from the precision and recall scores and summarizes the classification capability of the model. It reaches its best value when both precision and recall are perfect. It is calculated as follows:

$$\text{F1} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
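The metrics in this section, together with specificity (defined in the next subsection as the share of actual negatives that are correctly labeled), can all be computed from the four counts TP, TN, FP, and FN. The following is a minimal sketch using the standard definitions; the function name and the counts in the usage example are hypothetical and are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall, F1, and specificity as fractions in [0, 1]."""

    def ratio(numerator, denominator):
        # Guard against empty denominators, e.g. a class that is never predicted.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts for one class, chosen only to exercise the formulas.
    for name, value in classification_metrics(tp=45, tn=40, fp=5, fn=10).items():
        print(f"{name}: {value:.4f}")
```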

Specificity
It is also called the True Negative Rate (TNR) and measures the ratio of actual negatives that are correctly labeled. It is represented as follows:

$$\text{Specificity} = \frac{TN}{TN + FP}$$

Here, TP is the number of positive cases that are correctly classified as positive; FP is the number of negative cases that are misclassified as positive; TN is the number of negative cases that are correctly classified as negative; and FN is the number of positive cases that are misclassified as negative by the proposed model.

Figure S1. Moreover, Figure 4 shows a greater variation in loss values at the beginning of training for all eight pre-trained models, which may be due to the smaller number of COVID-19 images compared with the other two classes (normal and viral pneumonia).

Several previous studies have been conducted to detect COVID-19 positive (infected), negative (normal), and pneumonia cases based on chest X-ray images. For example, Ozturk et al. [35] proposed the DarkNet architecture, a convolutional neural network-based model to detect and classify COVID-19 cases from X-ray images. DarkNet achieved 87.02% accuracy for 3-class classification. Asif and Wenhui [36] proposed a transfer learning-based deep CNN model using the Inception V3 architecture to classify COVID-19 pneumonia and reported 96% accuracy. Ioannis et al. [37] proposed a deep transfer learning-based classification approach and achieved 93.48% accuracy for three-class classification. In comparison to these studies, the ResNet50V2 model in our proposed network (COV-MCNet) showed higher accuracy than Ozturk et al. [35] and Ioannis et al. [37] and accuracy comparable to that of Asif and Wenhui [36].

3-class classification Confusion Matrix and ROC curve
The confusion matrix (CM) and the receiver operating characteristic (ROC) curve for the 3-class classification problem are shown in Figures 5 and 6, respectively. Rows of the confusion matrix correspond to the actual class, while columns represent the predicted class, and the color intensity indicates the probability of each element in a row.

Table 2 presents the performance metrics of the eight pre-trained models used in the proposed network for 3-class classification. The best classification performance was recorded by the ResNet50V2 model for each class: Normal (Precision: 93.44%, Recall: 95%, F1-Score: 94.21%, Specificity: 96.30%), COVID-19 (Precision: 100%, Recall: 100%, F1-Score: 100%, Specificity: 100%), and Viral Pneumonia (Precision: 94.92%, Recall: 93.33%, F1-Score: 94.12%, Specificity: 97.22%). ResNet50V2 is built from deep residual blocks whose skip connections bypass a few layers and pass activations directly to later layers, which may explain why it was more robust than, and superior to, the other tested models. The results (Table 3) show that the ResNet50V2 pre-trained model in the proposed network (COV-MCNet) achieved an accuracy of 95.83% in detecting COVID-19 for the 3-class task, with precision, recall, F1-score, and specificity values of 96.12%, 96.11%, 96.11%, and 97.84%, respectively.

Figure S2. Moreover, loss values exhibited a greater variation at the beginning of training for all eight pre-trained models, which may be due to the smaller number of COVID-19 images compared with the other three classes (normal, viral pneumonia, and bacterial pneumonia) (Figure 7).

To the best of our knowledge, only two previous studies address 4-class classification; they are based on CoroNet (Xception) and COVID-Net. Khan et al. [38] detected COVID-19 cases with CoroNet, which is based on the pre-trained Xception model, and reported an accuracy of 89.6%. Wang and Wong [39] proposed a deep neural network-based model, namely COVID-Net, and achieved 92.4% accuracy. In comparison to these studies, the DenseNet201 model in our proposed network (COV-MCNet) showed higher accuracy than Khan et al. [38] and accuracy comparable to that of Wang and Wong [39].

In the confusion matrix, the color intensity indicates the probability of each element in a row. The results (Figure 8) show that the pre-trained models classified COVID-19 cases better than the normal, viral pneumonia, and bacterial pneumonia classes. In addition, the ROC curve (Figure 9) plots the TPR against the FPR, which measures the classification performance at various thresholds. In Figure 9a, AUC ~ 1.00 represents COVID-19 (Class 1), AUC ~ 0.99 represents normal (Class 0), AUC ~ 0.97 represents viral pneumonia (Class 2), and AUC ~ 0.98 represents bacterial pneumonia (Class 3).

The DenseNet201 classifier uses features of all complexity levels, which tends to give smoother decision boundaries. It also has comparatively more layers (201) than the other models, and it alleviates the vanishing-gradient problem, strengthens feature propagation, and encourages feature reuse, which substantially reduces the number of parameters. These results therefore suggest that the DenseNet201 model is robust and superior to the other tested models in terms of precision, recall, F1-score, and specificity. As Table 5 shows, the DenseNet201 pre-trained model in the proposed study (COV-MCNet) gave the best results in detecting COVID-19 for the 4-class task, with accuracy, precision, recall, F1-score, and specificity of 92.54%, 93.05%, 92.81%, 92.83%, and 97.47%, respectively.
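Returning to the 3-class results, the reported ResNet50V2 figures can be cross-checked by deriving one-vs-rest metrics from a confusion matrix and macro-averaging them. The matrix below is not copied from Figure 5; it is one matrix that is consistent with the reported per-class values, assuming a 20% test split of 60 Normal, 48 COVID-19, and 60 Viral Pneumonia images. The function names are hypothetical.

```python
LABELS = ["Normal", "COVID-19", "Viral Pneumonia"]

# Rows are actual classes, columns are predicted classes.
# Assumed reconstruction, not data taken from the paper's figures.
CONFUSION = [
    [57, 0, 3],
    [0, 48, 0],
    [4, 0, 56],
]


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def per_class_metrics(matrix):
    """One-vs-rest precision, recall, F1, and specificity for every class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                       # actual k, predicted otherwise
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, actually otherwise
        tn = total - tp - fn - fp
        precision, recall = ratio(tp, tp + fp), ratio(tp, tp + fn)
        results.append({
            "precision": precision,
            "recall": recall,
            "f1": ratio(2 * precision * recall, precision + recall),
            "specificity": ratio(tn, tn + fp),
        })
    return results


def macro_average(results, key):
    """Unweighted mean of one metric over all classes."""
    return sum(r[key] for r in results) / len(results)


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


if __name__ == "__main__":
    results = per_class_metrics(CONFUSION)
    for label, r in zip(LABELS, results):
        print(label, {k: round(100 * v, 2) for k, v in r.items()})
    print("accuracy", round(100 * overall_accuracy(CONFUSION), 2))
    for key in ("precision", "recall", "f1", "specificity"):
        print("macro", key, round(100 * macro_average(results, key), 2))
```

Under these assumptions the sketch reproduces the reported 95.83% accuracy and the 96.12%, 96.11%, 96.11%, and 97.84% values, which suggests that the overall precision, recall, F1-score, and specificity are unweighted (macro) averages over the three classes.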

Summary and Conclusion
Relative to the studies in the literature, the main advantages of our study can be summarized as follows. A deep convolutional neural network-based framework, COV-MCNet, was designed with eight different pre-trained models (VGG16, VGG19, ResNet50V2, InceptionV3, InceptionResNetV2, DenseNet201, MobileNet, and Xception) for the detection of COVID-19. The study used a total of 1140 images (240 COVID-19, 300 Normal, 300 Viral Pneumonia, and 300 Bacterial Pneumonia) to develop the multi-classification network. COV-MCNet classified 3-class (Normal, COVID-19, and Viral Pneumonia) and 4-class (Normal, COVID-19, Viral Pneumonia, and Bacterial Pneumonia) cases without any separate feature extraction or selection techniques. The proposed multi-classification approach (COV-MCNet) can assist medical specialists in diagnosing diseases from X-ray images. Our proposed model achieved an accuracy of 95.83% and 92.54% for the 3-class and 4-class tasks, respectively. Moreover, although the amount of data used in this study is small, the proposed approach obtained performance that was superior or comparable to that of the other studies for both classification tasks (3-class and 4-class).
The primary limitation of the study is the shortage of COVID-19 image data used for the training of deep learning models. In the future, we intend to improve the proposed model by collecting more radiology data. Besides, we will test the designed multi-classification network (COV-MCNet) with CT images for COVID-19 detection and compare the achieved results with the proposed model which was trained using chest X-ray images.
As COVID-19 cases are still increasing daily, quick identification of COVID-19 patients can be one of the effective steps towards preventing the spread of the virus into unaffected communities. This study has therefore proposed a multi-classification approach, namely COV-MCNet, based on eight different pre-trained models (VGG16, VGG19, ResNet50V2, InceptionV3, InceptionResNetV2, DenseNet201, MobileNet, and Xception) to detect COVID-19 patients automatically. The suggested models could successfully detect COVID-19 infected cases in both the 3-class and 4-class classification tasks. In the 3-class classification, ResNet50V2 was the best model for classifying COVID-19 infected cases, with an accuracy of 95.83%, while in the 4-class classification the best model was DenseNet201, with an accuracy of 92.54%. The study achieved promising results in comparison to similar studies with small datasets, which can help medical specialists make decisions and gain deeper knowledge about COVID-19 cases. The classification performance of the method can still be improved by increasing the amount of training data. The study also still needs clinical validation, but with higher performance it can pave the way towards a modern and efficient diagnosis of COVID-19. In the future, we aim to collect more radiology images of COVID-19 from local hospitals to obtain better results with the suggested model.