Deep Learning for COVID-19 Recognition

Pneumonia is a leading cause of death worldwide, and chest X-ray (CXR) has been one of the most important tools for diagnosing it since its introduction into clinical practice. Convolutional neural networks (CNNs) are widely used in the computer vision community, and we aim to use them to recognize and classify the CXRs of pneumonia patients, a task that is especially important during an epidemic. In this paper, we present a new residual learning framework, PEPX-Resnet, which makes use of a lightweight residual, and apply it to a CXR dataset. The results show that PEPX-Resnet is easier to optimize and achieves better results, especially for COVID-19 cases: it reaches higher accuracy, F1 score, and other evaluation metrics on the CXR dataset.


Introduction
Rapid progress has been made in computer vision through deep learning and large-scale annotated image datasets [1][2][3][4][5], and recent research shows that network depth is of great importance, because deeper networks can achieve good results on challenging recognition tasks. However, researchers found that deeper neural networks also cause problems [6], and they proposed a residual learning framework, Resnet, which solves these problems and obtains good results.
One of the most typical applications of convolutional neural networks in computer vision is face recognition. Among the most popular face recognition systems is the Deep Face recognition system presented in [4]. Datasets such as LFW and MS-Celeb-1M [7] are available for training and testing networks. These datasets contain large numbers of face images together with related information such as names and ages.
Since the discovery of X-rays, their application in clinical medicine has advanced considerably. Medical imaging has developed for over 50 years and now includes Magnetic Resonance Imaging (MRI), functional magnetic resonance imaging (fMRI), Positron Emission Computed Tomography (PET), Electroencephalogram (EEG), and so on. Convolutional neural networks have also been applied in this field [8], which is another important application of CNNs.
Pneumonia accounts for around 16% of deaths among children under 5 years old [9], and it has become the leading cause of death among children [10]. Coronavirus disease 2019 (COVID-19) threatened a large number of lives in 2020 [11][12][13] and has become a global crisis, with confirmed cases in almost every country.
In this paper, we present a new residual learning framework, PEPX-Resnet, which makes use of PEPX, a lightweight residual. This network achieves higher accuracy, precision, sensitivity and F1 score, as well as more stable results, when used to recognize CXRs. We first present the architecture of the network and then show the results of our experiments, in which we use PEPX-Resnet to recognize CXRs and diagnose COVID-19 patients. It may be useful in clinical settings, especially during outbreaks of epidemic viruses that cause pneumonia.

Related work
Image recognition has achieved great results in recent years, and various network structures have been used for face recognition tasks. Researchers have presented many approaches to recognition tasks. For example, [21] presents an approach based on a triplet loss that learns a projection that is both distinctive and compact, achieving dimensionality reduction at the same time; it directly learns a mapping from face images to a compact Euclidean space in which distances directly correspond to a measure of face similarity. [22] proposes a hybrid convolutional network (ConvNet)-Restricted Boltzmann Machine (RBM) model for face verification in wild conditions, and [23] shows that face recognition tasks can be solved well with deep learning by using both face identification and verification signals as supervision.
Along with the development of CNNs, researchers found that deeper CNNs suffer from a type of degradation, which lowers the quality of the results. A new architecture, Resnet, was proposed, which uses residuals to form a residual learning framework. With it, results become much better, and problems seen in ordinary CNNs, such as degradation, are solved well. This network is used for face recognition in [21], where it obtains better results than VGG [24] on LFW and other datasets.
Deep neural networks have also been used to solve problems in other fields. For example, some researchers use neural networks to solve clinical problems [25][26][27], which makes diagnosis easier for doctors and saves them valuable time, especially during an epidemic. These works obtain good results on their datasets. Other researchers use the models in [28,29] to detect and classify pneumonia, also with good results. CNNs have even been used to diagnose COVID-19, for example in COVID-Net [30]. Its authors proposed a new network architecture built around a new type of lightweight residual called PEPX, which combines projections and extensions. Their experiments show that COVID-Net obtains strong results, especially for COVID-19 cases.

Materials and Methods
In this section, we describe our approach to the clinical problem of recognizing CXRs. In short, we propose a new residual learning network, PEPX-Resnet, to recognize CXRs and help doctors with classification and diagnosis. We first introduce PEPX-Resnet as a whole and compare it with Resnet. Then we introduce PEPX-Resnet in detail, including its blocks and PEPX. Finally, we discuss the focal loss used in our classification problem.

PEPX-Resnet
PEPX-Resnet is composed of many blocks of the kind shown in Figure 5. The whole architecture is shown in Figure 1. The network has four main blocks, and we use 3x3 kernels in these blocks. This network is mainly inspired by Resnet (Figure 3). We use residuals in a different way and try different approaches to combining residuals with the backbone of the network. Figure 1 shows that PEPX is not only added in each block but can also be interpreted as a network of its own: the whole network can be seen as the combination of two networks whose results are added together with different weights. Moreover, the combination of blocks gives the whole network many more potential architectures, because different weights for the residuals and PEPX correspond to different architectures, some of them entirely dissimilar.
Figure 2 and Figure 3 show that Resnet adds a residual in each unit and leaves the backbone of the network unchanged. This bottleneck residual structure improves performance considerably, but it does not obtain good results on CXR datasets. Compared with Figure 1, the backbone of PEPX-Resnet is changed by a special operation, PEPX, which forms an individual network. The whole network allows one-direction communication of data between the backbone and PEPX: data can only be transferred from PEPX to the convolution layers. In addition, Figure 1 and Figure 3 show that PEPX-Resnet also has large-scale residuals, which span more convolution layers than those of Resnet.

Blocks of PEPX-Resnet
Let $Y$ denote the output of a block in a CNN; it can be written as $Y = F(X)$, where $X$ is the input of the block. Resnet introduces a new architecture that uses residuals to improve the results of the network. In this structure, the output of the previous unit is added to the result of a unit in a block, so the output of the unit can be written as $Y = F(X) + X$. The added $X$ is the residual of the unit. The structure of this unit is shown in Figure 4. We modify the residual learning framework by using two levels of residuals together with a lightweight residual, PEPX, and we give these residuals different weights when adding them to the backbone of the network. The weights are learned during training. Denoting the output again by $Y$, the process can be written as:
$$Y = G[F_1(X) + X] + A \cdot F_1(X) + B \cdot X + C \cdot P(X)$$
$A$, $B$ and $C$ in the formula are the weights of the residuals, and they are learned during training. These weights allow the network to take on a large number of different architectures. The basic structure of the block is shown in Figure 5. In Figure 5, the first two layers are called $F_1(X)$ and the last two layers are called $G(\cdot)$. The operation in the special architecture, PEPX, is called $P(X)$. We combine these operations by adding them with different weights, which are learned during training. This block therefore has many possible architectures, depending on the weights. Moreover, the combination of blocks shown in Figure 1 gives the network even more possible architectures.
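The weighted combination inside a block can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation used in the paper: the block output is read as Y = G[F1(X) + X] + A*F1(X) + B*X + C*P(X), the names `f1`, `g` and `p` are hypothetical stand-ins for the first two layers, the last two layers and PEPX, the toy functions replace real convolution layers, and the weights are fixed by hand instead of being learned.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(u: Vector, v: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [s + t for s, t in zip(u, v)]


def scale(w: float, v: Vector) -> Vector:
    """Multiply every element of v by the scalar weight w."""
    return [w * s for s in v]


def pepx_resnet_block(x: Vector, f1: Layer, g: Layer, p: Layer,
                      a: float, b: float, c: float) -> Vector:
    """Assumed reading: Y = G[F1(X) + X] + A*F1(X) + B*X + C*P(X)."""
    h = f1(x)                      # first two layers
    y = g(add(h, x))               # last two layers on top of the inner residual
    y = add(y, scale(a, h))        # weighted residual from the first two layers
    y = add(y, scale(b, x))        # weighted large-scale residual from the input
    y = add(y, scale(c, p(x)))     # weighted PEPX branch
    return y


# Toy stand-ins for the layers (hypothetical, chosen only to be easy to follow).
def toy_f1(v: Vector) -> Vector:
    return [2.0 * s for s in v]


def toy_g(v: Vector) -> Vector:
    return [s + 1.0 for s in v]


def toy_p(v: Vector) -> Vector:
    return [-s for s in v]


x = [1.0, 2.0]
y_weighted = pepx_resnet_block(x, toy_f1, toy_g, toy_p, a=0.5, b=0.25, c=1.0)
y_plain = pepx_resnet_block(x, toy_f1, toy_g, toy_p, a=0.0, b=0.0, c=0.0)
```

Setting all three weights to zero leaves only the backbone with its inner residual, which is one way to see how different learned weights correspond to different effective architectures.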

PEPX
PEPX is a type of lightweight residual, which was used in COVID-Net [30]. It consists of five steps:
1. Project to a dimension with fewer channels;
2. Extend to a dimension with more channels, where the number of channels differs from that of the input;
3. Use a 3x3 kernel to detect features;
4. Project to a dimension with the same number of channels as in step 1;
5. Extend to a dimension with more channels, where the number of channels is the number required for the output.
The steps of PEPX are shown in Figure 6. In this paper, unlike COVID-Net, we do not use depth-wise convolution to extract features in step 3, because depth-wise convolution applies an individual kernel to each channel, which means that it cannot combine features from different channels at a given location.
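The cost of replacing the depth-wise convolution in step 3 with a standard 3x3 convolution can be estimated by counting weights. The sketch below is a rough illustration only: it assumes that the projection and extension steps are 1x1 convolutions, it ignores biases, and the channel widths (64, 32, 48, 64) are hypothetical values chosen for the example, not the widths used in the paper.

```python
def pepx_weight_count(c_in: int, c_proj: int, c_ext: int, c_out: int,
                      depthwise: bool) -> int:
    """Count convolution weights for the five PEPX steps (biases ignored)."""
    step1 = c_in * c_proj              # 1x1 projection to fewer channels
    step2 = c_proj * c_ext             # 1x1 extension to more channels
    if depthwise:
        step3 = 3 * 3 * c_ext          # one 3x3 kernel per channel, no channel mixing
    else:
        step3 = 3 * 3 * c_ext * c_ext  # standard 3x3 convolution, mixes all channels
    step4 = c_ext * c_proj             # 1x1 projection back to the step-1 width
    step5 = c_proj * c_out             # 1x1 extension to the required output width
    return step1 + step2 + step3 + step4 + step5


# Hypothetical channel widths, for illustration only.
standard = pepx_weight_count(64, 32, 48, 64, depthwise=False)
depthwise = pepx_weight_count(64, 32, 48, 64, depthwise=True)
print(standard, depthwise)
```

Under these assumptions the standard 3x3 convolution dominates the weight count, which is the price paid for letting step 3 combine information across channels.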

Focal loss
Focal loss was proposed to emphasize the training of samples that are not predicted well, which improves the results. The focal loss function is shown here:
$$LS = -\alpha (1 - p)^{\gamma} \log(p)$$
In this formula, $p$ is the probability of the sample being classified correctly, which is obtained by softmax. This loss function emphasizes the training of samples with lower $p$ because of the coefficient $\alpha (1 - p)^{\gamma}$, which is higher when $p$ is lower and lower when $p$ is higher. In this way, a lower $p$ leads to a higher $LS$, which strengthens the training of these samples. The coefficients $\alpha$ and $\gamma$ have different functions: $\alpha$ addresses the imbalance between positive and negative samples in the training dataset, and $\gamma$ adjusts how strongly the incorrectly classified samples are emphasized. In our experiments, we set $\alpha = 1$ and $\gamma = 1$.
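The behaviour of the focal loss can be checked numerically with a short sketch. It assumes the standard form of the loss, -alpha * (1 - p)^gamma * log(p), with p taken from a softmax over the class scores; the function names and the example scores are hypothetical and chosen only for illustration.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Plain softmax loss: -log(p) of the true class."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits: List[float], target: int,
               alpha: float = 1.0, gamma: float = 1.0) -> float:
    """Focal loss: -alpha * (1 - p)^gamma * log(p) of the true class."""
    p = softmax(logits)[target]
    return -alpha * (1.0 - p) ** gamma * math.log(p)


# Hypothetical scores for three classes; the true class is index 0 in both cases.
easy = [4.0, 0.0, 0.0]   # true class gets a high probability
hard = [0.0, 4.0, 0.0]   # true class gets a low probability

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(name, round(ce, 4), round(fl, 4), round(fl / ce, 4))
```

The ratio of focal loss to softmax loss equals (1 - p)^gamma, so well-classified samples are down-weighted much more strongly than poorly classified ones; with gamma = 0 and alpha = 1 the focal loss reduces to the plain softmax loss.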
In this section, we introduce our dataset and then compare the results of different networks. We collected a large number of chest X-ray images, covering healthy people and patients with pneumonia, and in particular some CXRs of COVID-19 patients. We use Resnet and PEPX-Resnet to classify these images into three classes: Normal, Non-COVID, and COVID-19. Most pneumonia is caused by bacteria and viruses, and it usually manifests as an area of increased opacity in CXR, whereas the CXRs of healthy people show little opacity. However, diagnosing pneumonia from CXR is complicated, because many other conditions interfere with the diagnosis, such as pulmonary edema, bleeding, volume reduction and lung cancer, each of which can cause increased opacity in CXR (such as (d) in Figure 7). Recognizing pneumonia from CXR is therefore not an easy task, especially during an epidemic such as the outbreak of COVID-19, when the volume of cases is large. PEPX-Resnet can help doctors read these images and make the right judgments.

Dataset
We also collected CXRs of patients with COVID-19, which is threatening many people's lives [17,18,19]. Examples are shown in Figure 8. In our experiments, we use 1000 CXRs of each class for training and 100 CXRs of each class for testing. In other words, we use 3000 CXRs to train our networks and 300 CXRs to test them.

Results
We compare the results of Resnet and PEPX-Resnet in detail. Each network makes a prediction, which must be one of the three classes: Normal, Non-COVID, or COVID-19. We also compare different loss functions, namely Softmax-loss and focal loss, and the experiments show that focal loss gives better results in our work. As Figure 9 shows, the accuracy is higher when we use PEPX-Resnet, especially with focal loss. By contrast, Resnet is not stable and fluctuates considerably.
Figure 10. F1 score of different networks and loss functions.
Figure 10 shows the F1 score in our experiments: (a) is the F1 score of class Normal, (b) is the F1 score of class Non-COVID, and (c) is the F1 score of class COVID-19. PEPX-Resnet achieves a higher and more stable F1 score, especially with focal loss.
Next, we compare the networks on several metrics: accuracy, precision, sensitivity and F1 score. We also compare the results of PEPX-Resnet with those of COVID-Net, which was proposed in [30]. Table 1 compares the accuracy (the best result is in bold). Table 2, Table 3 and Table 4 compare the networks on precision, sensitivity (recall) and F1 score (the best result is in bold).

Clearly, PEPX-Resnet obtains the best results. Although other networks, including Resnet18 and COVID-Net, reach higher precision or sensitivity in some cases, PEPX-Resnet obtains the best F1 score, which evaluates a network comprehensively, in all classes.
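For reference, the per-class metrics used in this comparison can be computed from a confusion matrix as sketched below. The matrix contains made-up toy counts for three classes; these are not results from our experiments. The function names are hypothetical, and rows are assumed to hold the true class and columns the predicted class.

```python
from typing import Dict, List


def accuracy(confusion: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_metrics(confusion: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, sensitivity (recall) and F1 score for every class."""
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                         # true class k, predicted otherwise
        fp = sum(confusion[r][k] for r in range(n)) - tp    # predicted k, true class otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + sensitivity
        f1 = 2.0 * precision * sensitivity / denom if denom else 0.0
        results.append({"precision": precision,
                        "sensitivity": sensitivity,
                        "f1": f1})
    return results


# Toy counts only (rows: true class, columns: predicted class).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

toy_accuracy = accuracy(toy_confusion)
toy_metrics = per_class_metrics(toy_confusion)
for k, m in enumerate(toy_metrics):
    print(k, {name: round(value, 3) for name, value in m.items()})
```

Because the F1 score is the harmonic mean of precision and sensitivity, a network with a high value of one and a low value of the other can still have a lower F1 score than a more balanced network.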

Discussion
In this paper, we proposed PEPX-Resnet and used it to recognize and classify CXRs to help doctors handle clinical situations. By using two levels of residuals together with PEPX, an operation that processes data in different dimensions, we improve accuracy, precision, sensitivity and F1 score, and we also make them more stable, which makes the network more practical in clinical settings. We also tried Softmax-loss and focal loss; the results show that, although the accuracy did not change much, the loss was much more stable with focal loss, which helped improve some metrics such as F1 score. Finally, our experimental results show that PEPX-Resnet obtains better results than Resnet18 and COVID-Net.