Artificial Intelligence Based Study on Analyzing of Habits and with History of Diseases of Patients for Prediction of Recurrence of Disease Due to COVID-19

A patient visits a physician when he or she feels ill. The illness need not be COVID-19; people generally consult a doctor when an ailment cannot be controlled with general medication. When a patient comes to a doctor, the doctor examines the patient after learning about the problem, and always asks some questions about the patient's daily life. For example, if a young male patient presents with fever and cough, the doctor first asks whether he smokes, and then asks whether this type of symptom has appeared often before. If the answer to both questions is yes, the first answer reveals a habit and the second suggests that he may be suffering from a serious disease or from a weather-related illness. The aim of this paper is to consider the habits of a patient together with whether he or she has been affected by a critical disease. This information is used to build a model that predicts whether the patient is likely to be affected by COVID-19. This research work contributes to tackling the pandemic caused by Coronavirus Disease 2019 (COVID-19). The occurrence of this disease depends on numerous factors, such as the past health records and habits of patients. Health-record factors that may contribute to its occurrence include a tendency to diabetes, existing cardiovascular disease, pregnancy, asthma, hypertension, pneumonia and chronic renal disease. Past lifestyle habits such as tobacco and alcohol consumption may also be analyzed. A deep learning based framework is investigated to verify the relationship between past health records, patient habits and COVID-19 occurrence. A stacked Gated Recurrent Unit (GRU) based model is proposed in this paper that identifies whether a patient can be infected by this disease or not.
The proposed predictive system is compared against existing benchmark Machine Learning classifiers such as Support Vector Machine (SVM) and Decision Tree (DT).

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 25 August 2020 | doi:10.20944/preprints202008.0542.v1 | © 2020 by the author(s). Distributed under a Creative Commons CC BY license.


Introduction
The World Health Organization (WHO) named the coronavirus disease COVID-19. The outbreak of this disease began in Wuhan, Hubei Province, China, in December 2019.
On March 11, 2020, WHO declared COVID-19 a 'Pandemic' [1]. The negative impact of tobacco use on lung health and its causal association with a plethora of respiratory diseases have been highlighted [2]. Tobacco use is detrimental to the immune system and makes the body more vulnerable to infectious diseases. Habitual alcohol consumption is likely to increase health risks. Drinking alcohol will not destroy the virus, which will still infect the mouth and throat. Some people believe that alcohol gives protection against COVID-19. In practice, a section of the people affected by COVID-19 drink alcohol because of stress in their profession. Alcohol harms the immune system and does not stimulate immunity or virus resistance [2][3]. COVID-19 symptoms are heterogeneous, ranging from mild flu-like symptoms to acute respiratory distress syndrome, multiple organ failure and death. The significant predictors of morbidity and mortality are age, diabetes and other comorbidities. The United States CDC has estimated that symptoms usually develop within 2-14 days after exposure to the virus, so fourteen days has been defined as the quarantine period [1,4].
Preventive measures must be taken by everyone against COVID-19; they are the first line of defence. The WHO guidelines for protecting yourself from COVID-19 should be followed [5]. These guidelines are:
1. Maintain a distance of at least 3 feet between yourself and anyone you are meeting or talking with.
2. Avoid close contact with sick people.
3. Work from home if possible.
4. Clean and disinfect frequently touched objects and surfaces.
5. Wash your hands often with soap and water.
6. If you are sick, stay home.
7. If you go outside your home, use a face mask and disposable gloves.
8. Take care of children without exposing them to any risk from COVID-19.
9. Avoid crowded places, since you may be exposed to the coronavirus there.
10. Avoid touching your eyes, nose, and mouth as far as possible.
11. Visit relatives and friends as little as possible.
12. Before re-entering your home, take your shoes off and leave them outside, or spray them with a disinfectant. Remove and discard the mask and disposable gloves.
13. Disinfect your hands at regular intervals, both at home and in the office.
14. Cover your cough or sneeze with a tissue, then dispose of the tissue safely.
COVID-19 has imposed these new habits on people, who must follow them until a properly tested vaccine for COVID-19 is discovered. Gated Recurrent Unit (GRU) [9] layers are stacked into a single platform as an automated tool. Note that GRU is an improved version of the RNN. Implementing this deep model requires adjusting its hyper-parameters so that performance is maximized.
This deep model is compared with traditional ML classifiers such as Decision Tree (DT) [10] and Support Vector Machine (SVM) [11].
The contributions made by this study to COVID-19 disease classification include:
• Processing healthcare and past-habit data using deep learning techniques, instead of the traditional healthcare system, to identify COVID-19 infected persons.
• Implementing a stacked Bi-GRU model that processes patient data and identifies the likelihood of COVID-19 infection. This model is adjusted by fine-tuning its hyper-parameters to maximise prediction performance. Other ML algorithms such as SVM and DT are used for comparison with the proposed implementation in detecting the possibility of COVID-19 infection.

Related Works
COVID-19 has affected the global economy and day-to-day life. It is slowing down the global economy, and people around the globe are falling sick or dying due to the spread of this disease.
Common symptoms of this viral infection are fever, cold, cough, bone pain and breathing problems, ultimately leading to pneumonia. This new viral disease is affecting humans for the first time, and vaccines have not yet been discovered. The emphasis is therefore on taking extensive precautions such as a strict hygiene protocol, social distancing and the wearing of masks. The disease is now spreading exponentially, region by region. Countries are banning gatherings of people in an attempt to slow the spread and break the exponential curve [12][13]. Lockdowns and strictly enforced quarantine are the control measures many countries use to limit the spread of this highly communicable disease. Because the virus spreads very rapidly from person to person, it is essential to identify the disease at an early stage in order to control its spread. Manufacturing has slowed down in many countries [14][15].
Various industries and sectors are affected by this disease. COVID-19 has significant knock-on effects on the daily life of citizens as well as on the global economy.
Research on human psychological behaviours and habits has been carried out in an ad hoc manner during the current pandemic [16]. Recent research has pointed to the need for mental healthcare provided by health professionals [17]. It has also emphasized implementing psychological support systems during the pandemic to ensure emotional stability. There is now a need to understand the nature of the psychological consequences stemming from this newly emerging disease; initial research efforts are primarily linked to the mental health consequences that individuals may experience as a result of isolation and quarantine. It is also necessary to identify related changes in health behaviours that may be occurring at a population level, in order to better understand the range of downstream psychosocial consequences of the recent outbreak. For large segments of the population living under conditions of isolation, modifications to lifestyle behaviours are now inevitable. Isolation changes physical activity and dietary habits, and even the incidence of domestic violence.
A substantial proportion of quarantined individuals reported distress and depression during the pandemic. Only a small number of studies have addressed the impacts on lifestyle or health behaviours, as the focus has been on the symptoms of mental health disorders.
It is equally important to bear in mind that health behaviours are strongly intertwined with mental health. More research is required on the reciprocal interactions between physical well-being, chronic disease and mortality and key health behaviours such as smoking, physical activity, reduced alcohol consumption, diet and obesity. Such behaviours have an impact on mental health. Physical activity alone has been identified as an important protective factor in reducing the risk of developing depression. It has been reported that physical activity, alcohol consumption, smoking, body mass index and regularity of social interaction are all associated with specific mental health outcomes (i.e. depression, anxiety and stress). It is important to recognise the interactional nature of human behaviours during and after COVID-19. Conditions such as diabetes, chronic obstructive pulmonary disease and hypertension were the most affected by the reduction in proper care. Routine care should continue in spite of the pandemic to avoid a rise in non-COVID-19-related morbidity and mortality, and it is important that patients with chronic diseases continue to receive care [17].

Neural Network and Hyper-parameters
Machine learning (ML) focuses on a set of algorithms that are applied to a dataset in order to infer predictions.
ML is the art of making machines perform efficiently without being explicitly programmed.
Supervised ML is a variant of the ML paradigm in which the learning algorithm accepts a set of inputs along with output labels. This means that supervised ML makes the computer discover a rule for mapping inputs to output labels [18]. Neural networks are a part of ML inspired by the working principle of the human brain. Deep learning (DL) has proven its superiority in solving complex problems. DL techniques are beneficial because their self-adaptive nature removes the need for manual feature engineering. DL typically relies on neural networks to solve complex problems. Just like the neurons in the human brain, a large number of processing elements (nodes) are present in a neural network to learn the best problem-solving strategy [19][20].
Before training this neural network, some preliminary fine-tuning of hyper-parameters is required. An activation function transforms the input of a node into a meaningful output signal. For predicting binary class probabilities, the sigmoid activation function may be used to activate the output nodes. The sigmoid [21] activation function accepts the input data and transforms it into the range 0 to 1, as shown in equation (1).
$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (1)$$
Tangent hyperbolic (tanh) [22] is another non-linear activation function. It is a smoother, zero-centered function. Its range lies between -1 and 1, and its output is given by equation (2).
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (2)$$
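The two activation functions described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample inputs are illustrative assumptions, not part of the original implementation.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x is negative, so e is in (0, 1)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent; zero-centered, with outputs between -1 and 1."""
    return math.tanh(x)


# Sample inputs chosen only for illustration.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  tanh={tanh(x):+.4f}")
```

The sigmoid output can be read as a class probability for the binary case, while tanh is symmetric around zero, which is why it is commonly used inside recurrent units.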
Dropout is employed to mitigate the over-fitting problem in neural networks. During training, it randomly detaches units, along with their incoming and outgoing connections, from the neural network. Using dropout helps neural networks achieve benchmark results in supervised classification tasks [23].
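The effect of dropout on a layer's activations can be sketched as follows. This example assumes the widely used "inverted dropout" variant, in which surviving units are rescaled during training; the paper does not state which variant it uses, and the rate, seed and activation values below are hypothetical.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout (an assumed variant): during training each unit is
    zeroed with probability `rate` and survivors are scaled by 1 / (1 - rate),
    so the expected activation stays the same. At inference it is a no-op."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)                    # fixed seed for repeatability
hidden = [0.2, 0.9, 0.5, 0.7, 0.1, 0.4]    # hypothetical hidden-layer outputs
dropped = dropout(hidden, rate=0.5, rng=rng)
print(dropped)
```

Because a different random subset of units is removed at every training step, the network cannot rely on any single unit, which is the mechanism that reduces over-fitting.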
Epoch and batch size are hyper-parameters that are also used in neural network training.
Both take integer values, which need to be chosen wisely to get the best performance from the model. The number of epochs is the number of complete passes through the training dataset: within each epoch, the entire dataset is passed forward and backward through the neural network exactly once. When the dataset is passed to the algorithm, it must be partitioned into fixed-size batches. The batch size is the number of instances processed before the model updates its internal parameters. Note that the batch size should be neither too small nor too large. Too small a batch size introduces high variance, since a small batch may not be a good representation of the entire dataset. A large batch size, in turn, may not be feasible because the batch may not fit in the memory of the compute device used for training, and it may lead to over-fitting [24].
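The relationship between dataset size, batch size and epochs can be made concrete with a short sketch. The dataset of ten items and the values of `epochs` and `batch_size` are hypothetical; the loop body stands in for a real forward and backward pass.

```python
import math


def iterate_batches(dataset, batch_size):
    """Yield consecutive batches; the last one may be smaller than batch_size."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


def count_updates(n_samples, batch_size, epochs):
    """Number of parameter updates: one per batch, repeated for every epoch."""
    return math.ceil(n_samples / batch_size) * epochs


dataset = list(range(10))      # placeholder for ten training instances
epochs, batch_size = 3, 4      # illustrative hyper-parameter values

updates = 0
for epoch in range(epochs):
    for batch in iterate_batches(dataset, batch_size):
        updates += 1           # a real model would update its weights here

print(updates, count_updates(len(dataset), batch_size, epochs))
```

With ten instances and a batch size of four, each epoch contains three batches, so three epochs give nine parameter updates.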
While stacking RNN-based layers into a single framework, employing an optimizer is necessary. Adam is a popular optimizer that is computationally efficient, has low memory requirements and is easy to implement. The algorithm performs first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. It is well accepted because it is applicable to non-stationary objectives and to problems with very noisy and/or sparse gradients [25].

Recurrent Neural Network and GRU
GRU is a gating mechanism in RNNs similar to a long short-term memory (LSTM) unit, but without an output gate. GRU is considered a variation of the LSTM because both have a similar design and produce comparable results in some cases. In a GRU, the update gate controls the information that flows into memory and the reset gate controls the information that flows out of memory. These two gate vectors decide which information is passed on to the output. A GRU can be trained to keep information from the past or to remove information that is irrelevant to the prediction. Let $x = (x_1, \ldots, x_T)$ be an input sequence, let $W$ denote the weight matrices and let $\sigma$ denote the sigmoid function.
At time $t$, the activation $h_t^j$ of the GRU depends on the previous activation $h_{t-1}^j$ and the candidate activation $\tilde{h}_t^j$. This is formulated in equation (3). The update gate ($u_t^j$) and reset gate ($r_t^j$) are formulated in equations (4) and (5), respectively.
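One GRU step, and the stacking of two GRU layers, can be sketched in plain Python. The sketch assumes the commonly used formulation $h_t = (1 - u_t) h_{t-1} + u_t \tilde{h}_t$, with $u_t = \sigma(W_u x_t + U_u h_{t-1})$, $r_t = \sigma(W_r x_t + U_r h_{t-1})$ and $\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}))$; the exact equations used in the paper may differ. The weight names, layer sizes, random initialisation and toy input sequence are all hypothetical, and bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def init_params(n_in, n_hidden, rng):
    """Small random weights. W* act on the input, U* on the previous state."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {name: mat(n_hidden, n_in if name[0] == "W" else n_hidden)
            for name in ("Wu", "Uu", "Wr", "Ur", "Wh", "Uh")}


def gru_step(x_t, h_prev, p):
    """One time step of a GRU under the assumed formulation (no biases)."""
    # Update gate: how much of the new candidate enters the state.
    u = [sigmoid(a + b) for a, b in zip(matvec(p["Wu"], x_t), matvec(p["Uu"], h_prev))]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x_t), matvec(p["Ur"], h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate activation.
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x_t), matvec(p["Uh"], gated))]
    # Interpolate between the previous state and the candidate.
    return [(1.0 - ui) * hp + ui * hc for ui, hp, hc in zip(u, h_prev, cand)]


def gru_layer(sequence, p, n_hidden):
    """Run a GRU over a sequence and return the hidden state at every step."""
    h = [0.0] * n_hidden
    outputs = []
    for x_t in sequence:
        h = gru_step(x_t, h, p)
        outputs.append(h)
    return outputs


rng = random.Random(0)
seq = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]  # toy input, 3 steps
layer1 = init_params(n_in=3, n_hidden=4, rng=rng)
layer2 = init_params(n_in=4, n_hidden=2, rng=rng)

h1 = gru_layer(seq, layer1, 4)   # first GRU layer
h2 = gru_layer(h1, layer2, 2)    # second layer consumes the first layer's outputs
final_state = h2[-1]
print(final_state)
```

Stacking simply means that the full sequence of hidden states from one layer becomes the input sequence of the next. In a classifier, the final state would typically be passed to a sigmoid output node.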

Model Evaluation
Evaluation metrics are used to distinguish the performance of one model from that of other models. Accuracy and loss must be calculated for any deep model. The loss function measures the performance of a classification model whose output is a probability value between 0 and 1 [26].
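A loss of the kind described above can be sketched as follows. The paper does not name its loss function here; binary cross-entropy is assumed because it is a common choice for a sigmoid output, and the labels and probabilities in the example are made up for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy (assumed loss) for labels in {0, 1} and
    predicted probabilities in [0, 1]. Probabilities are clipped by `eps`
    so that log(0) is never evaluated."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]                  # hypothetical ground truth
confident = [0.9, 0.1, 0.8, 0.2]       # mostly correct, confident predictions
uncertain = [0.5, 0.5, 0.5, 0.5]       # uninformative predictions

print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, uncertain))
```

Lower values indicate predicted probabilities that agree more closely with the true labels; predicting 0.5 for every sample gives a loss of ln 2.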
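Mean Squared Error (MSE) [27] is another evaluation metric; it measures the average squared difference between the predictions and the actual observations of the test samples. MSE produces a non-negative floating-point value, and a value close to 0.0 is best. It is formulated as equation (6).
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (X_i - X_i')^2 \qquad (6)$$
where $X_i$ is the actual value, $X_i'$ is the predicted value and $n$ is the number of test samples. The accuracy and F1-score metrics can be evaluated using equations (7) and (10), respectively. Note that the F1-score relies on recall and precision, which are formulated as equations (8) and (9), respectively [27].
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \qquad (7)$$
$$Recall = \frac{TP}{TP + FN} \qquad (8)$$
$$Precision = \frac{TP}{TP + FP} \qquad (9)$$
$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \qquad (10)$$
Here $TP$, $TN$, $FP$ and $FN$ denote the numbers of true positives, true negatives, false positives and false negatives, respectively.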
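The metrics discussed in this section can be computed with a few lines of Python. This is a minimal sketch using the standard definitions of MSE, accuracy, recall, precision and F1-score; the label vectors are invented for illustration and are not results from the paper.

```python
def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical model output

metrics = classification_metrics(y_true, y_pred)
print(metrics, mse(y_true, y_pred))
```

In this toy case there are three true positives, three true negatives, one false positive and one false negative, so all four classification metrics happen to coincide.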

Baseline Models
COVID-19 infection has also been predicted with baseline ML models such as SVM and DT. These baseline models are summarized as follows. SVM is advantageous in handling classification tasks because of its superior generalization performance. Based on the structural risk minimization principle, SVM minimizes the upper bound of the generalization error. It maps the input vector to a higher-dimensional space in which a maximal separating hyper-plane is constructed. To separate the data, two parallel hyper-planes are constructed, one on each side of the separating hyper-plane. The separating hyper-plane is the one that maximizes the distance between the two parallel hyper-planes. The larger the distance between these parallel hyper-planes, the better the generalization error is expected to be. When handling non-linear separation problems, the kernel function plays an important role: a low-dimensional input space can be transformed into a higher-dimensional space using an SVM kernel function.
In other words, the kernel function applies a complex data transformation that converts a non-separable problem into a separable one.
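The role of the kernel function can be illustrated with a small worked example. The sketch below assumes a degree-2 polynomial kernel on two-dimensional inputs, chosen only because its feature map is easy to write out; the paper does not state which kernel its SVM baseline uses, and the four XOR-like points are hypothetical.

```python
import math


def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel for 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """k(x, y) = (x . y)^2, evaluated without building the feature map."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# XOR-like toy data: no straight line in the input plane separates the classes.
points = {
    (1.0, 1.0): +1, (-1.0, -1.0): +1,
    (1.0, -1.0): -1, (-1.0, 1.0): -1,
}

# In the mapped space, the sign of the middle coordinate separates them.
separable = all((phi(x)[1] > 0) == (label > 0) for x, label in points.items())
print(separable)

# The kernel equals the inner product in the mapped space.
a, b = (1.0, 2.0), (3.0, -1.0)
print(poly_kernel(a, b), dot(phi(a), phi(b)))
```

The second check shows the practical benefit: the kernel gives the inner product in the higher-dimensional space directly from the original inputs, so the mapping never has to be computed explicitly.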
DT has a tree-like structure through which it learns to classify. A leaf node of the DT represents the goal variable, and the non-leaf nodes serve as decision nodes. Each decision node specifies a test whose outcomes are identified by the branches of that node. The classifier starts at the root of the tree and moves along branches until a leaf node is reached. Once implemented and trained, this model is useful for forecasting the goal based on these criteria.
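The root-to-leaf traversal described above can be sketched with a hand-written tree. The tree below is a hypothetical placeholder built only to show the mechanics; its features, branches and labels are not learned from data, are not taken from the paper, and carry no clinical meaning.

```python
def classify(node, sample):
    """Walk from the root to a leaf, following the branch selected by the
    test at each decision node, and return the leaf's label."""
    while "label" not in node:
        branch = "yes" if sample[node["feature"]] else "no"
        node = node[branch]
    return node["label"]


# Hypothetical toy tree: decision nodes test a boolean feature,
# leaf nodes hold the goal variable.
toy_tree = {
    "feature": "smoker",
    "yes": {
        "feature": "asthma",
        "yes": {"label": "positive"},
        "no": {"label": "negative"},
    },
    "no": {"label": "negative"},
}

patient = {"smoker": True, "asthma": True}   # made-up record
print(classify(toy_tree, patient))
```

A trained decision tree has the same shape, but its tests and leaf labels are chosen by the learning algorithm from the training data rather than written by hand.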

Experimental Results
Table 3 summarizes the comparison among the proposed classifier, the DT classifier and the SVM classifier. It shows that the implemented model is superior to these baseline models.

Conclusion
The healthcare industry requires real-time collection and processing of medical data. A central problem for this industry is handling data in real time for prediction and quick attention. A DL-based framework is presented in this paper for identifying COVID-19 occurrence based on influential parameters from past habits and health records. This study examines the feasibility of using GRU, a gated recurrent neural network, for COVID-19 disease classification. The accuracy of this automated tool is compared with that of the widely used SVM and DT classifier models. The enhanced accuracy of GRU can be highly beneficial for the healthcare industry.

Conflict of Interest
The authors do not have any conflict of interest.