Epidemiological challenges in pandemic coronavirus disease (COVID‐19): Role of artificial intelligence

Abstract The world is now experiencing a major health calamity due to the coronavirus disease (COVID‐19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). The foremost challenge facing the scientific community is to explore the growth and transmission capability of the virus. The use of artificial intelligence (AI), such as deep learning, has attracted much attention in (i) rapid disease detection from x‐ray, computed tomography (CT), or high‐resolution CT (HRCT) images, (ii) accurate prediction of epidemic patterns and their saturation throughout the globe, (iii) forecasting the disease and its psychological impact on the population from social networking data, and (iv) prediction of drug–protein interactions for drug repurposing. In the present study, we describe the role of various AI‐based technologies for rapid and efficient detection from CT images, complementing quantitative real‐time polymerase chain reaction and immunodiagnostic assays. AI‐based technologies to anticipate the current pandemic pattern, prevent the spread of disease, and detect face masks are also discussed. We inspect how transmission of the virus depends on different factors. We investigate the deep learning technique to assess the affinity of the most probable drugs to treat COVID‐19. This article is categorized under: Application Areas > Health Care; Algorithmic Development > Biological Data Mining; Technologies > Machine Learning


Supplementary Text
Databases for epidemiological studies
Here we summarize some key characteristics of the publicly available data sets.
COVID-19 open research data set 1
The Allen Institute for AI 2, along with some leading research groups, has shared the COVID-19 Open Research Data set. This free resource comprises thousands of scholarly articles that deal with the coronavirus family. One may apply natural language processing techniques to uncover hidden relations among the various findings. Such relations may help other researchers and doctors assess the outcome of a given therapeutic procedure.
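As one illustration of how natural language processing might surface relations among findings, the following minimal sketch counts how often pairs of terms appear together in the same abstract. It is an assumption-laden toy, not a method described in the data set documentation: the stop-word list, the function names `tokenize` and `cooccurrence`, and the three example abstracts are all hypothetical and invented for this sketch.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption of this sketch).
STOP_WORDS = {"the", "and", "was", "were", "for", "with"}


def tokenize(text):
    """Lowercase the text and return its set of terms longer than two characters."""
    words = re.findall(r"[a-z0-9\-]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def cooccurrence(abstracts):
    """Count, for each unordered pair of terms, the abstracts containing both."""
    counts = Counter()
    for text in abstracts:
        # Sorting makes each pair a canonical (smaller, larger) tuple.
        terms = sorted(tokenize(text))
        counts.update(combinations(terms, 2))
    return counts


# Toy abstracts invented for illustration; they are NOT taken from CORD-19.
toy_abstracts = [
    "Fever and cough reported among hospitalized patients.",
    "Cough and fatigue reported among outpatients.",
    "Fatigue persisted after discharge.",
]

pairs = cooccurrence(toy_abstracts)
for pair, n in pairs.most_common(3):
    print(pair, n)
```

Term pairs that co-occur in many abstracts are candidates for a closer manual look; a real analysis would need proper tokenization, a domain vocabulary, and statistical controls that this sketch omits.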
WHO COVID-19 data 3
The World Health Organization (WHO) publishes and regularly updates information about affected cases around the world. The numbers of deaths and recoveries are provided, which convey how fast the coronavirus is spreading into different parts of the world. In addition, WHO has been sharing various reports on studies of candidate vaccines and several drugs.

ACAPS COVID-19 4
Here, various measures associated with the coronavirus are integrated into a single platform. The data sets cover issues such as social distancing, movement restrictions, public health measures, social and economic measures, and lockdowns. Public health and socio-economic conditions are also considered.
World Bank indicators data set 5
The World Bank has taken an initiative to share data related to COVID-19 through the Humanitarian Data Exchange (HDX). HDX is an open platform for sharing data across organizations during crises, which makes the data convenient to share and to use for analysis.
The data set can be broadly divided into three parts: information on the health status of individuals, the availability of basic handwashing facilities with soap and water for the population, and the population density with respect to a range of ages. Figures S1-S3 depict the entities, their attributes, and the relationships among them for the World Bank indicators database.

Kaggle 6
Kaggle, one of the largest data science communities in the world, has tried to involve a number of scientists in visualizing the pattern of the pandemic worldwide. In response to this emergency, it has made available the COVID-19 Open Research Dataset (CORD-19) 7, together with disease and recovery/death related information in the form of tables. Moreover, it provides a time series data set that tracks the history of a large number of patients worldwide. Figure S4 depicts the entities and their attributes for a small portion of the Kaggle database.
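To make the idea of a case time series concrete, the sketch below derives daily new cases and a simple moving average from a list of cumulative counts. The function names and the numbers are hypothetical and purely illustrative; they are not drawn from the Kaggle tables, whose actual column layout is not assumed here.

```python
def daily_new_cases(cumulative):
    """Convert cumulative counts into per-day increments."""
    return [today - yesterday
            for yesterday, today in zip(cumulative, cumulative[1:])]


def moving_average(values, window):
    """Return the trailing moving average over `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


# Illustrative cumulative counts, invented for this sketch (not real data).
toy_cumulative = [10, 14, 20, 29, 41]

new_cases = daily_new_cases(toy_cumulative)
smoothed = moving_average(new_cases, window=2)
print(new_cases)
print(smoothed)
```

Smoothing the daily increments in this way is a common first step before fitting any growth model, since reported counts often fluctuate with reporting delays.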

Genetic sequence database 8
The genetic sequence database is a compilation of all freely available annotated DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the European Nucleotide Archive (ENA), GenBank at NCBI, and the DNA Data Bank of Japan (DDBJ). These organizations exchange data among themselves regularly. The aim of the GenBank database is to provide and promote access within the scientific community to the most recent and wide-ranging DNA sequence information. The National Center for Biotechnology Information (NCBI) has recently provided a set of SARS-CoV-2 sequences, accessible in the Sequence Read Archive (SRA) and GenBank. Currently, the repository contains 183 GenBank sequences and 1 RefSeq sequence in Entrez Nucleotide and the new NCBI Virus resource, submitted from countries such as China, the Philippines, Japan, and Thailand.
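Sequences from repositories of this kind are commonly exchanged as FASTA text. As a minimal sketch of a first processing step, the code below parses FASTA-formatted text and reports the length and GC content of each record. The helper names and the two short records are hypothetical, invented for illustration; they are not SARS-CoV-2 sequences and no download from NCBI is performed.

```python
def parse_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(sequence):
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence) / len(sequence)


# Toy FASTA text invented for this sketch; not real viral sequences.
toy_fasta = """>toy_record_1 hypothetical example
ATGCGC
ATAT
>toy_record_2 hypothetical example
GGCC
"""

records = parse_fasta(toy_fasta)
for name, seq in records.items():
    print(name, len(seq), gc_content(seq))
```

For real GenBank or SRA data, an established library with full format support would be a safer choice than this hand-written parser.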

Genomic database 9
Nextstrain provides a frequently updated view of publicly available COVID-19 data. In addition, it offers a set of powerful analytic and visualization tools that support epidemiological understanding of the disease and help researchers work toward solutions. A genome database can be described as a storehouse of DNA sequences from different species of plants and animals.

Drug Database
To support prediction with different AI models, many drug-related databases have been developed to hold several types of drug-target interaction (DTI) information. Drug-related databases are also vital resources for in silico DTI prediction 1. Based on their content, these databases can be subdivided into four categories: drug-centered or target-centered databases, DTI databases, DTI affinity databases, and other supporting databases. In the class of "drug-centered or target-centered databases", seven databases are generally used: BRENDA 2, PubChem 3, SuperDRUG2 4, DrugCentral 5, PDID 6, Pharos 7, and ECOdrug 8. DTI databases, on the other hand, have been developed for collecting and validating DTIs and related information.

Figure S4: The figure illustrates the entity set of the Kaggle database on COVID-19 and its attributes, along with the relations among them.