Infodemiological study on COVID-19 epidemic and COVID-19 infodemic

Background: The COVID-19 infodemic that has accompanied the COVID-19 outbreak has received comparatively little attention. A global profusion of tangled monikers and hashtags has found its way into daily communication and contributed to a backlash against Chinese people. Official naming efforts against the infodemic deserve a fair share of recognition. Based on brief critical reviews of previous naming practices, we call for heuristic introspection on scientific conventions and sociocultural paradigms. Methods: We use infodemiological analysis to show that people around the globe are divided in their preference for stigmatized monikers, in both the public and scientific communities, because of perceptual bias. Results: There is no positive correlation between the degree of infection in a territory and collective perceptual bias against COVID-19. The official names “COVID-19” and “SARS-CoV-2” have not become de facto standard usage, but full-fledged official names are expected to help reverse negative perceptual bias and collective behavioral propensities amid public panic. Conclusions: As an integral component of preparedness, appropriate names should be duly assigned to the newly identified coronavirus and to the respiratory tract disease it causes in humans amid a global public health crisis.


Background
On the occasion of the Chinese Lunar New Year of 2020, the 2019 novel coronavirus disease (now known as COVID-19 [1]) was first reported from Wuhan, China, a city of 11 million people. As the COVID-19 epidemic has spread, a massive infodemic has undermined and disrupted global efforts to fight the epidemic. However, the infodemic has so far rarely received a fair share of attention in publications, and its unique risks have only been touched upon. In one of the few academic treatments [2,3], the "2019-nCoV infodemic" (hereinafter, the COVID-19 infodemic) is attributed to "the high demand for timely and trustworthy information about 2019-nCoV".
On the one hand, a global profusion of running headlines often inscribes fear, prejudice, disgust and hostility into tangled hashtags and monikers, branding discrimination and stoking panic [4][5][6]. These monikers and morbid contents reinforce each other at the epicenter of the infodemic, each shedding light on the social contagion of the other. The past few weeks have witnessed an explosive growth of stigmatized monikers, which have found their way into daily communication and contributed to a backlash against Chinese people and the Chinese diaspora.
On the other hand, without a learned name there is no rallying flag for countering disinformation campaigns. As an integral component of preparedness, appropriate names should be duly assigned to the newly identified coronavirus and to the respiratory tract disease it causes in humans, which has potential public health impact. So far, there are no universally accepted names, either for academic-industrial usage or for consistency with international virus taxonomy.
To address these pressing issues, we bring together the rich metadata available to unfold the picture of the COVID-19 epidemic and the COVID-19 infodemic in this infodemiological study. In doing so, we seek to address the following research questions: [1] What are best practices for naming new human infectious diseases? [2] To what extent did epidemics reinforce public panic? [3] What is the collective perceptual bias against COVID-19 in the public? [4] What is the collective perceptual bias against COVID-19 in the scientific sphere? [5] What are the plausible reasons behind collective perceptual bias?

Data extraction and synthesis
As of 29 February 2020, COVID-19 had spread to 60 countries and territories. Of these, the World Health Organization (WHO) published the number of cumulative cases in 54 Member States on 29 February 2020, as well as in Hong Kong, Macao and Taiwan. We retrieved the cumulative cases of three non-member states (Iceland, Azerbaijan and Monaco) from their official websites. The corresponding total populations for 2019 come from the United Nations (Department of Economic and Social Affairs, Population Division (2019). World Population Prospects 2019. Rev. 1.).
Moreover, metadata from three information sources, namely an electronic books corpus (Google Books Ngram Corpus (GBNC)), journals (Web of Science (WoS) and PubMed) and the Internet (Google Trends Index (GTI)), are retrieved and combined to facilitate subsequent analysis. GBNC is a unique linguistic landscape that benefits from centuries of development of rich grammatical and lexical resources as well as cultural contexts [7]. For instance, the temporal frequency series of "epidemic" and "panic" in the GBNC from 1800 to 2008 are retrieved to facilitate the subsequent reciprocity analysis. WoS and PubMed are databases of learned publications with rich structural metadata. GTI is a knowledge dissemination metric for the query incidence of relevant keywords and phrases. The dynamic spatiotemporal patterns of GTI are faithful mirrors of demographic perceptions and collective behavioral propensities.
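As an illustration of how a relative search index of this kind behaves, the following minimal sketch scales per-period search shares so that the peak period equals 100, which mirrors the 0 to 100 normalization that Google Trends describes for its index. Google does not release raw query counts, so the counts below and the function name `relative_interest` are hypothetical and serve only to show the arithmetic.

```python
def relative_interest(term_counts, total_counts):
    """Scale per-period search shares to a 0-100 index where the peak is 100.

    term_counts  -- hypothetical number of queries for one term in each period
    total_counts -- hypothetical number of all queries in the same periods
    """
    # Share of all searches that the term represents in each period.
    shares = [c / t if t else 0.0 for c, t in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    # Rescale so that the period with the highest share maps to 100.
    return [round(100 * s / peak) for s in shares]


# Hypothetical counts for three periods (NOT real Google data).
print(relative_interest([5, 20, 10], [1000, 1000, 2000]))
```

Because the index is relative, a value of 100 marks the period (or region) of peak popularity for a term rather than any absolute search volume, so comparisons across terms are meaningful only when the terms are queried together.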

Infodemiological study
Infodemiology is a discipline at the intersection of information science and epidemiology that addresses pressing concerns for public health and policy decisions [8,9]. In infodemiological scenarios, meta-analysis of the diachronic discourse of pertinent keywords and phrases can articulate the unfolding chronological picture since their debut on a historical time scale [10]. For example, tracking down the earliest usage of "coronavirus" and "coronaviruses" could provide an insightful and compelling basis for a rigorous historical account. The diachronic discourse of "coronavirus", "coronaviruses", "Coronaviridae" and "Nidovirales" in the English corpus could reflect the historical milestones and the status quo in the field of human coronavirus research. In the ongoing COVID-19 infodemic, stigmatized monikers are ideal indicators of negative bias, and GTI is employed to determine their population-level usage across various regions over time to characterize collective perceptual bias.
Corpus-based analysis that discovers patterns in historical context could detect a reciprocal relationship between epidemic and panic over a historical time scale [7,11,12]. The Granger Causality (GC) model is the most widely used approach to identify causal relationships in time series [13]. In the Granger causality framework, if the information in a signal X improves the prediction of a signal Y, then we say that X "Granger causes" Y. However, the GC model requires the influence of each variable to be linearly separable, so it might not apply to dynamic systems [14]. In this study, we address this methodological limitation by using convergent cross mapping (CCM), a recently developed method for detecting interactions between time series X and Y in a coupled system [14]. It follows Takens' Theorem [15] and characterizes nonlinear interactions between time series X and Y by examining the correspondence between "shadow manifolds" constructed from lagged coordinates through nonlinear state space reconstruction of the time series values of X and Y [16].
We can regard the whole social system as a complex system, and the development of the social system can be modeled as a dynamic process, expressed as $\phi$. The trajectory of social development can be expressed as a manifold in a d-dimensional space $M$. That is, $\phi: M \to M$.
Societal panic and epidemic can each be regarded as a part, or an expression, of the whole social system. Considering societal panic and epidemic in the GBNC as X and Y respectively, we can establish an observation function that maps points in $M$ to the real line $\mathbb{R}$. If the observation time length is $L$, we obtain two time series of length $L$, $X = \{X(t)\}_{t=1}^{L}$ and $Y = \{Y(t)\}_{t=1}^{L}$. According to Takens' Theorem [15], we use temporally lagged coordinates to reconstruct the phase space and obtain the topologically equivalent reconstructed manifolds
$$M_X = \{x_t\}, \quad x_t = \langle X(t), X(t-\tau), \ldots, X(t-(E-1)\tau) \rangle, \quad t = 1+(E-1)\tau, \ldots, L, \tag{1}$$
and, analogously, $M_Y = \{y_t\}$. Here, $\tau$ represents a time-lag variable and $E$ determines the embedding dimension of the manifold. CCM measures whether nearby points on $M_X$ correspond to nearby points on $M_Y$. If $Y$ "CCM causes" $X$, then information about $Y$ is encoded in the reconstructed manifold of $X$, so the state of $Y$ can be estimated from $M_X$. We use the $E+1$ nearest neighbors of $x_t$ on $M_X$ to identify the neighbors of $Y(t)$. Here, we denote the time indices of the $E+1$ nearest neighbors of $x_t$, from closest to farthest, by $t_1, t_2, \ldots, t_{E+1}$. The estimate of $Y(t)$ is then
$$\hat{Y}(t) \mid M_X = \sum_{i=1}^{E+1} w_i \, Y(t_i), \tag{2}$$
where $w_i$ is the $i$-th weight of the corresponding $E+1$ nearest neighbors of $Y(t)$, which can be calculated by
$$w_i = \frac{u_i}{\sum_{j=1}^{E+1} u_j}, \quad u_i = \exp\left(-\frac{d[x_t, x_{t_i}]}{d[x_t, x_{t_1}]}\right), \tag{3}$$
where $d[x_t, x_{t_i}]$ denotes the Euclidean distance between $x_t$ and $x_{t_i}$. By calculating the similarity ρ between the estimated and the original time series, we can estimate the strength of the CCM causality between X and Y; exchanging the roles of X and Y gives the estimate in the opposite direction. There are several ways to calculate the similarity of time series. Here we refer to the latest model [17] and use Pearson correlation to measure ρ. The closer ρ is to 1, the greater the CCM causality between X and Y is (Figure 3).
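The cross-mapping procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `shadow_manifold` and `cross_map_skill`, the defaults E=2 and tau=1, and the synthetic coupled logistic maps used as input are all chosen for the example. The sketch builds the lagged-coordinate manifold of X, estimates Y(t) from the E+1 nearest neighbors with exponentially decaying distance weights, and returns the Pearson correlation ρ between the estimate and the observed series.

```python
import numpy as np


def shadow_manifold(x, E, tau):
    """Rows are lagged vectors (X(t), X(t - tau), ..., X(t - (E - 1) * tau))."""
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau  # first time index with a complete lag history
    return np.column_stack(
        [x[start - j * tau: len(x) - j * tau] for j in range(E)]
    )


def cross_map_skill(x, y, E=2, tau=1):
    """Pearson correlation between Y and its estimate from the manifold of X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Mx = shadow_manifold(x, E, tau)
    y_obs = y[(E - 1) * tau:]          # align Y with the rows of Mx
    y_hat = np.empty(len(Mx))
    for k in range(len(Mx)):
        dist = np.linalg.norm(Mx - Mx[k], axis=1)
        dist[k] = np.inf               # a point is not its own neighbor
        nn = np.argsort(dist)[:E + 1]  # E + 1 nearest neighbors, closest first
        d1 = max(dist[nn[0]], 1e-12)   # guard against division by zero
        u = np.exp(-dist[nn] / d1)     # exponentially decaying weights
        w = u / u.sum()
        y_hat[k] = np.dot(w, y_obs[nn])
    return float(np.corrcoef(y_obs, y_hat)[0, 1])


if __name__ == "__main__":
    # Synthetic coupled logistic maps (illustrative input, not study data).
    n = 300
    x = np.empty(n)
    y = np.empty(n)
    x[0], y[0] = 0.4, 0.2
    for t in range(n - 1):
        x[t + 1] = x[t] * (3.8 - 3.8 * x[t] - 0.02 * y[t])
        y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])
    print("skill of estimating Y from M_X:", cross_map_skill(x, y))
    print("skill of estimating X from M_Y:", cross_map_skill(y, x))
```

Swapping the two arguments gives the cross-map skill in the other direction. A fuller analysis would also check that ρ increases and levels off as the series length L grows, which is the convergence property that gives the method its name.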

Unfolding history: coronavirus and human coronaviruses
Before gauging the COVID-19 epidemic and the COVID-19 infodemic, it is necessary to take a glimpse into the history of the scientific taxonomy and nomenclature of emerging viruses and infectious diseases [18]. In 1966, the International Committee on Nomenclature of Viruses (ICNV) was established with the mission of introducing some degree of order and consistency into the naming of viruses. In 1973, the ICNV became the International Committee on Taxonomy of Viruses (ICTV), a global authority on the designation and naming of viruses. Another international authoritative body, WHO, is responsible for naming new human infectious diseases. GBNC records the word frequency of "coronavirus", "coronaviruses", "Coronaviridae" and "Nidovirales" in the English corpus from 1963 to 2008. After the initial description of coronaviruses in 1968 [19], there was a mild increase in the number of printed books dealing with them, followed by several peaks after human coronavirus epidemics: SARS-CoV in 2002-2003, HCoV-NL63 in 2004 and HCoV-HKU1 in 2005. In 1971, "coronaviruses" was officially approved by ICNV [20,21]. The terms "Coronaviridae", "Nidovirales" and "Coronavirinae" were officially approved by ICTV in 1975, 1996 and 2009, respectively. However, as an early nomenclature practice, the naming history of coronaviruses (CoV) is often misjudged in the scientific community (Figure 1). In retrospect, on 16 November 1968, eight distinguished virologists proposed the term "coronaviruses" in a brief annotation in Nature [19]. In fact, the unsung virologist Anthony Peter Waterson (1923-1983) and his colleagues should be credited with coining the neologism "coronavirus" [22,23] (personal communication with Prof. Kenneth McIntosh, the only surviving author of the annotation in Nature that proposed this name [19]). In humans, seven human coronaviruses (HCoVs) are known to cause the common cold as well as more severe respiratory disease.
Of those, the human coronaviruses HCoV-229E, HCoV-NL63, HCoV-OC43 and HCoV-HKU1 are routinely responsible for mild respiratory illnesses like the common cold but can cause severe infections in immunocompromised individuals. The other three members have caused deadly outbreaks: SARS-CoV, MERS-CoV, and the newly identified coronavirus (now known as SARS-CoV-2 [24]). The diachronic discourse of "coronavirus" and "coronaviruses" in the English corpus from 1960 to 2008 shows that there was a mild increase in the number of printed books dealing with HCoVs after the initial description of coronaviruses in 1968. Thereafter, each human coronavirus epidemic (SARS-CoV in 2002-2003, HCoV-NL63 in 2004 and HCoV-HKU1 in 2005) led to a new wave of research (Figure 1).
Furthermore, meta-analysis results from WoS and PubMed indicate that knowledge remains limited in the field of combating emerging HCoVs (Figure 2). SARS-CoV-2 is the seventh identified coronavirus that can cause diseases of the respiratory tract via human-to-human transmission. The pneumonia outbreak it causes is spreading far more quickly than those caused by SARS-CoV and MERS-CoV [1,25,26], although the epicenter of the outbreak was locked down to curb the pandemic spread [27]. At present, its clinical severity is yet to be determined, although many fatal cases have occurred. In sum, the enigmatic nature of HCoVs and the many unknowns about the epidemics have put people on edge (Figure 3).

Does a virus' name really matter?
In fact, an 'inappropriate' official nomenclature might unwittingly fuel an infodemic. In recent years, humans have witnessed several outbreaks of infectious diseases caused by viruses, with common names given by stakeholders. Such naming practices have not always been successful. As a case in point, some strongly held but flawed names such as "Middle Eastern Respiratory Syndrome" [28] and "Swine flu" were accused of causing unintended negative social and economic impacts by stigmatizing certain industries or communities (Figure 2). "Swine flu", an influenza strain known to have originated in pigs, caused great financial damage to farmers, despite there being no evidence that it could be spread via pork consumption. Following these incidents, in May 2015 WHO released naming conventions for new human diseases [29].
Can we learn from history? To find the answer, we examine the popularity of the top queries about the current coronavirus epidemic via the Google Trends Index (Figure 4). These dynamic shares are faithful indicators of collective behaviors across regions over time. Currently, people around the globe are divided in their choices on the Internet and in daily communication. A striking feature is that some stigmatized monikers are used with high frequency. This finding reflects the fact that the 2019 novel coronavirus is thought to have originated in China, which has led to it being frequently named "Chinese coronavirus", "China coronavirus" or "Wuhan coronavirus". Our survey also suggests that these stigmatized names might have contributed to the recent backlash against Chinese people. Notably, substantial pattern shifts are observed after the announcements of "2019-nCoV" and "COVID-19", whereas "SARS-CoV-2" has failed to keep pace with "COVID-19". The official names "COVID-19" and "SARS-CoV-2" have not become de facto standard usage. However, full-fledged official names would help to correct ethnic stigmatization in the long run.

Collective perceptual bias against COVID-19 in the public
To further examine demographic perceptions and collective behavioral propensities in the ongoing infodemic, we characterize the relationship between geographical interest in stigmatized monikers and the cumulative rate in the 58 countries and territories in which confirmed cases of COVID-19 acute respiratory disease have been reported. The results show that people in Egypt, Greece, New Zealand, United Kingdom, United States, Canada, Finland, Russia, Philippines, Denmark, Vietnam, Nepal and Mexico are more inclined than people elsewhere to use stigmatized monikers against Chinese people (Figure 5).
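A relationship of this kind can be checked with a simple correlation between the cumulative rate and search interest. The sketch below is illustrative only. It assumes that the cumulative rate is the number of cumulative cases divided by the total population, scaled here to cases per million (the scaling is an assumption), and it uses Pearson correlation as one possible measure. The helper names `cumulative_rate` and `pearson_r` are hypothetical, and the numbers are placeholders, not data from this study.

```python
import numpy as np


def cumulative_rate(cases, population, per=1_000_000):
    """Cumulative confirmed cases per `per` inhabitants (scaling is assumed)."""
    cases = np.asarray(cases, dtype=float)
    population = np.asarray(population, dtype=float)
    return cases / population * per


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


# Placeholder values for four hypothetical territories (NOT study data).
cases = [12, 350, 40, 3]
population = [5_000_000, 60_000_000, 10_000_000, 1_000_000]
interest = [80, 35, 60, 100]  # hypothetical search interest on a 0-100 scale

rates = cumulative_rate(cases, population)
print("cases per million:", np.round(rates, 2))
print("Pearson r between rate and interest:", round(pearson_r(rates, interest), 3))
```

A coefficient near zero or below zero would be consistent with the reported absence of a positive correlation. With as few territories as in this toy example, however, no conclusion could be drawn.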
To characterize the patterns behind such collective perceptual bias, we further scrutinize geographical interest in stigmatized monikers against Chinese people over time in 13 typical territories with low cumulative rates (Figure 6). As a co-occurring perceptual phenomenon, the results corroborate that stigmatized monikers have been used with very high frequency in these territories since 16 January 2020. People in these regions hold a negative perceptual bias regarding the natural origin of COVID-19. Moreover, people hold negative perceptions of the authoritative responses in many countries [18,33-35]. The prognostic significance of our findings is that such approaches are expected to help identify, in the near future, the psychological typhoon eye effect, a paradoxical phenomenon in which respondents closer to the epicenter of the pandemic appear to be the least concerned by the imminent risks.
People hold a negative perceptual bias regarding the natural origin of COVID-19 in the 58 countries and territories with low cumulative rates (Figure 7). In a sociocultural setting whose context is complex beyond the epidemiological dimension, this panoramic mapping approach enables us to better understand and compare the prevalence and severity of the COVID-19 infodemic across regions. This finding reminds us that policy-makers should learn from best practice in reducing deliberate infodemic risks, drawing go-to resources for knowledge and expertise into the academic sphere as well as the public sphere.
What are the plausible reasons behind such collective perceptual bias? Demographically, according to Pew Research Center's latest Global Attitudes survey [36], a median of 40% in the surveyed countries have a positive view of China, compared with a median of 41% who have an unfavorable opinion. However, faced with COVID-19 acute respiratory disease and its potential public health impact, people are tormented by existential life-and-death questions and are therefore vulnerable to distorted information from the outside world. When emerging cases were reported in their countries, rumors about the cause of the epidemic flew to and fro, and nothing seemed certain or obviously right. As a case in point, the Bill & Melinda Gates Foundation, an American private foundation that has spent billions on global healthcare, has been accused of manufacturing the virus as a bioweapon, jointly with the CIA, to "wage economic war on China". Evidently, this claim is not plausible. Such disinformation campaigns remind us that authoritative organizations should work together and cultivate a well-trained cadre of professionals to mitigate infodemic risks.
Notably, from the very beginning of the COVID-19 epidemic, people in most Asia-Pacific countries, "where many more name China as a top threat" [36], have prioritized relations with China to jointly fight the COVID-19 epidemic, rather than malicious discrimination against Chinese people. Regrettably, in a minority of cases, some individuals and media outlets have set out to tarnish China's image by promoting unfounded conspiracy theories, such as the non-natural origin of COVID-19 [37], a made-in-China coronavirus (even desecration of the Chinese national flag), "China is the real Sick Man of Asia", China's Chernobyl moment, etc. Some instigators have had to face the music and make an open apology for feeding the trolls, but others are intent on whitewashing their words under the guise of freedom of speech.
Such ridiculous voices do nothing but breed fear, prejudice, disgust, xenophobia and panic [6]. Unsurprisingly, they have drawn overwhelming and harsh criticism. On 8 February, The Lancet published a statement in solidarity with Chinese professionals combating the novel coronavirus outbreak and called for a fight against infodemics [37,38]. Since then, more and more public health scientists have endorsed this statement.

Collective perceptual bias against COVID-19 in scientific community
Given that multifarious stigmatized monikers have become dominant in the public, what about the scientific sphere? It is critical to have individuals who are well versed in naming conventions and who collaborate directly with researchers on a regular basis. Unfortunately, before proper names, the antidotes to the infodemic, find their way into the public mind, debate on interim solutions has been going on.
On 12 January 2020, WHO provisionally named the 2019 novel coronavirus disease "2019-nCoV". On 7 February, China's National Health Commission (CNHC) decided to temporarily call the disease "Novel Coronavirus Pneumonia" or "NCP". This official name has provoked intense argument both inside and outside the scientific community. Firstly, Chinese scientists are divided on the name. Supporters say the descriptive name follows typical classification practices, whereas opponents claim that it could easily be misunderstood and abused to sow the seeds of panic. Secondly, the word 'novel' is confusing, because neither the disease nor the host range can be used to reliably determine the novelty of a virus. Arguably, high mutation and gene recombination rates make this type of virus ideal for pathogen evolution [39]. Once the virus has mutated, it will no longer be 'novel'.
Before that, on 3 February, the 2019 novel coronavirus was designated "WH-Human-1 coronavirus" or "Wuhan-Human-1 coronavirus" by a group of scientists in Nature [40]. In the same vein, on 11 February, another name, "HARS-CoV", with 'Han' standing for 'Wuhan in Chinese', was proposed in The Lancet [41]. Such practices go against the naming principles of WHO [42], which state that geographic locations should be avoided in disease names and that names should be short and easy to pronounce. Such names might provoke unintended negative impacts by stigmatizing Wuhan citizens and even Chinese people. Flawed notions that have taken hold should be duly corrected, as should other similar paradigms (Table 1).
In response to such concerns, on 11 February, WHO officially renamed "2019-nCoV" as "COVID-19", with 'CO' standing for 'corona', 'VI' for 'virus', 'D' for 'disease', and '19' for 2019. This generic, descriptive name offers an overdue corrective to those strongly held but flawed notions, with the hope of minimizing stigma. On the same day, coinciding with WHO's announcement, the Coronavirus Study Group of the International Committee on Taxonomy of Viruses (ICTV-CSG) proposed a new name, "Severe Acute Respiratory Syndrome coronavirus 2" or "SARS-CoV-2", in a bioRxiv preprint [43]. ICTV-CSG explains that this designation highlights the new strain's similarity to SARS-CoV [24]. It is unclear whether the ICTV will take this suggestion into consideration.
This poses a real dilemma: WHO and some prominent virologists are far less inclined towards SARS-CoV-2, the nomenclature endorsed by ICTV-CSG [44,45]. Outside the academic-industrial sphere, people have also argued against this official name. Although it seems natural for ICTV-CSG to add the numeral '2' after "SARS-CoV" to signify their relation, many prominent scientists have scrambled to refute the claim. To the untrained eye, the hasty designation may mislead the public into perceiving the virus as a more severe strain directly descended from SARS-CoV, rather than as a close relative of the causative agent of another major viral outbreak in China in 2002-03. Before that, on 5 February, Prof. Shibo Jiang and his colleagues proposed another name, "Pneumonia Acute Respiratory Syndrome Coronavirus" or "PARS-CoV", in Cellular & Molecular Immunology [46]. By the same token, this assignment also intends to retain terminology equivalent to SARS-CoV. Nonetheless, only two weeks later, without mentioning their earlier similar formulations [46,47], they introduced a third name, "HCoV-19" ("Human coronavirus 2019"), in The Lancet [45], objecting to the usage of SARS-CoV-2.
In fact, the looming worry is that the public is sensitive to the name SARS-CoV [30], which evokes memories of a higher case fatality ratio. On 9 February, Chen Huan-chun, a Chinese academician and virologist, made a public apology for mistakenly saying that 2019-nCoV is SARS-CoV, a remark that had touched a sensitive nerve and aroused great consternation among the Chinese public.
On top of mitigating infodemic risks, making an informed and judicious choice is a catch-22 for each authoritative body. It is necessary to exercise heuristic caution and continuous introspection on previous naming practices [28,30,42,44], which is a requisite bedrock of such scientific efforts. Recently, a profusion of candidate names has been discussed inside the scientific community as well as on social media, for example TARS-CoV [47] and CARS-CoV, with 'ARS' standing for 'acute respiratory syndrome', 'T' for 'transmissible', and 'C' for 'contagious'. Whatever their merits and demerits, those supported by plausible reasoning should be fairly recognized. Before academic consensus is reached, authorities should keep an open mind and appreciate modest introspection and rededication to such collective efforts. On 22 February, CNHC officially changed the temporary English name "NCP" to "COVID-19", with the hope of aligning with WHO and further discouraging the use of stigmatized titles [48].

Conclusions
With an emphasis on infodemiological analysis and meta-analysis of the COVID-19 epidemic and the COVID-19 infodemic, we scrutinize collective communication behaviors on the Internet and pertinent usages in publications within sociocultural paradigms to uncover some motivations and consequences:
[1] As an early nomenclature practice, the neologism "coronavirus" came from Anthony Peter Waterson and his colleagues. Misjudgments of its debut in textbooks risk distorting the history of science and technology and discourage us from remembering the unsung pioneers whose work provided seminal inspiration.
[2] Although psychologists often make claims about the relatedness between epidemics and panic on the basis of qualitative evidence, the quantitative results of convergent cross mapping (CCM) reciprocity analysis reveal that people are invariably vulnerable to panic during epidemics of an enigmatic nature.
[3] An infodemic follows closely on the heels of every pathogen like a never-departing shadow [18], branding discrimination and stoking panic. Full-fledged official names would duly discourage the spread of regional stigmatization and racial discrimination, and reverse negative perceptual bias and collective behavioral propensities amid public panic.
[4] People around the globe are divided in their preference for stigmatized monikers because of perceptual bias in the public and scientific communities. Perceptual bias regarding the natural origin of COVID-19, rather than the degree of infection in a territory, is part of the reason for negative behavioral propensities in specific regions.
[5] The prognostic significance of the above findings is that infodemiological analysis could provide a reference point for reframing discussions of the approaching tipping point of the psychological typhoon eye effect in the COVID-19 infodemic, as well as of the substantial patterns of the next infodemic.

Discussion
At this critical moment, an epoch-making name is expected to be scientifically pithy and socially acceptable, with the aim of minimizing unintended negative impacts on nations, economies and people. This is a positivist doctrine, not merely for naming a virus but for the vitality of science and the promotion of social progress. Evidently, some naming practices went awry. Learning the lessons of the infodemic is a pressing necessity for producing guidelines and practical principles that can lessen stigmatization and discrimination.
Technically, we see collaborative efforts as a potential way to help strengthen and standardize the ongoing international initiatives of WHO and ICTV. Admittedly, our understanding of how naming rules strengthen and enrich the integrity and quality of naming practices under their original mission remains nominal rather than substantial [28,30,42,44]. For example, as a precaution, the word 'novel' was recommended by WHO for "indicating a new pathogen of a previously known type, recognizing that this term will become obsolete if other new pathogens of that type are identified" [29]. However, stakeholders frequently reserve 'novel' for a strikingly new type of virus, lest the word fundamentally lose its impact without regular amendments.