Preprint | Case Report | Version 1 | Preserved in Portico | This version is not peer-reviewed

Data Forensic Determination of the Accuracy of International COVID-19 Reporting: Using Zipf's Law for Pandemic Investigation

Version 1 : Received: 28 April 2020 / Approved: 30 April 2020 / Online: 30 April 2020 (13:49:58 CEST)

How to cite: Iorliam, A.; Ho, A.T.; Tirunagari, S.; Windridge, D. Data Forensic Determination of the Accuracy of International COVID-19 Reporting: Using Zipf's Law for Pandemic Investigation. Preprints 2020, 2020040531 (doi: 10.20944/preprints202004.0531.v1).

Abstract

Severe outbreaks of infectious disease occur throughout the world, with some reaching the level of an international pandemic; coronavirus disease (COVID-19) is the most recent to do so. Because such pandemics cause extensive loss of life, hamper industrial operations, and cause economic losses in both developing and developed countries, it is critical to establish common standards of accuracy in the determination and reporting of cases. In particular, there are current concerns that countries are hiding or incorrectly reporting cases of COVID-19. In this paper, we set out a mechanism for using Zipf's law to establish the accuracy of international reporting of COVID-19 cases by determining whether an individual country's reported confirmed, recovered, and death cases follow a power law. For confirmed cases, Uzbekistan had the highest probability of following Zipf's law (P-value), at 0.940, followed by Belize (0.929) and Qatar (0.897). For recovered cases, Iraq had the highest P-value, at 0.901, followed by New Zealand (0.888) and Austria (0.884). For death cases, Bosnia and Herzegovina had the highest P-value, at 0.874, followed by Lithuania (0.843) and Morocco (0.825). China, where the COVID-19 pandemic began, is a significant outlier, recording P-values lower than 0.1 for the confirmed, recovered, and death cases. This raises important questions, not only for China but also for any country whose data exhibit P-values below this threshold. The main application of this work is to serve as an early warning for the World Health Organization (WHO) and other health regulatory bodies to carry out further investigation in countries where COVID-19 datasets deviate significantly from Zipf's law. To this end, we also provide a tool for illustrating Zipf's law P-values on a global map in order to report anomalies.
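To make the idea of checking a series against Zipf's law concrete, the following is a minimal sketch of a rank-size fit. It is not the authors' procedure: the abstract does not say how the P-values were computed, so this sketch only fits a straight line to log(count) against log(rank) and reports the slope and R-squared. The function name `zipf_rank_fit` and the synthetic input are assumptions made for illustration.

```python
import math


def zipf_rank_fit(counts):
    """Least-squares fit of log(count) against log(rank).

    Hypothetical helper, not taken from the paper. Returns
    (slope, r_squared). A slope near -1 with r_squared near 1 is the
    classic Zipf rank-size pattern.
    """
    # Keep positive values only (log is undefined at zero) and rank them
    # from largest (rank 1) to smallest.
    values = sorted((c for c in counts if c > 0), reverse=True)
    if len(values) < 3:
        raise ValueError("need at least three positive counts")

    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products for ordinary least squares.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    # A constant series has no variance to explain; report 0.0 in that case.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r_squared


# Synthetic series that follows count = 1000 / rank exactly (an assumption,
# not real COVID-19 data).
synthetic = [1000.0 / rank for rank in range(1, 51)]
slope, r_squared = zipf_rank_fit(synthetic)
print(f"slope={slope:.3f}, R^2={r_squared:.3f}")
```

A formal goodness-of-fit P-value, such as those quoted in the abstract, would need an additional statistical test on top of this fit; which test the authors used is not stated here.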

Subject Areas

Zipf's law; COVID-19; Pandemics

Comments (1)

Comment 1
Received: 30 April 2020
Commenter: RJ Ash Jr
The commenter has declared there is no conflict of interests.
Comment: China, where the COVID-19 pandemic began, is a significant outlier in recording P-values lower than 0.1 for the confirmed, recovered, and death cases.