Preprint Article Version 2 Preserved in Portico This version is not peer-reviewed

Fighting the COVID-19 Infodemic in News Articles and False Publications: NeoNet, a Text-based Supervised Machine Learning Algorithm

Version 1 : Received: 17 June 2021 / Approved: 18 June 2021 / Online: 18 June 2021 (14:48:31 CEST)
Version 2 : Received: 11 July 2021 / Approved: 12 July 2021 / Online: 12 July 2021 (11:58:57 CEST)
Version 3 : Received: 24 July 2021 / Approved: 26 July 2021 / Online: 26 July 2021 (12:06:04 CEST)

A peer-reviewed article of this Preprint also exists.

Abdeen, M.A.R.; Hamed, A.A.; Wu, X. Fighting the COVID-19 Infodemic in News Articles and False Publications: The NeoNet Text Classifier, a Supervised Machine Learning Algorithm. Appl. Sci. 2021, 11, 7265.

Abstract

The spread of the coronavirus pandemic has been accompanied by an infodemic. The false information embedded in the infodemic impairs people's ability to access safety information and follow proper procedures to mitigate risk. This research targets the falsehood part of the infodemic, which proliferates prominently in news articles and false medical publications. Here, we present NeoNet, a novel supervised machine learning text mining algorithm that analyzes the content of a document (a news article or a medical publication) and assigns a label to it. The algorithm is trained on TF-IDF bigram features, which contribute to a network training model. It is tested on two real-world datasets, one from the CBC news network and one of COVID-19 publications. In comparisons across five different folds, the algorithm predicted an article's label with a precision of 97-99%. NeoNet also outperformed prominent algorithms such as neural networks, SVM, and random forests. The analysis highlights the promise of NeoNet in detecting disputed online content that may contribute negatively to the COVID-19 pandemic.
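
As a rough illustration of the TF-IDF bigram features mentioned in the abstract, the sketch below computes them with the Python standard library only. It is not NeoNet and does not reproduce the authors' network training model; the function names, the whitespace tokenizer, the particular tf and idf formulas, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_bigram_features(documents):
    """Return one {bigram: weight} dict per document (hypothetical helper).

    tf  = count of the bigram in the document / number of bigrams in it
    idf = log(N / df), where df is the number of documents containing it
    """
    per_doc = [Counter(bigrams(doc)) for doc in documents]

    # Document frequency: in how many documents each bigram appears.
    doc_freq = Counter()
    for counts in per_doc:
        doc_freq.update(counts.keys())

    n_docs = len(documents)
    features = []
    for counts in per_doc:
        total = sum(counts.values()) or 1  # guard against one-word documents
        features.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return features


# Toy documents, invented for illustration only.
docs = [
    "wash your hands often",
    "wash your hands with soap",
    "miracle cure found online",
]
features = tfidf_bigram_features(docs)
```

Under this weighting, a bigram shared by several documents (such as "wash your") receives a lower weight than one unique to a single document (such as "hands often"), which is the property that makes such features useful as input to a supervised classifier.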

Keywords

COVID-19 Infodemic; Text Classification; TF-IDF Features; Network Training Models; Supervised Learning; Misinformation; News Classification; False Publications; PubMed; Anomaly Detection

Subject

Computer Science and Mathematics, Algebra and Number Theory

Comments (1)

Comment 1
Received: 12 July 2021
Commenter: Ahmed Hamed
Commenter's Conflict of Interests: Author
Comment: A new version that addressed the respected reviewers' comments