Article
Version 2
Preserved in Portico. This version is not peer-reviewed.
Preliminary Analysis of COVID-19 Academic Information Patterns: A Call for Open Science in the Times of Closed Borders
Version 1: Received: 30 March 2020 / Approved: 31 March 2020 / Online: 31 March 2020 (04:38:53 CEST)
Version 2: Received: 21 April 2020 / Approved: 22 April 2020 / Online: 22 April 2020 (06:15:34 CEST)
A peer-reviewed article of this Preprint also exists.
Abstract
Introduction: The Pandemic of COVID-19, an infectious disease caused by SARS-CoV-2 motivated the scientific community to work together in order to gather, organize, process and distribute data on the novel biomedical hazard. Here, we analyzed how the scientific community responded to this challenge by quantifying distribution and availability patterns of the academic information related to COVID-19. The aim of our study was to assess the quality of the information flow and scientific collaboration, two factors we believe to be critical for finding new solutions for the ongoing pandemic. Materials and methods: The RISmed R package, and a custom Python script were used to fetch metadata on articles indexed in PubMed and published on Rxiv preprint server. Scopus was manually searched and the metadata was exported in BibTex file. Publication rate and publication status, affiliation and author count per article, and submission-to-publication time were analysed in R. Biblioshiny application was used to create a world collaboration map. Results: Our preliminary data suggest that COVID-19 pandemic resulted in generation of a large amount of scientific data, and demonstrates potential problems regarding the information velocity, availability, and scientific collaboration in the early stages of the pandemic. More specifically, our results indicate precarious overload of the standard publication systems, significant problems with data availability and apparent deficient collaboration. Conclusion: In conclusion, we believe the scientific community could have used the data more efficiently in order to create proper foundations for finding new solutions for the COVID-19 pandemic. Moreover, we believe we can learn from this on the go and adopt open science principles and a more mindful approach to COVID-19-related data to accelerate the discovery of more efficient solutions. 
We take this opportunity to invite our colleagues to contribute to this global scientific collaboration by publishing their findings with maximal transparency.
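For illustration, two of the bibliometric measures described in the methods (author count per article and submission-to-publication time) can be sketched in a few lines of Python. Note that this is a minimal sketch with hypothetical placeholder records, not the authors' actual pipeline, which was written in R (RISmed, Biblioshiny) and a custom Python fetcher (see the repository below):

```python
from datetime import date

# Hypothetical metadata records; the real analysis fetched these from
# PubMed (via RISmed), a preprint server, and a Scopus BibTeX export.
records = [
    {"authors": ["A", "B", "C"], "received": date(2020, 2, 1), "published": date(2020, 2, 10)},
    {"authors": ["D", "E"],      "received": date(2020, 2, 5), "published": date(2020, 3, 1)},
]

def author_counts(recs):
    """Number of authors per article."""
    return [len(r["authors"]) for r in recs]

def submission_to_publication_days(recs):
    """Submission-to-publication lag in days, per article."""
    return [(r["published"] - r["received"]).days for r in recs]

print(author_counts(records))                    # [3, 2]
print(submission_to_publication_days(records))   # [9, 25]
```

Aggregates such as the mean lag or publications per day follow directly from these per-article lists.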
Supplementary and Associated Material
https://github.com/davorvr/covid-academic-pattern-analysis-v2: GitHub data and code repository
Keywords
COVID-19; open science; data; bibliometric; pandemic
Subject
Social Sciences, Library and Information Sciences
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Comments (3)
Commenter: Jan Homolak
Commenter's Conflict of Interests: Author
Commenter: Anne Rosemary Tate
The commenter has declared there is no conflict of interests.
https://publons.com/review/7829975/
Authors, would it be possible to modify your search tool to:
1. Identify medRxiv preprints that have uploaded an EQUATOR checklist?
2. Identify preprints that state the type of study in the title or abstract ("Indicate the study's design with a commonly used term in the title or the abstract")?
I ask because most of the authors seem to ignore both. Even authors from very reputable groups.
It would be good if we could persuade medRxiv to make these mandatory, as this would not only improve quality but would also help reviewers like me.
Commenter:
The commenter has declared there is no conflict of interests.
The EQUATOR checklists seem like an excellent tool, and indeed the [medRxiv submission page](https://www.medrxiv.org/submit-a-manuscript) states that "[a]uthors must submit the appropriate research reporting checklist as suggested by the EQUATOR network as supplementary files." From a cursory glance, few authors do, at least not as a precisely named supplement. A feature that checks for this by looking for the names of EQUATOR checklists in supplement filenames could likely be added to our tool, with inaccurate or nondescriptive supplementary-material names being a potential source of false negatives. It's an excellent suggestion, and we will definitely explore adding this functionality!
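As a rough sketch of how such a filename check might work (the checklist names and example filenames below are illustrative assumptions, not the output of our tool), one could match supplement filenames against well-known EQUATOR checklist names:

```python
import re

# A few well-known EQUATOR reporting checklists (illustrative, not exhaustive).
CHECKLIST_NAMES = ["CONSORT", "STROBE", "PRISMA", "STARD", "TRIPOD", "ARRIVE", "SPIRIT"]

def find_checklist_supplements(filenames):
    """Map each detected checklist name to the supplement filenames mentioning it.

    Nondescriptive filenames (e.g. 'supplement1.docx') are missed entirely,
    which is exactly the false-negative problem noted above.
    """
    hits = {}
    for fname in filenames:
        for name in CHECKLIST_NAMES:
            if re.search(name, fname, re.IGNORECASE):
                hits.setdefault(name, []).append(fname)
    return hits

print(find_checklist_supplements(["strobe_checklist.pdf", "supplement1.docx"]))
# {'STROBE': ['strobe_checklist.pdf']}
```

A substring match like this is deliberately crude: it trades occasional false positives (a checklist name embedded in an unrelated filename) for simplicity, and it cannot recover checklists hidden behind generic names.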
Obtaining detailed data about the studies themselves is unfortunately quite tricky. For instance, with a keyword search alone it is impossible to know whether a study mentioning western blots uses the method, explores it, suggests using it, criticises it, or something else entirely. This requires natural language processing (NLP), which tries to infer such information from context clues. However, NLP models need to be painstakingly trained and thoroughly tested for each specific purpose to achieve reasonable accuracy. Note, however, that I am far from an expert in NLP. Your second suggestion would definitely provide very useful information, but implementing it is unfortunately out of our reach for the foreseeable future.
In the end, I would like to thank you once again for your suggestions and input; they are of immense help. :)
I will conclude with a painfully relevant xkcd comic on standards: xkcd.com/927/
Best regards,
Davor