Preprint Article Version 1 This version is not peer-reviewed

Preliminary Analysis of COVID-19 Academic Information Patterns: A Call for Open Science in the Times of Closed Borders

Version 1 : Received: 30 March 2020 / Approved: 31 March 2020 / Online: 31 March 2020 (04:38:53 CEST)
Version 2 : Received: 21 April 2020 / Approved: 22 April 2020 / Online: 22 April 2020 (06:15:34 CEST)

A peer-reviewed article of this Preprint also exists.

Journal reference: Scientometrics 2020
DOI: 10.1007/s11192-020-03587-2


Introduction: The pandemic of COVID-19, an infectious disease caused by SARS-CoV-2, motivated the scientific community to work together to gather, organize, process, and distribute data on the novel biomedical hazard. Here, we analyzed how the scientific community responded to this challenge by quantifying distribution and availability patterns of the academic information related to COVID-19. The aim of our study was to assess the quality of the information flow and scientific collaboration, two factors we believe to be critical for finding new solutions for the ongoing pandemic.

Materials and Methods: The RISmed R package and a custom Python script were used to fetch metadata on articles indexed in PubMed and published on the rXiv preprint server. Scopus was searched manually and the metadata were exported as a BibTeX file. Publication rate and publication status, affiliation and author count per article, and submission-to-publication time were analysed in R. The Biblioshiny application was used to create a world collaboration map.

Results: Our preliminary data suggest that the COVID-19 pandemic resulted in the generation of a large amount of scientific data, and demonstrate potential problems regarding information velocity, availability, and scientific collaboration in the early stages of the pandemic. More specifically, our results indicate a precarious overload of the standard publication systems, delayed adoption of preprint publishing, significant problems with data availability, and apparently deficient collaboration.

Conclusion: We believe the scientific community could have used the data more efficiently to create proper foundations for finding new solutions for the COVID-19 pandemic. Moreover, we believe we can learn from this on the go and adopt open science principles and a more mindful approach to COVID-19-related data to accelerate the discovery of more efficient solutions.
We take this opportunity to invite our colleagues to contribute to this global scientific collaboration by publishing their findings with maximal transparency.
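As an illustration of the submission-to-publication time mentioned in the Materials and Methods, the following is a minimal Python sketch, not the authors' script (their analysis was done in R). The record layout, the field names `received` and `published`, and the dates are hypothetical placeholders rather than data from the study.

```python
from datetime import date
from statistics import median

# Hypothetical records: field names and dates are illustrative only,
# not taken from the study's dataset.
records = [
    {"id": "a", "received": "2020-01-20", "published": "2020-01-30"},
    {"id": "b", "received": "2020-02-01", "published": "2020-02-05"},
    {"id": "c", "received": "2020-02-10", "published": None},  # no publication date
]


def delay_days(record):
    """Return days from submission to publication, or None if a date is missing."""
    if not record.get("received") or not record.get("published"):
        return None
    received = date.fromisoformat(record["received"])
    published = date.fromisoformat(record["published"])
    return (published - received).days


# Keep only records where both dates are available.
delays = [d for d in map(delay_days, records) if d is not None]
median_delay = median(delays)
print(delays, median_delay)
```

Records with a missing date are skipped rather than imputed, so the median reflects only articles for which both dates are known.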

Subject Areas

COVID-19; open science; data; bibliometric; pandemic

Comments (3)

Comment 1
Received: 9 April 2020
Commenter: Anne Rosemary Tate
The commenter has declared there is no conflict of interests.
Comment: I totally agree with the sentiments of this article and plan to submit a review on Outbreak.

I have been reviewing some of the preprints on medRxiv. Many of these seem to have been hastily put together, and often the conclusions are not supported by the methods and results. The Wellcome Trust Outbreak preview initiative is an excellent idea. However, so many articles are posted every day that it is difficult to know which to choose. I have asked Outbreak to include a question on whether the appropriate EQUATOR checklist has been uploaded and correctly filled in. This could quickly filter out some of the poorer studies.
I appreciate your transparency and your providing access to the data. However, I could not open the data file. My computer recognised it as an R file, but I couldn't open it in R. I do use R occasionally, but many of your readers will not. Could you please upload the data as plain text (CSV or a similar format)?
Many thanks
Rosemary Tate
Response 1 to Comment 1
Received: 9 April 2020
Commenter: Jan Homolak
Commenter's Conflict of Interests: I'm the author
Comment: Dear Rosemary,

Thank you for your comment. I agree with you completely: some kind of checklist could improve the situation. We should insist on this and other methods to improve poor publication quality, as this is something we have to deal with regardless of the ongoing COVID-19 situation.

We are currently writing the second version of our manuscript, with some additional data and some corrections. We will keep your comment in mind and provide our raw data in .csv or .txt. We didn't do this initially because we find .RData more convenient for storing large amounts of data in a format that is easier to access in R. However, you are absolutely right, and our analyses should be open for validation in other software packages as well.

We plan to publish the second version of the manuscript here in the next few days (we have to wait for approval once we upload the new version). If you need the data sooner, please feel free to contact me so we can work it out.


Response 2 to Comment 1
Received: 22 April 2020
Commenter: Jan Homolak
Commenter's Conflict of Interests: Author of the paper
Comment: Dear Anne Rosemary Tate,

We have now uploaded a newer version of our manuscript here and created a new data and code repository with the data available in .csv format. You can access everything here: 10.20944/preprints202003.0443.v2
We added some additional analyses, introduced more controls and corrected some minor mistakes.

Please let me know if you have any further comments (-:

PS. If you submitted a review on Outbreak already in the meantime I'd be interested to read it.

Best regards,
