Working Paper Short Note Version 2 This version is not peer-reviewed

The Anomalous Nature of the Fecal Swab Sample Used for RaTG13 Genome Assembly as Revealed by NGS Data Analysis

Version 1 : Received: 7 August 2020 / Approved: 8 August 2020 / Online: 8 August 2020 (06:19:45 CEST)
Version 2 : Received: 8 August 2020 / Approved: 11 August 2020 / Online: 11 August 2020 (08:06:32 CEST)

How to cite: Rahalkar, M.; Bahulikar, R. The Anomalous Nature of the Fecal Swab Sample Used for RaTG13 Genome Assembly as Revealed by NGS Data Analysis. Preprints 2020, 2020080205

Abstract

RaTG13, a betacoronavirus which exists in the form of a genome sequence, is the closest relative of SARS-CoV-2 reported to date. The sample from which RaTG13 was sequenced was a bat fecal swab collected in 2013 from Tongguan, Mojiang, Yunnan province, China. The genome sequence of RaTG13, MN996532.1, was deposited on 27 January 2020, and the raw data (Illumina reads) were deposited a fortnight later, on 13 February 2020: https://www.ncbi.nlm.nih.gov/sra/SRX7724752[accn]. Comparison of the RNA-Seq data from the RaTG13 fecal swab sample with the corresponding data from other bat fecal swabs deposited by the same working group indicated that the raw data appear anomalous in several respects. Thirty percent of the reads did not match any known sequence. Of the remaining 70%, an abnormally high proportion came from eukaryotes (~68%). These reads matched the sequences of not one but several bat species (round-leaf bats, fruit bats and other bats) and other animal species (squirrels, foxes, etc.), according to the Krona analysis included with the SRA data. The proportion of bacterial reads in the swab was exceptionally low, at 0.7%, compared with the 70-90% bacterial abundance in other bat fecal swabs. Furthermore, we found another set of raw data associated with RaTG13, amplicon sequencing of the genome (SRX8357956), which was submitted in May 2020. BLAST analysis of the amplicons showed that, collectively, they do not cover the whole genome (MN996532.1). On closer inspection, the dates given in the file names of the sequenced amplicons were also found to be older (2017, 2018). Collectively, the anomalies in the raw data of RaTG13 raise an important question about the overall authenticity of the RaTG13 genome sequence.
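The amplicon coverage check described in the abstract can be illustrated with a minimal sketch. It assumes that each BLAST hit has already been reduced to a (start, end) coordinate pair on the reference genome, 1-based and inclusive, and lying within the genome. The function names, the toy genome length of 1000 and the hit coordinates are all hypothetical and chosen only for illustration; they are not the authors' script or the actual RaTG13 data.

```python
from typing import List, Tuple

# One alignment on the reference: 1-based, inclusive (start, end).
Interval = Tuple[int, int]


def merge_intervals(hits: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent hit intervals."""
    merged: List[Interval] = []
    # Reverse-strand hits report start > end, so normalise each pair first.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def uncovered_regions(hits: List[Interval], genome_length: int) -> List[Interval]:
    """Return the reference regions that no hit covers."""
    gaps: List[Interval] = []
    cursor = 1  # first reference position not yet accounted for
    for start, end in merge_intervals(hits):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= genome_length:
        gaps.append((cursor, genome_length))
    return gaps


def covered_fraction(hits: List[Interval], genome_length: int) -> float:
    """Fraction of reference positions covered by at least one hit."""
    covered = sum(end - start + 1 for start, end in merge_intervals(hits))
    return covered / genome_length


# Hypothetical toy data: a 1000-nt reference and four amplicon hits.
toy_length = 1000
toy_hits = [(1, 300), (250, 400), (700, 600), (900, 1000)]

print(uncovered_regions(toy_hits, toy_length))  # [(401, 599), (701, 899)]
print(covered_fraction(toy_hits, toy_length))   # 0.602
```

A non-empty list of uncovered regions is what "do not cover the whole genome" means in this sketch. With real data, the intervals would come from the subject start and end columns of a tabular BLAST report, and the genome length from the reference record.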

Subject Areas

RaTG13; SARS-CoV-2; Illumina sequencing; amplicon sequencing; NGS; fecal swab

Comments (3)

Comment 1
Received: 11 August 2020
Commenter: Monali Rahalkar
Commenter's Conflict of Interests: Author
Comment: Mostly we have done grammatical corrections. No major content change in this version.
Comment 2
Received: 16 August 2020
Commenter: Tom Magic
The commenter has declared there is no conflict of interests.
Comment: Clearly these authors do not have any idea about the difference between RNA virus samples and fecal samples.
Response 1 to Comment 2
Received: 20 August 2020
Commenter: John F Signus
The commenter has declared there is no conflict of interests.
Comment: Actually, the RNA-Seq method prep, “QiaAmp Viral RNA mini kit”, was NOT designed to distinguish between cellular and viral RNA, or DNA and RNA. It clearly stated that they sequenced the whole thing. It is not an “RNA-Virus sample”.
https://www.qiagen.com/us/resources/download.aspx?id=c80685c0-4103-49ea-aa72-8989420e3018&lang=en
