Submitted: 27 August 2024
Posted: 28 August 2024
Abstract
Keywords:
1. Introduction
1.1. Rationale for Identifying Unreal SARS-CoV-2 Genomes in GenBank
1.2. Identifying SARS-CoV-2 Genomes in GenBank with Altered Collection Time
2. Materials and Methods
2.1. Identify Unreal Sequences
2.2. Identify Genomes with Altered Collection Time
3. Results
3.1. Unreal SARS-CoV-2 Genomes in GenBank
3.2. Changes in Viral Sample Collection Time
3.3. NCBI Is Slow to Correct Annotation Errors
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
| ACCN(1) | Country | T(2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| OM094978 | USA | 3/24/2021 | 454 | 25.0880 | 1.2718E-11 |
| OM108445 | India | 7/1/2021 | 553 | 30.5588 | 5.3517E-14 |
| OP022337 | USA | 10/20/2021 | 664 | 36.6926 | 1.1603E-16 |
| OP268178 | Mexico | 8/19/2022 | 967 | 53.4364 | 6.2067E-24 |
| PP434597 | India | 4/10/2023 | 1201 | 66.3673 | 1.5034E-29 |
| PQ008636 | India | 12/10/2023 | 1445 | 79.8507 | 2.0955E-35 |
| PQ008633 | India | 1/11/2024 | 1477 | 81.6190 | 3.5753E-36 |
| PQ008634 | India | 1/11/2024 | 1477 | 81.6190 | 3.5753E-36 |
| PQ008635 | India | 1/18/2024 | 1484 | 82.0058 | 2.4284E-36 |
| ACCN | Country | T | | | |
|---|---|---|---|---|---|
| OM095202 | USA | 10/8/2020 | 287 | 15.8596 | 1.2950E-07 |
| MZ722043 | USA | 10/25/2020 | 304 | 16.7990 | 5.0614E-08 |
| OM095001 | USA | 11/25/2020 | 335 | 18.5121 | 9.1264E-09 |
| OM095004 | USA | 11/25/2020 | 335 | 18.5121 | 9.1264E-09 |
| OM095010 | USA | 11/25/2020 | 335 | 18.5121 | 9.1264E-09 |
| OM095127 | USA | 12/11/2020 | 351 | 19.3963 | 3.7697E-09 |
| MW960278 | Pakistan | 12/11/2020 | 351 | 19.3963 | 3.7697E-09 |
| MZ722192 | USA | 12/14/2020 | 354 | 19.5620 | 3.1938E-09 |
| OP278726 | Pakistan | 12/17/2020 | 357 | 19.7278 | 2.7059E-09 |
| OM095142 | USA | 12/21/2020 | 361 | 19.9489 | 2.1693E-09 |
| MZ722000 | USA | 12/21/2020 | 361 | 19.9489 | 2.1693E-09 |
| MZ722615 | USA | 12/21/2020 | 361 | 19.9489 | 2.1693E-09 |
| MZ722630 | USA | 12/21/2020 | 361 | 19.9489 | 2.1693E-09 |
| MZ722702 | USA | 12/21/2020 | 361 | 19.9489 | 2.1693E-09 |
| OP022336 | USA | 12/30/2020 | 370 | 20.4462 | 1.3193E-09 |
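
The numeric columns in the two tables above follow a simple pattern that the sketch below reproduces. The column meanings are an inference from the tabulated numbers, not something stated in the surviving text. The sketch assumes that the fourth column is the number of days from 26 December 2019 to the collection date T, that the fifth column is that day count multiplied by a constant of about 0.05526 per day, and that the sixth column is the exponential of the negated fifth column, which is the Poisson probability of zero events at that expected count. The names `REF_DATE`, `RATE_PER_DAY`, and `row_stats` are hypothetical.

```python
from datetime import date
from math import exp

# Assumptions inferred from the tabulated values, not stated in the text:
REF_DATE = date(2019, 12, 26)   # day zero that reproduces the day counts
RATE_PER_DAY = 0.05526          # constant ratio between columns 5 and 4


def row_stats(collection):
    """Return (days since REF_DATE, expected count, exp(-expected count))."""
    days = (collection - REF_DATE).days
    expected = RATE_PER_DAY * days
    p_zero = exp(-expected)  # Poisson probability of zero events
    return days, expected, p_zero


if __name__ == "__main__":
    # Row OM094978, collection date 3/24/2021
    days, expected, p_zero = row_stats(date(2021, 3, 24))
    print(days, f"{expected:.4f}", f"{p_zero:.4E}")
```

Under these assumptions, the later the stated collection date, the smaller the value in the last column.
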
| ACCN | Country | T1(1) | T2(2) | Tree1..Tree2(3) | T1 - T2 |
|---|---|---|---|---|---|
| MW795884 | USA | 1/13/2020 | 1/13/2021 | | -366 |
| OK244698 | USA | 1/14/2020 | 12/30/2021 | | -716 |
| MW585340 | USA | 1/5/2020 | 1/5/2021 | | -366 |
| MZ028629 | USA | 2/18/2020 | 2/18/2021 | 7/12/2021..5/7/2022 | -366 |
| MZ436887 | Sierra_Leone | 1/14/2020 | 1/14/2021 | 11/8/2021..5/7/2022 | -366 |
| MZ436896 | Sierra_Leone | 1/14/2020 | 1/14/2021 | 11/8/2021..5/7/2022 | -366 |
| MZ469886 | US | 1/12/2020 | 1/12/2021 | 11/8/2021..5/7/2022 | -366 |
| MZ469887 | US | 1/6/2020 | 1/6/2021 | 11/8/2021..5/7/2022 | -366 |
| MZ473469 | US | 2/17/2020 | 2/17/2021 | 11/8/2021..5/7/2022 | -366 |
| MW786995 | USA | 3/10/2020 | 3/10/2021 | 4/3/2021..5/7/2022 | -365 |
| MW921831 | USA | 3/15/2020 | 3/15/2021 | 4/25/2021..5/7/2022 | -365 |
| MZ021503 | India | 3/1/2020 | 3/1/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ021504 | India | 3/6/2020 | 3/6/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ021505 | India | 3/6/2020 | 3/6/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ021506 | India | 3/3/2020 | 3/3/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ278198 | US | 4/21/2020 | 4/21/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397171 | Myanmar | 5/28/2020 | 5/28/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397172 | Myanmar | 5/28/2020 | 5/28/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397173 | Myanmar | 5/28/2020 | 5/28/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397174 | Myanmar | 5/28/2020 | 5/28/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397175 | Myanmar | 6/2/2020 | 6/2/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397176 | Myanmar | 6/2/2020 | 6/2/2021 | 11/8/2021..5/7/2022 | -365 |
| MZ397177 | Myanmar | 5/26/2020 | 5/26/2021 | 11/8/2021..5/7/2022 | -365 |
| MW591579 | USA | 1/18/2020 | 12/17/2020 | 4/25/2021..5/7/2022 | -334 |
| MW750862 | USA | 5/22/2020 | 3/2/2021 | 4/3/2021..5/7/2022 | -284 |
| MW750906 | USA | 5/23/2020 | 1/14/2021 | 4/3/2021..5/7/2022 | -236 |
| MW737421 | Iran | 10/25/2019 | 2/11/2020 | 4/3/2021..5/7/2022 | -109 |
| MW898809 | Iran | 12/12/2019 | 2/29/2020 | 4/25/2021..5/7/2022 | -79 |
| MZ077094 | USA | 4/14/2021 | 4/20/2021 | 7/12/2021..5/7/2022 | -6 |
| MW093534 | USA | 6/6/2020 | 6/11/2020 | 4/3/2021..9/4/2021 | -5 |
| MW883366 | USA | 3/29/2021 | 3/22/2021 | 4/25/2021..5/7/2022 | 7 |
| MW883371 | USA | 3/27/2021 | 3/16/2021 | 4/25/2021..5/7/2022 | 11 |
| MW883363 | USA | 3/29/2021 | 3/11/2021 | 4/25/2021..5/7/2022 | 18 |
| MW883370 | USA | 3/27/2021 | 3/8/2021 | 4/25/2021..5/7/2022 | 19 |
| MW883364 | USA | 3/29/2021 | 1/21/2021 | 4/25/2021..5/7/2022 | 67 |
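
The last column of the table above can be checked with ordinary date arithmetic. The sketch below assumes that T1 and T2 are calendar dates in M/D/YYYY form and that the final column is their difference in whole days. The helper name `day_difference` is hypothetical. It also shows why a one-year shift appears as -366 for dates up to 28 February and as -365 afterwards: the interval from T1 to T2 either does or does not contain 29 February 2020.

```python
from datetime import datetime


def day_difference(t1, t2):
    """Return T1 - T2 in whole days for dates written as M/D/YYYY."""
    fmt = "%m/%d/%Y"
    d1 = datetime.strptime(t1, fmt).date()
    d2 = datetime.strptime(t2, fmt).date()
    return (d1 - d2).days


if __name__ == "__main__":
    # One-year shifts on either side of the 2020 leap day
    print(day_difference("1/13/2020", "1/13/2021"))
    print(day_difference("3/10/2020", "3/10/2021"))
    # A change in the opposite direction (T1 later than T2)
    print(day_difference("3/29/2021", "1/21/2021"))
```
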
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).