Vale, F.F.; Vítor, J.M.B.; Marques, A.T.; Azevedo-Pereira, J.M.; Anes, E.; Goncalves, J. Origin, Phylogeny, Variability and Epitope Conservation of SARS-CoV-2 Worldwide. Virus Research 2021, 304, 198526, doi:10.1016/j.virusres.2021.198526.
Abstract
The coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), poses several challenges, including understanding what triggered the emergence of SARS-CoV-2, how this RNA virus is evolving, and how its genomic variability may affect the primary structure of proteins that are vaccine targets. We analyzed 19,471 SARS-CoV-2 genomes and 199,984 spike glycoprotein sequences from around the world available in the GISAID database, together with 3,335 genomes of other Coronaviridae family members available in GenBank, selecting high-quality SARS-CoV-2 genomes and distinct Coronaviridae genomes. We identify a SARS-CoV-2 emerging cluster containing 13 closely related genomes isolated from bats and pangolins that showed evidence of recombination, which may have contributed to the emergence of SARS-CoV-2. The analyzed SARS-CoV-2 genomes presented 9,632 single nucleotide polymorphisms (SNPs), corresponding to a variant density of 0.3 over the genome, and showed a clear geographic distribution. SNPs are unevenly distributed throughout the genome, with mutation hotspots in the spike gene and ORF1ab. We describe a set of predicted spike protein epitopes whose variability is negligible. All predicted epitopes for the structural E, M and N proteins are highly conserved. These results favor the continued efficacy of the available vaccines.
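As a rough check on the variant density reported above, the following minimal sketch divides the SNP count by a genome length. It assumes that density here means the number of variable sites per genome position, and that the relevant length is that of the Wuhan-Hu-1 reference genome (29,903 nucleotides); neither assumption is stated in the abstract, and the function name `variant_density` is purely illustrative.

```python
def variant_density(snp_count: int, genome_length: int) -> float:
    """Return the number of SNP sites per genome position."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return snp_count / genome_length


# SNP count reported in the abstract.
SNP_COUNT = 9632
# Assumption: length of the Wuhan-Hu-1 reference genome (NC_045512.2).
REFERENCE_LENGTH = 29903

density = variant_density(SNP_COUNT, REFERENCE_LENGTH)
print(f"variant density: {density:.2f}")
```

Under these assumptions the ratio rounds to 0.3, consistent with the reported figure, which would mean that roughly one in three genome positions carried a SNP in at least one of the analyzed genomes.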
Biology and Life Sciences, Biochemistry and Molecular Biology
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.