Abstract
SARS-CoV-2, the novel coronavirus behind the COVID-19 pandemic, is acquiring new mutations in its genome. Although some mutations benefit the virus against the human immune response, a number of them may reduce its pathogenicity and virulence. By analyzing more than 3000 high-coverage, complete genome sequences deposited in the GISAID database, here I report a unique 28881-28883:GGG>AAC trinucleotide-bloc mutation in the SARS-CoV-2 genome that results in two sub-strains, described here as SARS-CoV-2g (28881-28883:GGG genotype) and SARS-CoV-2a (28881-28883:AAC genotype). Computational analysis and literature review suggest that this bloc mutation would cause 203-204:RG (arginine-glycine)>KR (lysine-arginine) amino acid changes in the nucleocapsid (N) protein, affecting its SR (serine-arginine)-rich motif, a region critical for the transcription of viral RNA and replication of the virus. Thus, the 28881-28883:GGG>AAC bloc mutation is expected to modulate the pathogenicity of SARS-CoV-2. Remarkably, analysis of existing data suggests that the SARS-CoV-2g and SARS-CoV-2a strains can be linked with the heterogeneity of COVID-19 cases across regions within and between countries. Sequence analysis suggests that severely affected areas such as Milan, Lombardy, New York and Paris have a predominant presence of SARS-CoV-2g strains, whereas less affected places like Abruzzo, Lyon and Valencia have a relatively higher presence of SARS-CoV-2a, an indication that the latter strain may contribute to reduced cases of COVID-19. A similar relationship is observed when the Netherlands and Portugal are compared with Spain, France and Germany. These analyses suggest that SARS-CoV-2 has already evolved into a less infective strain, SARS-CoV-2a, which is affecting COVID-19 cases in different regions. The time a country or region needs to acquire SARS-CoV-2a strains may be indicative of the time it would need to overcome the peak of COVID-19 cases.
To confirm these assumptions, prompt retrospective and prospective epidemiological studies should be conducted in different countries to understand the course of pathogenicity of SARS-CoV-2a and SARS-CoV-2g. Potential drugs could be designed to target the region of the N protein encoded by nucleotides 28881-28883 to modulate virus pathogenicity.
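
The mapping from the 28881-28883:GGG>AAC bloc mutation to the 203-204:RG>KR change in the N protein can be checked with a few lines of Python. This is a minimal sketch, not the analysis pipeline used in the study. It assumes two things the abstract does not state: that the N gene starts at genome position 28274 in the Wuhan-Hu-1 reference (NC_045512.2), and that the reference nucleotides at positions 28880-28885 are AGGGGA. The function names are hypothetical and chosen only for illustration.

```python
# Assumption (not stated in the abstract): the N gene starts at genome
# position 28274 (1-based) in the Wuhan-Hu-1 reference, NC_045512.2.
N_GENE_START = 28274

# Assumption: reference nucleotides at genome positions 28880-28885,
# which cover N codons 203 and 204 in full.
REF_WINDOW_START = 28880
REF_WINDOW = "AGGGGA"

# Only the four codons needed for this illustration, not a full genetic code.
CODON_TABLE = {"AGG": "R", "GGA": "G", "AAA": "K", "CGA": "R"}


def codon_number(genome_pos: int) -> int:
    """Return the 1-based N codon number that contains a genome position."""
    return (genome_pos - N_GENE_START) // 3 + 1


def apply_substitution(window: str, window_start: int, pos: int, alt: str) -> str:
    """Overwrite len(alt) nucleotides of the window, starting at genome position pos."""
    offset = pos - window_start
    return window[:offset] + alt + window[offset + len(alt):]


def translate(window: str) -> str:
    """Translate an in-frame nucleotide window codon by codon."""
    return "".join(CODON_TABLE[window[i:i + 3]] for i in range(0, len(window), 3))


mutant_window = apply_substitution(REF_WINDOW, REF_WINDOW_START, 28881, "AAC")
codons_hit = sorted({codon_number(p) for p in (28881, 28882, 28883)})

print(codons_hit)                # [203, 204]
print(translate(REF_WINDOW))     # RG
print(translate(mutant_window))  # KR
```

Under these assumptions, the three substituted nucleotides straddle two adjacent codons (the last two positions of codon 203 and the first position of codon 204), which is why a single trinucleotide bloc changes two consecutive amino acids.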