Preprint
Article

World-wide Sequence Variant and Non-synonymous Amino Acid Substitution Signature in SARS-COV-2 Structural Proteins

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted:

25 August 2020

Posted:

27 August 2020

Abstract
Like other viruses, SARS-CoV-2 is mutating and thereby creating divergent variants across the world. Protein sequence variation arises from non-synonymous single-nucleotide polymorphisms (SNPs) that alter the encoded amino acid. Amino acid substitutions on homooligomer interfaces may change the structure of the protein and hence alter the regular or known functional activities of a viral protein. Studies reveal that even a single point mutation in a viral protein can significantly change the biology of the virus, leading to peculiar pathogenic properties. Therefore, an in-depth investigation of amino acid substitutions in the genomic signature of a protein is essential for a rapidly evolving virus like SARS-CoV-2. Investigating world-wide and country-specific substitution features may be crucial for deciphering pathogenicity. It may also help in precise structure prediction and in identifying possible therapeutic targets for effective drug design. We perform an extensive analysis to highlight and characterize the amino acid substitution signatures occurring in the four structural proteins (Spike-S, Nucleocapsid-N, Membrane-M, Envelope-E) of SARS-CoV-2. We use a total of 9587 viral sequences reported from 49 different countries across the globe. We study the amino acid substitution patterns and their impact on biochemical properties, and thereby on possible changes in protein structure. We perform the following analyses: a) isolating and grouping the variants considered for the different protein sequences; b) identifying the amino acid substitution types that occur frequently and rarely, and reporting their locations within the sequence; c) assessing the change in chemical properties due to amino acid substitution; and d) highlighting country-specific divergent variation and substitution signatures. In terms of mutational changes, the E and M proteins are more stable than the N and S proteins.
A substantial number of variations are observed in the spike (S) protein. Our study further reveals that substitution locations are random in the N protein, whereas the substitution sites in the M protein vary little and are almost stable. Substitutions specific to active sub-domains of the S and N proteins reveal that sub-domains such as Heptapeptide Repeat (HR2), Fusion Peptide (FP), and Transmembrane (TM), which are involved in cellular membrane fusion and entry of the virus into host cells, are significantly mutated. The majority of the substitutions lead to a change in the biochemical properties (side chain and hydropathy) of the amino acid. Many exclusive variants are specific to a particular country. We strongly believe that the current findings will be helpful for protein structure analysis of viral structural proteins and for antiviral drug discovery.
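The substitution analysis outlined in the abstract (locating substitutions against a reference, counting how often each occurs, and noting the change in side chain class and hydropathy) could be sketched roughly as below. This is an illustrative sketch only, not the authors' pipeline: the function names `find_substitutions` and `count_substitutions` are hypothetical, the sequences are assumed to be already aligned to equal length, and the Kyte-Doolittle scale and the four-way side chain grouping are assumed choices that the abstract does not specify.

```python
from collections import Counter

# Kyte-Doolittle hydropathy index. Assumption: the abstract does not say
# which hydropathy scale was used; this is one widely used choice.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Coarse side chain classes (an assumed textbook-style grouping).
SIDE_CHAIN_CLASS = {}
for aa_class, residues in {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}.items():
    for residue in residues:
        SIDE_CHAIN_CLASS[residue] = aa_class


def find_substitutions(reference, variant):
    """List substitutions between two pre-aligned protein sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    substitutions = []
    for position, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1):
        if ref_aa == var_aa:
            continue
        if ref_aa not in KYTE_DOOLITTLE or var_aa not in KYTE_DOOLITTLE:
            continue  # skip gaps and ambiguous residues such as "-" or "X"
        substitutions.append({
            "label": f"{ref_aa}{position}{var_aa}",  # e.g. "D2G", 1-based position
            "position": position,
            "side_chain_change": (SIDE_CHAIN_CLASS[ref_aa], SIDE_CHAIN_CLASS[var_aa]),
            "hydropathy_delta": round(KYTE_DOOLITTLE[var_aa] - KYTE_DOOLITTLE[ref_aa], 1),
        })
    return substitutions


def count_substitutions(reference, variants):
    """Count how often each substitution label occurs across many variants."""
    counts = Counter()
    for variant in variants:
        counts.update(s["label"] for s in find_substitutions(reference, variant))
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration only, not real SARS-CoV-2 data.
    reference = "MDLG"
    variants = ["MGLG", "MGLG", "MDLA"]
    for substitution in find_substitutions(reference, variants[0]):
        print(substitution)
    print(count_substitutions(reference, variants).most_common())
```

Grouping the same counts by the country of origin of each sequence would give a country-specific view; real data would first need a proper alignment step, which this sketch leaves out.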
Subject: Biology and Life Sciences  -   Virology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated