Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

World-wide Sequence Variant and Non-synonymous Amino Acid Substitution Signature in SARS-COV-2 Structural Proteins

Version 1 : Received: 25 August 2020 / Approved: 27 August 2020 / Online: 27 August 2020 (12:39:30 CEST)
Version 2 : Received: 26 February 2021 / Approved: 4 March 2021 / Online: 4 March 2021 (10:17:15 CET)

A peer-reviewed article of this Preprint also exists.

Jayanta Kumar Das and Swarup Roy. 2021. A study on non-synonymous mutational patterns in structural proteins of SARS-CoV-2. Genome. 64(7): 665-678.


Like other viruses, SARS-CoV-2 is mutating, creating divergent variants across the world. Protein sequence variation arises from non-synonymous single-nucleotide polymorphisms (SNPs) that alter the encoded amino acid. Amino acid substitutions at homo-oligomer interfaces may change the structure of a protein and hence alter the known functional activities of a viral protein. Studies reveal that even a single point mutation in a viral protein can significantly change its biology, leading to peculiar pathogenic properties. Therefore, an in-depth investigation of the amino acid substitutions in the genomic signature of a protein is essential for a rapidly evolving virus like SARS-CoV-2. Investigating world-wide and country-specific substitution features may be crucial for deciphering pathogenicity. It may also help with precise structure prediction and the identification of possible therapeutic targets for effective drug design. We perform an extensive analysis to highlight and characterize the amino acid substitution signatures occurring in the four structural proteins (Spike-S, Nucleocapsid-N, Membrane-M, Envelope-E) of SARS-CoV-2. We use a total of 9587 viral sequences reported from 49 countries across the globe. We study the amino acid substitution patterns and their impact on biochemical properties, and thereby on possible changes in protein structure. We perform the following analyses: a) isolating and grouping the variants of each protein sequence; b) identifying the amino acid substitution types that occur frequently and rarely, and reporting their locations within the sequence; c) assessing the change in chemical properties due to amino acid substitution; and d) highlighting country-specific divergent variation and substitution signatures. In terms of mutational changes, the E and M proteins are relatively more stable than the N and S proteins.
A significant number of variations are observed in the spike (S) protein. Our study further reveals that the substitution locations are random in the N protein, whereas the substitution sites in the M protein vary less and are almost stable. Substitutions specific to active sub-domains in the S and N proteins reveal that sub-domains such as Heptapeptide Repeat (HR2), Fusion Peptide (FP), and Transmembrane (TM), which are involved in cellular membrane fusion and entry of the virus into host cells, are significantly mutated. The majority of the substitutions lead to a change in the biochemical properties (side chain and hydropathy) of the amino acid. A good number of exclusive variants are specific to a particular country. We believe that the current findings will be helpful for the structural analysis of viral structural proteins and for antiviral drug discovery.
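
As a rough illustration of the substitution analysis the abstract describes (listing substitutions against a reference, counting how often each occurs, and checking whether side chain class or hydropathy changes), the sketch below may help. It is not the authors' pipeline. It assumes that each variant is already aligned to the reference at equal length, that hydropathy is measured on the Kyte-Doolittle scale, and that side chains fall into a common textbook grouping; the abstract specifies none of these. The function names and the toy sequences are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy index (assumed scale; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# One common textbook grouping of side chains (an assumption, not the paper's).
_GROUPS = {
    "nonpolar_aliphatic": "GAVLIMP",
    "aromatic": "FWY",
    "polar_uncharged": "STCNQ",
    "positive": "KRH",
    "negative": "DE",
}
SIDE_CHAIN = {aa: name for name, members in _GROUPS.items() for aa in members}


def substitutions(reference, variant):
    """Return (position, ref_aa, var_aa) per mismatch; positions are 1-based."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (pos, r, v)
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


def describe(ref_aa, var_aa):
    """Summarize the property change for one substitution.

    Raises KeyError for symbols outside the 20 standard residues
    (gaps, 'X'), which a real pipeline would have to handle.
    """
    return {
        "side_chain_changed": SIDE_CHAIN[ref_aa] != SIDE_CHAIN[var_aa],
        "hydropathy_delta": round(KD[var_aa] - KD[ref_aa], 1),
    }


def substitution_counts(reference, variants):
    """Count each substitution (e.g. 'D4G') across a set of variants."""
    counts = Counter()
    for seq in variants:
        for pos, r, v in substitutions(reference, seq):
            counts[f"{r}{pos}{v}"] += 1
    return counts


# Toy data, not real SARS-CoV-2 sequences.
reference = "MFVDLQ"
variants = ["MFVGLQ", "MFVGLQ", "MFVDLK"]
counts = substitution_counts(reference, variants)
print(counts.most_common(1))  # [('D4G', 2)]
print(describe("D", "G"))
```

Grouping the same counts by country of origin would extend this towards the country-specific signatures mentioned above, provided each sequence carries that metadata.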


Clustering; Mutation; Amino acid substitution; Structural proteins; Biochemical properties; Functional sub-domains


Biology and Life Sciences, Virology
