Role of N501Y mutation in SARS-CoV-2 spike protein structure

It has been more than a year since the first case of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was found. This coronavirus had infected more than 110 million people worldwide by the end of February 2021, and several virulent as well as more transmissible mutant forms of SARS-CoV-2 have since emerged. In the latter group, three variants, the B.1.1.7, B.1.351, and P1 lineages, have been reported. Using computer simulation, the present paper investigates the structural differences between the wild type SARS-CoV-2 spike protein and its Asn501Tyr (N501Y) mutant variant. Time-based structural changes between the receptor binding domains of these two species are also examined. The N501Y mutation is common to all three of the aforesaid variants.


Introduction
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has already infected more than 110 million people and caused 2.4 million deaths globally [1]. This virus, first detected in December 2019, is particularly deadly owing to the strong adaptability of its different clades and its fast mutation. In December 2020, a more contagious variant, B.1.1.7, emerged.
Another variant strain, the B.1.351 lineage, appeared around the same time. In January 2021, a further variant of aggressive nature, P1, was reported [2]. As of now, it is still not confirmed whether these new strains represent antibody-resistant viruses or antibody neutralization escape mutants.
The SARS-CoV-2 pathogen contains several structural proteins, including Spike (S), Envelope (E), Membrane (M) and Nucleocapsid (N), together with a positive-sense RNA genome. From a structural perspective, the transmembrane spike protein consists of two subunits, S1 and S2; the N-terminal S1 binds to the angiotensin converting enzyme 2 (ACE2) receptors in the epithelial airway cells of human lungs, while the C-terminal S2 operates in membrane fusion. S1 contains the receptor binding domain (RBD). A recently found variant of the S protein, namely the B.1.1.7 lineage (20B/501Y.V./VOC 202012/01), has shown 70% greater spreadability compared to the original virus. Along with several other mutations, the Asn501Tyr (N501Y) mutation of the S protein is observed in all three lineages mentioned above [2]. Residue 501 is located within the S1 subunit of the S protein.
In our earlier studies we analyzed several immunologically relevant protein structures, specifically focusing on the correlation between their structural changes and their functions [3][4][5][6][7]. In the present paper, we examine the structures of the wild type (wt) S1 RBD and its N501Y variant, which carries a mutation common to all three virulent strains mentioned above.

Materials and Methods
There are several structures of the spike RBD in the Protein Data Bank. For our study, we selected the E chain of 6M0J, the RBD of the SARS-CoV-2 S protein as described by Lan et al. [8].
6M0J is the X-ray crystal structure of the SARS-CoV-2 RBD bound to the ACE2 receptor at 2.45 Å resolution. We chose the E chain of 6M0J as the wt RBD and used its mutant variant, N501Y, for simulations. We performed two sets of simulations, one for the wt and another for the N501Y mutant of the S1 RBD.
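The simulation stages in this section are reported as durations, whereas MD engines are typically configured in integration steps. As a minimal sketch, assuming only the durations and the 2 fs time step stated in this section, the conversion can be written as follows. The stage labels and the helper `steps_for` are illustrative names, not part of NAMD, QwikMD or VMD.

```python
# Convert MD stage durations into integration step counts.
# Durations and the 2 fs time step are taken from the text; the stage
# labels and this helper are hypothetical names used for illustration.

TIME_STEP_FS = 2          # femtoseconds per integration step
FS_PER_NS = 1_000_000     # femtoseconds in one nanosecond

# Stage durations in nanoseconds, as stated in Materials and Methods.
STAGE_DURATION_NS = {
    "annealing": 0.03,
    "heating_60_to_300K": 0.24,
    "equilibration": 0.04,
    "production": 30.0,
}


def steps_for(duration_ns, time_step_fs=TIME_STEP_FS):
    """Return the number of integration steps covering duration_ns."""
    return round(duration_ns * FS_PER_NS / time_step_fs)


stage_steps = {name: steps_for(ns) for name, ns in STAGE_DURATION_NS.items()}

for name, steps in stage_steps.items():
    print(f"{name}: {steps} steps")
```

Under these assumptions the 30 ns production run corresponds to 15 million steps per system, which gives a sense of the computational cost of running both the wt and the mutant trajectories.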
Nanoscale Molecular Dynamics (NAMD), QwikMD and Visual Molecular Dynamics (VMD) [9][10][11] programs were employed for time-based simulation studies of the wt and mutant spike S1 RBDs. The mutant variant was generated from the wt protein using the mutator plugin of VMD. The "structure manipulation" feature was utilized to prepare the protein structures and to generate the psf structure files. Implicit solvation was used, with the Generalized Born implicit solvent (GBIS) model [12] and the CHARMM36 force field. Energy minimization was performed in 2000 steps, and annealing was performed for 0.03 ns. The system temperature was gradually increased from 60 to 300 K for 0.24 ns, while equilibration was carried out for 0.04 ns at 300 K. The final production run was performed for 30 ns at 300 K using Langevin dynamics. No atoms were constrained for the production run. For all simulations the time step was set to 2 femtoseconds (fs) using the NVT ensemble. In total, two sets of simulation data were obtained: one for the wt strain and one for the single point mutant variant. The proteins' structural illustrations were developed with Biovia's Discovery Studio Visualizer [13].

Results
Although a small steric hindrance with Gln498 is observed for the mutant version, after the initial simulation protocols (minimization, annealing and equilibration) the temporal profile changed and the steric hindrance was no longer detected. We have presented the preliminary interaction data of Tyr501 (Fig. 1D) recorded before the minimization process; notably, this is not a commonly recommended procedure, since a minimization step would frequently remove steric overlaps.
Fig. 2. The corresponding PDB coordinates from the dcd file were generated using the stride command of the VMD window.
wt Asn is hydrophilic in nature whereas mutant Tyr is hydrophobic. An aromatic residue usually interacts with its neighboring residues to make the structure stronger and more stable.
Therefore, based on the initial data of Fig. 1, it is reasonable to assume that, within the spike protein, the mutant residue may be more interactive than its wild form. It is evident that the mutant species is more stable than its wt version. The RMSF values of the mutant (501Y) and wt (501N) residues are 1.837 Å and 3.27 Å, respectively. While the wt Asn501 residue has a high RMSF value, some of the nearby residues (around residue 484 and the terminal residues around 520) also show higher fluctuation values in the wt than in their mutant counterparts.
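For readers unfamiliar with the quantity, the root mean square fluctuation (RMSF) of a residue is the square root of its mean squared displacement from its time-averaged position over the trajectory. The sketch below illustrates that definition in plain Python. It is not the analysis pipeline used for this study; the function name `per_residue_rmsf` and the toy coordinates are hypothetical, and the frames are assumed to be already aligned to a common reference.

```python
import math


def per_residue_rmsf(frames):
    """Per-residue RMSF from a list of frames.

    frames[t][i] is the (x, y, z) position of residue i (for example its
    C-alpha atom) in frame t. Frames are assumed to be pre-aligned to a
    common reference so that only internal fluctuation remains.
    """
    n_frames = len(frames)
    n_residues = len(frames[0])
    rmsf = []
    for i in range(n_residues):
        # Time-averaged position of residue i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        rmsf.append(math.sqrt(msd))
    return rmsf


# Toy two-residue, four-frame trajectory in Å.
# Hypothetical values for illustration only, not data from this study.
toy_frames = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]
toy_rmsf = per_residue_rmsf(toy_frames)
print(toy_rmsf)
```

In the toy trajectory the first residue never moves and the second oscillates by 1 Å about its mean, so a rigid residue yields an RMSF near zero while a mobile one yields a larger value. This is the sense in which a lower RMSF is read as lower local flexibility.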
Overall, and throughout the residues, the mutant species maintains a relatively lower RMSF.
Although numerous studies have been performed since the initial outbreaks of COVID-19, it is too early to draw any conclusions about its structural variants and their impacts, because this new virus can rapidly mutate and/or adapt.
Based on Fig. 1 we can assume that the N501Y mutant is more interactive. However, over the course of MD simulations, changes may occur in the spike protein's interactions and/or its capability to bind with neighboring residues. Such interactions could potentially change, possibly even improving, in extended runs. Since these "variants of concern" and their mutations have only been identified recently, and since experimental data on this subject are scarce, drawing a definite conclusion on the interactive nature of the N501Y mutant is not straightforward. Moreover, N501Y within the S protein RBD shows detectable interactions. As the variants of SARS-CoV-2 transmit faster, stronger interactivity between N501Y and the hACE2 receptor cannot be ruled out.
A recently published paper suggests efficient binding affinity between the S1 RBD and ACE2, as determined by structural and functional analyses [14]. Another recent preprint reports certain differences, found by cryoEM, between the receptor binding interfaces of the N501Y mutant and the wt protein ectodomain [15]. An additional preprint uses computational methods to describe the structural dynamics of the mutant bound to the receptor [16]. While these results are now available, the present work focuses strictly on detailed structure-based analyses of the N501Y mutant residue and the S1 protein RBD. The exploratory computational results presented here also suggest that the N501Y mutant variant exhibits a detectably higher stability than its wt variant (Fig. 3). According to these results, one may infer that the mutant N501Y residue plays a critical role in stabilizing the overall structure of the spike protein's RBD. However, considering the scarcity of experimental data on this subject, it is difficult to conclude with certainty whether or not N501Y itself qualifies as a stabilizing mutation. More direct experimental evidence will likely be necessary to address this question adequately. Previous authors have suggested that spread rates would be relatively higher in the D614G mutated SARS-CoV-2 variant as a result of epidemiological distribution [17]. It has also been suggested that, for mice, "favored" interactions would be observed between the mutant RBD and the ACE2 receptor [18].

Conclusions
The results presented here predict greater stability for the mutant species. This inference is based on certain differences found between the two groups in their RMSD/RMSF calculations and secondary structure analyses. In a follow-up to the present study, we will discuss the structural impact of these emerging variants on human receptor binding and assess their time-based stability. Whether any other factors promote the SARS-CoV-2 variants' access to a host cell remains a subject for future investigation. It is also not clear at this time whether these new strains may emerge as antibody-resistant viruses, escape current vaccines, or weaken vaccine efficacy; the answers may come in the near future.

Conflict of interest statement
The author declares no financial conflict of interest.