Insights in the interaction of HPV-16 E6 and its variants (E-G350, E-A176/G350, E-C188/G350, AAa and AAc) with MAGI-1

Oncogenic protein E6 from Human Papilloma Virus 16 (HPV-16) mediates the degradation of Membrane-associated guanylate kinase with inverted domain structure-1 (MAGI-1), throughout the interaction of its protein binding motif (PBM) with the Discs-large homologous regions 1 (PDZ1) domain of MAG1-1. Generic variation in the E6 gene that translates to changes in the protein’s amino acidic sequence modifies the interaction of E6 with the cellular protein MAGI-1. MAGI-1 is a scaffolding protein found at tight junctions of epithelial cells, where it interacts with a variety of proteins regulating signaling pathways. MAGI-1 is a multidomain protein containing two WW (rsp-domain-9), one guanylate kinase-like, and six PDZ domains. PDZ domains played an important role in the function of MAGI-1 and served as targets for several viral proteins including the HPV-16 E6. The aim of this work was to evaluate, with an in silico approach, employing molecular dynamics simulation and protein-protein docking, the interaction of the intragenic variants E-G350 (L83V), E-C188/G350 (E29Q/L83V), E-A176/G350 (D25N/L83V), E6-AAa (Q14H/H78Y/83V) y E6-AAc (Q14H/I27RH78Y/L83V) and E6-reference of HPV-16 with MAGI-1. We found that variants E-G350, E-C188/G350, E-A176/G350, AAa and AAc increase their affinity to our two models of MAGI-1 compared to E6-reference.


Introduction
High-risk human papillomaviruses (HR-HPV) are the principal etiological agents of cervical cancer (CC), being the HPV-16 genotype one of the most prevalent worldwide [1]. The encoding proteins E6 and E7 from HPV-16 are the major oncogenic determinants of the disease's progression. These proteins control regulatory functions of the cell cycle, promote proliferation, induce malignant transformation, and facilitate migration and invasion of transformed cells [2].
E6 protein contains 151 amino acids, with a molecular weight of approximately 19 kDa. Structurally, E6 has two zinc fingers, and a LXXLL domain which is vital for its oncogenic potential. Moreover, an important motif of E6 is the PDZ binding motif (PBM), composed of the residues: E148, T149, Q150, and L151 located at the carboxyl terminus of the protein [3]. Any variability in the amino acid sequence in the PBM motif or in neighbouring regions of E6 could modify the if more protein regions were added. The study of protein-protein interactions remains a challenge these days due to the high cost of production and purification of recombinant proteins and the technical difficulties to crystallize them. The 1491 amino acids of MAGI-1 made it difficult to obtain it´s 3D structure. The inherent technical difficulties to solve the complex structure of proteins, led to the call for integrating complementary computational approaches. We decided to delimit our MAGI-1 models to experimentally proven domains to interact with E6 [27].
Recently, our team has adopted an in-silico analysis approach to evaluate the structural changes of E6 and its variants [28,29]. Since, evidence of the role of the PBM motive of E6 in the progression and aggressiveness of CC has been published [4,26], and the possible changes in the behaviour of the AA variants and the other variants are not clear. We propose an in silico approach to get insights into the interactions of the E6-reference oncoprotein and its variants (E-G350, E-A176/G350, E-C188/G350, AAa and AAc), against MAGI-1 a cellular target protein.

Results and Discussion
HPV-16 HPV-16 is accounting for more than 70% of the CC cases [1], it has been well established that the oncoproteins E6 and E7 are responsible for the onset and aggressiveness of the disease. Of these two proteins E6, dysregulates the cell cycle, promotes hipper proliferation, induces malignant transformation, and facilitates migration and invasion of transformed cells in in vivo and in vitro assays [2]. Also, it has been proposed that the difference in the oncogenic potential of this virus is mediated by the genetic variation that occurs in the E6 gene, which alters the thermodynamics and structural stability on the 3D structure of the protein. However, the molecular insights of the changes remain unknown [7].
In order to get insights into the interaction of the five variants of E6 from HPV-16 with the cellular protein MAGI-1 were adopted an in silico approach. Molecular Dinamics (MD) simulations and docking analyses were performed to explore the disruptive effect of mutations in five variants of E6 from HPV-16 and their consequences on the interactions with MAGI-1 255 and 329.

3D protein structures
Multiple alignments of the sequence of E6 and its variants were performed to evidence amino acidic changes between them. E6 mutations Q14H, D25N, I27R, E29Q and H78Y are found in a non-domain region adjacent to the zinc finger domain 1. While E6 mutations L83V and H78Y are located in an interdomain region between the two zinc finger domains. The 2D structures of the E6-reference and the variants were predicted; the results show that the secondary structure did not change between the variants and E6-reference ( Figure 1A).
The crystal structure of E6 present five alfa-helix and four beta-sheets. Also, it contains two zinc molecules forming two finger domains located at C30, C33, C63 C66, and C103, C106, C136, C139 residues (Figure1B). After mutating E6-reference to obtained variants (E-G350, E-A176/G350, E-C188/G350, AAa and AAc) we conducted an structural alignment was using the VMD RMSD tool ( Figure 1B). The RMSD values less than two Å represents accurate models. The RMSD values obtained for the variants were 0.51Å for E-G350 (green), 0.66Å for E-C188/G350 (yellow), 0.18 Å for E-A176/G350 (blue), 0.44 Å for AAa (red), and 0.69 Å for AAc (orange) which means that the mutations did not alter the proteins 3D structure, other than the punctual mutations sites ( Figure  1B). All models were evaluated by Ramachandran plot showing for all variants that 91.9% of the amino acids fall in the favored region and 8.1% in the allowed region, which means a good stereochemistry for 100% of the residues ( Figure S1). A close up of the mutated residues in the alignment shows that the side chains of mutants H14 and Y78 were exposed to the protein's surface while the amino acids Q14 and H78 side chains of E6-reference were inverted ( Figure 1B). According to the program HOPE, which analyses the structural and functional effects of point mutations, H14 is bigger than Q14, bigger residues might lead to bumps on the 3D structure of the protein. H14 is among the observed mutations at this position in other homologous sequences. This sometimes suggests that the mutant is not damaging for the protein's structure and function, on the other hand, the residue is located near a highly conserved position and can gain interactions with target proteins. The Y78 is bigger and more hydrophobic than the H78. This can result in loss of hydrogen bonds and disturbance of correct folding. The accessibility of the residues in the mutants could increase the number of interactions with the MAGI-1 models [30].
Mutations E29Q and D25N remained the same size; therefore, not visible change in 3D structure was observed ( Figure 1B). A change in residue charge from negative to neutral can cause loss of interactions with other molecules or residues [30]. Mutant R27 is bigger than I27, this can be observed in the 3D structure comparison ( Figure 1B). Also, the change of a neutral to positive residue leads to the possibility of repulsion of ligands or other residues with the same charge. Moreover, the hydrophobicity of the wild type residue is lost which will lead to the loss of hydrophobic interactions [30]. The mutation L83V found in all the variants can cause the proteins to lose interactions with other proteins because, V83 is a smaller than L83 ( Figure 1B), loss of interactions with cellular target proteins could lead to a diminution on the affinity of interactions for the variants [30].
There are many crystal structures in the RCSB PDB server related to the crystalized MAGI-1; however, none of these files corresponds to the complete protein structure. Moreover, there are no reliable software for homology modelling such large proteins [27]. To overcome this drawback, we delimitated our models to domains that have been experimentally shown to interact with E6 [21,24,31]. Our first model included the WW1, WW2, PDZ1 domains and was denominated MAGI-1 255 and a second model where we added a highly disordered region of 76 amino acids adjacent to the PDZ1 domain denominated it MAGI-1 329. A sequence alignment of our final models is shown in figure 2A. The 3D homology models of MAGI-1 255 (amino acid 300 to 554) and of MAGI-1 329 (amino acid 300 to 628) were obtained using the crystal structure from human MAGI-1 PDZ1 (PDB ID:2KPK) as a template on the I-TASSER server ( Figure 2B and 2C). The best models were chosen due to the criteria of good alignment with the template measured by C-Score, TM score, and RMSD values. Model MAGI-1 255 shown in Figure 2B consists of six alfa-helix, one 310 helix, eleven beta-sheets and the rest of the residues appeared lightly twisted in random coils, which are 15.3%, 2.4%, 13.7% and 68.6% respectively of the protein structure. The two WW domains and the interdomain regions from amino acid G300 to amino acid I471 are shown in magenta, and the PDZ1 domain from amino acid H472 to amino acid R554 is shown in purple ( Figure 2B). Model MAGI-1 329 consists of seven alfa-helixes, twelve beta-sheets, and the rest of the residues appeared in loops and coils, which are 15%, 14.1%, and 70.9%, respectively the protein structure. For this model, the 3D structure from amino acid G300 to R554 was coloured in purple, and the extra region of 76 (G555 to T628) amino acids were colored cyan (figure 2C). According to dynamic studies using complementary isothermal titration calorimetry and nuclear magnetic resonance (NMR), the interaction of E6 with MAGI-1 occurs mainly with the PDZ-1 (H472-R554) domain, but different affinity patterns were observed with adjacent regions [4]. To evaluate if the interaction of E6 with our models is modified by adjacent amino acids, we decided to add a highly disordered region of 76 amino acids (G555-T628) to model MAGI-1 329 ( Figure 2C) Ramachandran plot for model MAGI-1 255 and MAGI-1 329 exhibited 92.8% and 91.4% respectively of residues in most favored regions and 7.2% and 8.6% respectively residues are in disallowed regions, which shows a good stereochemistry for more than 90% of the residues, this makes our models acceptable for more refinement with MD simulation ( Figure S2).

Molecular dynamics simulation analysis
To examine the change in the protein dynamics and stability, the 3D models of HPV-16 E6 and its variants, as well as MAGI-1 255 and MAGI-1 329 were refined by MD simulation for 200 ns. Trajectories were analyzed by calculating the root mean square deviation of atomic positions (RMSD), root mean square fluctuation (RMSF), the radius of gyration (Rg), the dPCA analysis and dPCA based clustering ( Figure 3).
After 200 ns of the MD simulation and using a snapshot of the most populated cluster of E6 and variants, a structural alignment was done ( Figure 3A). The carboxyl terminus of the E-C188/G350, AAa and AAc proteins showed a greater difference compared to E6-reference, while the other variants 3D structures remained very similar compare to E6-reference ( Figure 3A).
The RMSD calculation of the E6-reference (purple) during the 200 ns of MD simulation, reached equilibrium at 20 ns of trajectory, while the non-European variants AAa (orange) and AAc (red) were equilibrated at 80 ns, after 150 ns AAa variant loses its equilibrium and recovers it at the end of the DM simulation, probably due to its mutations and its context ( Figure 3B). Something similar happens to the variant E-C188/G350 (yellow), which reaches equilibrium at 60 ns, and it's equilibrium is disturbed from the 150ns to the180ns. This behaviour is attributed to mutations in the E29A and L83V positions that directly affect the structure of the protein, causing disturbs an it´s stability ( Figure 3B). Concerning E-A176/G350 (blue), the equilibrium was reached at the first 20ns, but a greater disturbance episode it's observed from 100ns to the end of the trajectory it is also thought that the nature of mutation D25N on the proteins may contribute to this behaviour. It was also observed that for E6-reference, E-G350 (green), and E6-AAc, the RMSD values during the simulation ranged from 2 Å to 5 Å ( Figure 3B). While variants AAa, E-A176/G350 and E-C188/G350 were characterized by higher continuous RMSD fluctuations from the 140 ns to the end of the MD simulation ( Figure 3B). The RMSD values of simulated proteins indicated their stability and particular behavior and provided a suitable basis for further analysis.
The Rg presents different grades of compactness during the simulation evidencing a less compactable grade at the end of the trajectory, mainly in the variants E-A176/G350, E-C188/G350 and AAa ( Figure 3C in yellow and orange). Meanwhile, E6-reference and variants E-G350 and AAc maintain compactness during the simulation ( Figure 3C in green and red). This also confirms that point mutations caused structural destabilizing effects leading to the loss of protein compactness in the E-A176/G350, E-C188/G350 and AAa variants.
Since distance deviations from the starting structure may not necessarily reflect mobility of structural elements, RMSF was used to obtain information on flexibility. According to the data graph in figure 3C, there are six maximum fluctuations peaks areas shared by E6-reference and variants: One at M1 to P5, for E6-reference the fluctuation peak was 16 Å, and for the variants, the highest peak corresponds to E-C188/G350 with 27 Å, the rest of the variants fluctuate from 14 to 24 Å of distance being E-G350 the lowest peak, this region is composed by coils and turns with non-secondary structure ( Figure 3D). The second region at L28 to L50 composed of loops and an alpha helix: reached 15 Å for E6-reference and it wasthe highest fluctuation peak. Fluctuation for variants ranges from 20 to 25 Å; clearly the variants had greater fluctuation in these residues, where E-C188/G350 had the greates fluctuation peaks ( Figure 3D). This phenomenon is interesting and it's attributed to mutation E29Q exclusive of the E-C188/G350 variant, while the behavior of the other variants is exclusive of their own structural changes caused by their shared and exclusive mutations. The third peak at C51 to L65, in a coil and two beta-sheets, the variants reached 20 to 23 Å of fluctuation peaks, while E6-reference´s fluctuation was only 16 Å ( Figure 3D). The fourth region of fluctuation at C80 to L110, was composed of loops, one alpha helix and two beta-sheets and reaches 25 Å for variant E-A176/G350 and E6-reference. For variants AAa and AAc the fluctuation distance was 20 Å. While the peak for variant E-G350 was only 14 Å, evidently less flexible than the other variants and E6-reference ( Figure 3D). The fifth fluctuation peak at C111 to C140, in two beta sheets, coils and an one alpha helix, for variants AAa, AAc, E-C188/G350, E-A176/G350 had a fluctuation distance of 20 Å . E6-reference, also, reached a distance of 20 Å. On the other hand, the distance of E-G350 reached 15 Å, and it tends to decrease for the rest of the amino acids at the carboxyl terminus ( Figure 3D). Finally the residue-based RMSF of the backbone for the E6-reference displayed less flexible residues than the variants (E-G350, E-C188/G350, E-A176/G350, E6-AAa, and E6-AAc), at the carboxyl terminus (145-151) composed mainly by loops ( Figure 3D). Interestingly, this region includes the PBM (ETQV) motif of E6, which is important for this oncoprotein interaction with MAGI-1. Since there is a higher fluctuation in variants compares to E6-reference, it can be deduced that mutations change the structural flexibility of the 3D protein structure. Ramachandran analysis after the MD simulation of these structures shows that more than 98% of the amino acids of the proteins during the simulation remain in the highly favored regions, which means that the protein's conformation are well refined and have native conformations ( Figure S3).
For our two MAGI-1 models we showed a snapshot of the most populated cluster from the dPCA clustering analysis in (Figures 4A and B). The PDZ1 domain of our two models have little changes in its 3D structure, but overall it keeps its main secondary structures ( Figure 4A and B).
The RMSD of MAGI-1 255 (purple) and MAGI-1 329 (black) models during the 200 ns trajectory showed that both models reached equilibrium before the 100 ns of the simulation and continue stable for the rest of the trajectory. Moreover, MAGI-1 255´s RMSD value after equilibrium ranges from 10 to 13 Å and MAGI-1 329 model´s RMSD values range from 9 to 10 Å, which means our models are reliable for further investigation ( Figure 4C). We explored the flexibility of the models by measuring Cα. The RMSF values of the models through trajectory, mainly 3 regions of MAGI-1 255-model showed more flexible areas, those regions correspond to amino acids G300 to A309 of the WW1 domain, G349 to D379 belong to the WW2 domain and Q399 to H429 that correspond to an interdomain region between the WW2 and PDZ1 domains of MAGI-1 ( Figure 4D). MAGI-1 329 model showed three regions with more flexibility which include amino acids E304 to I319 of the WW1 domain, Q399 to V433 corresponding to an inter domain between WW2 and PDZ1 domains and N566 to T628, this region was the most flexible of the two models. Interestingly, this region corresponds to a highly disordered region of the whole protein ( Figure 4D) [4]. The Rg show that both models maintained a compacted structure throughout the trajectory of MD simulation simulation ( Figure 4E). The Ramachandran analysis shows the refinement of more than 80% of the model´s residues ( Figure S4).

Dihedral Principal component analysis.
dPCA was used to obtain a broader view of dynamic properties with respect to MD simulation simulation results of E6-reference and its variants, MAGI-1 255 and MAGI-1 329. The covariance matrix for the first 20 eigenvectors of E6-reference was 11.40 nm2 and 10.09 nm2, 11.14 nm2, 7.12 nm2, 12.23 nm2 and 10.60 nm2 for the variants E-G350, E-A176/G350, E-C188/G350, AAa and AAc, respectively ( Figure 5A). Moreover, the dPCA analysis showed that the first 20 eigenvectors captured 45-57% of the total protein motions (56.7, 45.5, 52.5, 51.1, 55.3 and 53.8%) for E6-reference, E-G350, E-A176/G350, E-C188/G350, AAa and AAc respectively ( Figure 5B). Whereas the projections of the first two principal components (PC1 vs PC2) contributed to 15-28% of the collective motions (28.22, 15.11, 24.20, 22.40, 26.66 and 24.59%) for E6-reference E-G350, E-A176/G350, E-C188/G350, AAa and AAc respectively ( Figure 5B). These results showed a change in the motion of the atoms of the E-A176/G350, AAa and AAc variants compared to E6-reference and a considerable change in the motion of the atoms of G350 and E-C188/G350 compared to E6-reference, which suggests that the properties of the movements described by the first PCs were different in the six protein systems ( Figure 5A and B). The projection of the first two eigenvectors (PC2 vs. PC1) for E6-reference, E-G350, E-A176/G350, E-C188/G350, AAa and AAc ( Figure 5C-H), shows that E6-reference system (Fig. 5C) present different mobility behavior compared to the variants systems. The variants E-A176/G350 and AAa were expanded in the conformational space due to their flexibility ( Figure 6F and G). The variants E-G350, E-C188/G350 and AAc have more restricted motions, making them the more stable of the six protein systems (Figure 5 D, E and H). This points out that the punctual mutations of residues affect conformation and motion.
With respect to MAGI-1 255 and MAGI-1 329 the covariant matrix for the first 10 eigenvectors were 31.69 and 35.97 nm2 and the dPCA analysis showed that the first 10 eigenvectors captured 44.8 and 44.6 % of the proteins total motions ( Figure 6A). The projections of the first two principal components (PC1 vs PC2) contributed to 29.5 and 39.2% of the collective motions ( Figure 6B). These results showed a change in the motion of the atoms of MAGI-1 255 and MAGI-1 329 are considerable, which means that missing 76 amino acids of model 255 restricts its motions making it more stable. These results support Ramírez et al., 2015 observations about the contribution of the highly disordered 76 amino acid region adjacent to the PDZ-1 domain of MAGI-1 to its behavior.

Protein docking
Mutations in proteins can affect protein structure and stability, consequently, these mutations alter the kinetics and thermodynamics of protein-protein interactions (PPI) [32]. Using ClusPro blind base docking method, a representative protein structure of the most populated cluster obtained from the dPCA clustering analysis of the MD simulation refined proteins (E6-reference, E-G350, E-A176/G350, E-C188/G350, AAa and AAc) were docked against the representative structure of MAGI-1 329 and MAGI-1 255, also, obtained from dPCA clustering analysis. Docking resulted in 1000 protein conformations of complexes. The top 10 docked complexes from each problem were analyzed for the lowest energy and residues binding between the two proteins. The best complexes were selected base on a greater number of cluster members and the lowest energy according to ClusPro guidelines.
The global free binding energy of the E6-reference against MAGI-1 329 and 255 complexes were calculated as -51.90 and -48.14 kcal/mol respectably, using FiberDock [33]. These energies were bigger than the energies obtained from the complexes between the E6 variants and the MAGI-1 models; this means that there is a greater affinity between MAGI-1 models and E6 variants (Table 1). According to our docking results the variants AA, E-A176/G350 and E-G350 showed smaller energy values -148.17, -148.86 and -148.86, we interpreted this as a gain of interaction affinity between these proteins. Moreover, the lower energy docking values were observed with the MAGI-1 329 model (Table 1). In addition, there was an increment in the number of hydrogen bond and salt bridges interactions of E6 and its variants with MAGI-1 329 compared to our MAGI-1 255 model (Table S1). The protein-protein interfaces of the complexes were analyze using PDBsum generate [49]. The top docked complex of each variant against MAGI-1 255 and 329 were subjected to PDBsum to identify the interacting residues. Comparative analysis between the twelve complexes from docking interfaces of E6-reference and its variants identified a list of different amino acids that were shown to be responsible for the interaction with the MAGI-329 and MAGI-1 255.
For the complex E6-reference and MAGI-1 255 ( Figure 7A), occurs mainly through amino acids of the WW2 domain (Y74, V79, D80, W66, G65, A64, Y78, I76 and from the PDZ1 domain A234, H231 and G230 with twenty-four amino acids of E6-reference that included: Y81, R77 and Y76 which are adjacent to H78 a highly mutated amino acid in E6 (Table S1). There are nine hydrogen bonds, 3 salt bridges (Table S1). On the other hand, the interaction of E6-reference with MAGI-1 329 was through amino acids corresponding to WW2 domain and adjacent non-domain regions. Amino acids C34 to F171 were responsible for most of the interactions with twenty-three amino acids of E6-reference that included: R147, R131, R77, Y76, V31, T32 and H78 from E6-reference ( Figure 7G). Some of the E6-reference interacting residues are located closed to the mutation sites, and its interacting residues are different compared to the variants. It is important to point out that H78 is mutated to T78 in the AAa and AAc variants, and the interaction of this residue with the models of MAGI-1 is lost for variant AAa. Also, we observed that R147 is the only residue close to the PBM motif of E6 interacting with our MAGI-1 329 model (Table S2). The differences between the interaction of E6-reference with our two models from MAGI-1 is remarkable, the increase in protein´s coverage results in a gain of interacting residues, an increase of hydrogen bonds, Salt bridges (Table S1). Regarding the missing interaction of the PBM motif with MAGI-1 255 and 329, we believe that the residues could be oriented in a way that avoids the interaction or may be block by adjacent residues. According to the online server HOPE, this could be attributed to changes in size and in charge from the mutations of each variant [50].
Its important to point out that the interacting residues from complex E-C188/G350/MAGI-1 329 included all the amino acids of the PBM motive of E6 (E148, T149, Q150, and 151V/L) interacting mainly with amino acids from the WW1 (G1-R33) domain and amino acids from a highly disordered region of MAGI-1 (G256-T329 in our model). The complex AAa/MAGI-1 329 shows a similar interacting pattern where amino acids from the PBM motif interact mainly with the highly disordered region of MAGI-1 (G256-T329). Different interacting patterns were observed from the rest of the variants and model MAGI-1 255 (Table S2).
It is important to understand the changes in PPIs caused by these mutations may alters the affinity and stability of the interaction of E6 with proteins important for tissue homeostasis. The increase in affinity and stability of the interaction of E6 with MAGI-1 result in an increase in the degradation of MAGI-1 and as a consequence in the loss of stability of important cell complexes that maintain cell-cell adherence at the adherents junctions.

3D protein structures
Multiple alignments of the sequences were performed using CLUSTAL X 1.81. The secondary structure of the E6 protein and its variants were predicted using PSIPRED server. The crystal structure of VPH-16 E6 protein was obtained from the Protein Data Bank (RCSB PDB), with the identification number: 4XR8, chain H. The E6 structure on this PDB contains 151 residues with four-point mutations in S80C, S97C, S111C, and S140, which were reverted to obtain the E6 reference in the PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC. After that, all the mutations were carried out in the E6 reference to obtained all the variants of HPV-16. The mutations were done as indicated next, to obtained E-G350: L83V; E-C188/G350: E29Q and L83V; E-A176/G350: D25N and L83V; AAa: Q14H, H78Y, and L83V; for AAc: Q14H, I27R, H78Y, and L83V. The obtained proteins were structurally aligned and visualized using VMD.
To obtained the 3D structure of MAGI-1, a total of 255 and 329 amino acids from the amino acid terminal region of the protein sequence (300-554 and 300-628) were retrieved from the UniProtKB database, (accession number Q96QZ7) and submitted to I-TASSER server as two separate jobs. First, the 3D structure with 225 residues, which comprises WW1, WW2 and the PDZ1 domains of MAGI-1 was obtained by homology modelling using the I-TASSER server, as a template, we selected PDB file: 2KPK, which corresponds to the PDZ-1 domain of the MAGI-1. Furthermore, a 3D structure of 329 amino acids of MAGI-1, which includes the WW1, WW2, PDZ-1 and a 76 amino acidic disordered region of this protein, was obtained by homology modelling on the I-TASSER server using the same template PDB. All the 3D predicted structures were evaluated using the Rampage web server to obtain the Ramachandran plots (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php).

Molecular dynamics simulation
Parameters for the two Zn2+ ions and eight cysteine-ligand coordination of E6 were kindly provided by Justin Lemkul from the Virginia Polytechnic Institute and State University, these parameters included a CYSD patch for the deprotonation of the eight zinc-bound cysteines and a ZN_C patch to covalently link cysteines to the zinc ions. The correct coordination of the deprotonated cysteines and the ion zinc using these patches has been demonstrated in previous studies [34], the CHARMM 36 force field was employed for the application of the patches using CHARMM software [35,36]. Afterwards, a 200 ns of MD simultion of E6 and its variants were performed using the 2.8 NAMD software package [37] with CHARMM36 and CHARMM22 force fields [36]. For MAGI-1 models we used CHARMM 27 topology and parameter files for proteins. Each system was placed in a cubic box of TIP3P water with a minimum distance of 10 Å between the solute atoms and the edge of the box [38]. To neutralize the systems, we added 7568 water molecules, 21 Na+ and 27 Cl− to the E6-reference. To variant E-G350, we added 7420 water molecules, 21 Na+, and 27 Cl−, to E-C188/G350 variant, 7484 water molecules, 21 Na+, and 28 Cl− were added, to E-A176/G350 variant, 7660 water molecules, 22 Na+ and 29 Cl− were added, to AAa variant 7566 water molecules, 21 Na+ and 27 Cl− were added and to AAc variant, 7554 water molecules, 21 Na+ and 28 Cl− were added. For MAGI-1 255 we added 10932 water molecules and 18 Na+, and for MAGI-1 329, 11456 water molecules and 23 Na+ were added. Each system was neutralized to 0.15 mol/L of NaCl and submitted to minimization energy for 10,000 steps of steepest descent minimization followed by equilibration for 1 ns under constant temperature 310 K and pressure 1 atm (NPT) ensemble with protein atoms restraints [39,40]. MD simulation were run for 200 ns, considering all proteins as soluble.

Trajectory and dPCA analysis
The carma software [44], was used to calculate the root mean square deviation (RMSD) calculates the average deviation in the atomic stability throughout MD simulation, radius of gyration (Rg) measures the compactness and expansion of the molecules, and the root means square fluctuation (RMSF) a parameter to explored the flexibility of the protein through MD simulation, as well as the Principal component analysis (PCA) and dPCA based clustering analysis employing the last 50 ns of the trajectory. dPCA is a standard tool in statistical mechanics used in order to determine the correlated motions of the residues to a set of linearly uncorrelated variables called principal components, and it allows to obtain the large scale collective motions of the atoms on the simulations, which frequently correlates with the proteins biological function and structural properties [41]. Finally, we obtained the PDB files from the most populated cluster analysis and performed a protein-protein docking. Molecular graphics were performed in Sigma plot 12.0. VMD was used to visualize all the 3D proteins [42].

Protein-protein docking
The protein-protein dockings were carried in ClusPro server [43,44], the program has been consistently rated among the best global docking methodologies in the CAPRI challenge (Critical Assessment of Predicted Interactions) [43]. For the docking studies, refined models for most populated cluster from E6-reference or its variants were docked within the MAGI-1 (235 and 329) homology models, where MAGI-1 models were the receptors and E6, and its variants were used as a ligand. The conformers with the highest cluster members and the lowest energy calculated in FireDock were taken for analysis on the PDBSum server [33,45]. All docking complexes were visualized by VMD software [42].

Conclusions
We proposed an in-silico approach to evidence the differences in the interaction of E6 and five of its natural variants with two models, cellular protein MAGI-1. According to our results variants, AAa and E-C188/G350 showed greater levels of instability, less compactness, a gain of fluctuation regions that are correlated to the increment of active sites. We attribute this behaviour to specific mutations of proteins, and these mutations cause physicochemical changes that affect the behaviour of proteins. Very marked dynamic changes are observed, particularly at the amino and carboxyl termini of proteins, where there is a gain in flexibility in the variants compared to E6-reference. Also, according to the dPCA results a dramatic change of motions behaviour for mutants compared to E6-reference. These differences in structure and mobility incremented the affinity of variants E-C188/G350 and AAa for our models of MAGI-1. E-C188/G350 increases its affinity for our models by three times, increasing the binding bonds by 50%. A similar pattern is observed among all the variants compared to E6-reference. Our results suggest that the physicochemical changes that gave rise to thermodynamic changes of the variants and an increase the affinity for our MAGI-1 models. Here, we were able to represent the possible changes in the physicochemical properties of E6 proteins and the repercussion in the interaction affinity with MAGI-1. An experimental validation will be necessary to evaluate the degradation profile of the MAGI-1 protein mediated by E6-reference and its variants.