Hepatitis C Virus E2—Host Cell Receptor, HSPA5, Binding Site Prediction

Hepatitis C Virus (HCV) is the main causative factor for liver cirrhosis and the development of liver cancer, with approximately 180 million infections worldwide. E2 is an HCV structural protein responsible for virus entry into the host cell. Heat Shock Protein A5 (HSPA5), also termed BiP and GRP78, is the master regulator of the unfolded protein response and localizes mainly in the lumen of the Endoplasmic Reticulum (ER) under normal conditions. Under the stress of HCV infection or carcinogenesis, HSPA5 is upregulated. Consequently, HSPA5 escapes ER retention and translocates to the cytoplasm and plasma membrane. Pep42 is a cyclic peptide that was reported to specifically target cell-surface HSPA5 in vivo. Owing to the high sequence and structural conservation between the C554-C566 region of HCV E2 and Pep42, we propose that the HCV E2 C554-C566 region could be the recognition site. The motivation of this work is to predict the possible binding mode between HCV E2 and HSPA5 by using molecular docking to test the proposed binding. Docking results suggest strong binding of the HCV E2 C554-C566 region to the HSPA5 substrate-binding domain β (SBDβ). Moreover, the full-length HCV E2 also exhibits strong predicted binding to HSPA5 SBDβ. Defining the binding mode between HCV E2 and HSPA5 is significant because it may allow this binding to be disrupted, thereby reducing viral infection.


Introduction
Hepatitis C Virus (HCV) infection is a chronic disease that has spread widely over the last three decades, with almost 180 million infections globally (A. A. Elfiky, A., & M., 2016; Suzuki, Ishii, Aizaki, & Wakita, 2007; P. L. Yang, Gao, Lin, Liu, & Villareal, 2011). Chronic HCV is the leading cause of liver cirrhosis and hepatocellular carcinoma (HCC) (A. A. Elfiky et al., 2016; Ganesan & Barakat, 2017; Gonzalez-Grande, Jimenez-Perez, Gonzalez Arjona, & Mostazo Torres, 2016; P. L. Yang et al., 2011). HCV is transmitted through direct blood-to-blood contact, for example, during uncontrolled blood transfusion, use of unsterilized dental equipment, drug injection, and from infected mother to infant at delivery (Gonzalez-Grande et al., 2016; Powdrill, Bernatchez, & Gotte, 2010). Recent developments in Direct Acting Antivirals (DAA) have reduced the harmful side effects that accompanied immunomodulation therapy (interferon) (Abdo A Elfiky, 2019; Gonzalez-Grande et al., 2016; P. L. Yang et al., 2011). Viral infection or carcinogenesis induces various cellular stress responses, e.g., autophagy and the Unfolded Protein Response (UPR), which help the cell adapt to stress either by restoring homeostasis or by promoting apoptosis. The chaperoning system, or Heat Shock Proteins (HSP), is upregulated to alleviate the stress of accumulating misfolded proteins (Ibrahim, Abdelmalek, & Elfiky, 2019). HSPA5, or Glucose-Regulated Protein 78 (GRP78), is a member of the HSP70 family that is considered the master regulator of Endoplasmic Reticulum homeostasis (Gething & Sambrook, 1992; Ibrahim et al., 2019; Lee, 2005; Li & Lee, 2006; Quinones, Ridder, & Pizzo, 2008; Rao et al., 2002). Upon cell stress, HSPA5 dissociates from, and thereby activates, the three UPR sensors, Activating Transcription Factor 6 (ATF6), Protein kinase RNA-like Endoplasmic Reticulum Kinase (PERK), and Inositol-requiring Enzyme 1 (IRE1), to trigger downstream signaling that aids in the cell fate decision (Ibrahim et al., 2019; Shen, Chen, Hendershot, & Prywes, 2002).
Accordingly, HSPA5 is able to escape ER retention, translocate to the plasma membrane, and associate with other cellular proteins at the cell membrane (cell-surface HSPA5) (Bailly & Waring, 2019; Ibrahim et al., 2019; Wu et al., 2014). Cell-surface HSPA5 acts as a signaling receptor for multiple pathogens by binding to viral envelope or spike proteins or fungal coat proteins (Booth et al., 2015; A. A. Elfiky, 2020a, 2020b; Gebremariam et al., 2014; Ibrahim et al., 2019; Ibrahim, Abdelmalek, Elshahat, & Elfiky, 2020). The E2 protein of HCV is the main protein responsible for host cell recognition and entry, and it was found to be complexed with chaperones. Meanwhile, HSPA5 is reported to be overexpressed in liver cells upon HCV infection, and its knockdown reduces the viral load in vivo, suggesting crosstalk and interaction between the HCV E2 protein and cell-surface HSPA5 (Liberman et al., 1999). In addition, a cyclic peptide of 13 amino acids, CTVALPGGYVRVC (Pep42), can deliver doxorubicin to cancer cells (Ibrahim et al., 2019; Martin et al., 2010) by targeting cell-surface HSPA5 in vivo (Kim et al., 2006). Thus, one can predict the binding site between cell-surface HSPA5 and the viral E2 protein based on the sequence, and hence the structural, similarity between Pep42 and the HCV E2 protein.
Indeed, identifying the binding site and the interaction mode of HCV E2 with cell-surface HSPA5 is critical for efforts to curb HCV propagation. Protein/peptide (HSPA5/HCV E2 C554-C566) and protein/protein (HSPA5/full-length HCV E2) docking are employed in this study to explore this binding. The protein/protein docking software, HADDOCK, uses solvation and Molecular Dynamics Simulation (MDS) to refine the interacting residues (binding site) after docking, which improves the reliability of the predicted interactions (van Dijk & Bonvin, 2006).

Materials and methods
The solved structure of HCV E2 (PDB ID: 4WEB; released in 2014 at 2.4 Å resolution) and its sequence file (FASTA) are downloaded from the Protein Data Bank (Berman et al., 2000). Water, ions, and other chains are removed from the PDB file, keeping protein chain E, using PyMOL software (DeLano, 2002). Additionally, the full-length, wild-type HSPA5 (2.99 Å resolution) in the open configuration (PDB ID: 5E84) is downloaded and prepared for the docking experiment (J. Yang, Nune, Zong, Zhou, & Liu, 2015; J. Yang et al., 2017). Multiple sequence alignment is performed using the Clustal Ω web server for the HCV E2 protein sequences available in the National Center for Biotechnology Information (NCBI) protein database of the National Institutes of Health (NIH) (NCBI, 2020). The Clustal Ω web server is also used to align the sequences of the HCV E2 protein and the peptide Pep42 (Sievers et al., 2011), while ESPript 3.0 is used to represent the alignments (Gouet, Courcelle, Stuart, & Metoz, 1999). The C554-C566 region of the HCV E2 protein, which aligns well with Pep42 (38.5% identity), is further analyzed with the ProtScale web server (ExPASy bioinformatics resource portal) (Garg et al., 2016; Gasteiger et al., 2005).
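The structure-cleaning step described above can be illustrated without PyMOL. The following is a minimal sketch in plain Python, not the procedure actually used in this study: it keeps only the ATOM records of one chain, relying on the fixed-column PDB layout in which the chain identifier sits in column 22. The function names `keep_chain_atoms` and `toy_record` are hypothetical, and the toy records are truncated after the residue number. Note that this simplification drops every HETATM record, so it would also remove any hetero groups other than water and ions.

```python
def toy_record(record, serial, atom, resname, chain, resseq):
    """Build a truncated PDB record (no coordinates) with correct column positions."""
    # Columns: 1-6 record name, 7-11 serial, 13-16 atom name, 18-20 residue name,
    # 22 chain identifier, 23-26 residue number.
    return f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}"


def keep_chain_atoms(pdb_text, chain_id):
    """Return only the ATOM records of `chain_id`; waters, ions and other chains are dropped."""
    kept = []
    for line in pdb_text.splitlines():
        # Index 21 (column 22) holds the chain identifier of ATOM/HETATM records.
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"


# Toy input: one protein atom of chain E, one water in chain E, one atom of another chain.
toy_pdb = "\n".join([
    toy_record("ATOM", 1, " CA ", "CYS", "E", 554),
    toy_record("HETATM", 2, " O  ", "HOH", "E", 701),
    toy_record("ATOM", 3, " CA ", "GLY", "H", 10),
])

print(keep_chain_atoms(toy_pdb, "E"))
```

With a real file, the same function could be applied to the text of the downloaded 4WEB entry with `chain_id="E"`, and to 5E84 with `chain_id="A"`.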
The Pep42 cyclic peptide model is built using the I-TASSER web server (Zhang, 2008). The protein/peptide docking software HpepDock (Zhou, Jin, Li, & Huang, 2018) is used to dock Pep42 and the HCV E2 (C554-C566) peptide into HSPA5. A rigid docking scheme is used without specifying a binding site on HSPA5, in order to scan the possible binding sites. Moreover, the HADDOCK web server (van Dijk & Bonvin, 2006) is used to test the binding of the full-length HCV E2 (PDB ID: 4WEB chain E) to the HSPA5 Substrate Binding Domain β (SBDβ) (PDB ID: 5E84 chain A). For HSPA5, the active residues involved in the interaction are I426, T428, V429, V432, T434, F451, S452, V457, and I459 (J. Yang et al., 2015). For HCV E2, the active residues are selected to be the region C554-C566. Furthermore, the residues surrounding the active residues are chosen as the passive residues in HADDOCK. The Protein-Ligand Interaction Profiler (PLIP) web server (Salentin, Schreiber, Haupt, Adasme, & Schroeder, 2015) is used to assess the interactions established after docking.
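The idea of choosing passive residues as those surrounding the active residues can be sketched as a simple distance filter. The snippet below is an illustration under stated assumptions, not the selection procedure of the HADDOCK server: the function name `select_passive`, the use of Cα coordinates, the 6.5 Å cutoff, and the toy coordinates are all assumptions made for this example, and no solvent-accessibility filter is applied.

```python
import math


def select_passive(ca_coords, active, cutoff=6.5):
    """Return sorted residue numbers that are not active but whose C-alpha atom
    lies within `cutoff` angstroms of the C-alpha atom of any active residue.

    ca_coords: dict mapping residue number -> (x, y, z) of its C-alpha atom.
    active:    iterable of active residue numbers.
    cutoff:    assumed distance threshold in angstroms (hypothetical default).
    """
    active = set(active)
    passive = set()
    for res, xyz in ca_coords.items():
        if res in active:
            continue
        for act in active:
            if act in ca_coords and math.dist(xyz, ca_coords[act]) <= cutoff:
                passive.add(res)
                break
    return sorted(passive)


# Toy coordinates (not taken from 5E84): three residues in a row and one far away.
toy_ca = {
    426: (0.0, 0.0, 0.0),
    427: (3.8, 0.0, 0.0),
    428: (7.6, 0.0, 0.0),
    440: (30.0, 0.0, 0.0),
}

print(select_passive(toy_ca, active=[426, 428]))
```

In practice, the active list for HSPA5 would be the nine SBDβ residues given above (426, 428, 429, 432, 434, 451, 452, 457, 459), and for HCV E2 the residues 554 to 566.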

Sequence and Structural Alignment
Pep42 is reported to target cell-surface HSPA5 selectively and is used to deliver doxorubicin to cancer cells. The sequence, and hence the structural, similarity of HCV E2 to Pep42 supports its potential to bind cell-surface HSPA5 through the same recognition site. Figure 1 shows the Multiple Sequence Alignment (MSA) between Pep42 and the 237 different sequences of HCV E2 found in the NCBI (Figure S1 shows the complete MSA of the 237 sequences). As shown in Figure 1, Pep42 aligns well (high sequence conservation) with the C554-C566 region (numbering scheme of PDB ID: 4WEB). The consensus sequences CTWMN and TGFTKTC show the preservation of C554 (forming a disulfide bond with C510), G561 (a flexible residue crucial for turn formation), and C566, which forms another disulfide bond with C496, while some of the aligned sequences stop at T565 (Figure S1). On the structural side, the C554-C566 region folds into a β-hairpin (PDB ID: 4WEB), like the Pep42 model, suggesting that it is the recognition site for HSPA5 SBDβ.
Figure 1: Multiple Sequence Alignment (MSA) between Pep42 and some HCV E2 sequences downloaded from the NCBI protein database. The full MSA is presented in the supplementary figure S1. Sequences are aligned using the Clustal Ω web server and represented by ESPript 3 software. Red highlights indicate identical residues, while residues written in red are conserved. Consensus motifs are depicted in blocks for clarity.
Figure 2A depicts the structural alignment between the HCV E2 C554-C566 region (green cartoon) and the Pep42 model built by I-TASSER (red cartoon). Identical amino acids are labeled (three-letter codes). As represented in the figure, the β-hairpin structure persists in both Pep42 and the HCV E2 C554-C566 region. The Root Mean Square Deviation (RMSD) of the superposition is 2.05 Å. The high structural conservation shown in the figure supports our suggestion of functional similarity between Pep42 and that part (C554-C566) of HCV E2.
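The 38.5% identity reported for the Pep42/E2 alignment can be reproduced with a few lines of code. The sketch below is illustrative only: the Pep42 sequence is the one given in the Introduction, while the E2 string is assembled solely from the residues this text names for positions 554-566 (C554, T555, W556, N558, S559, S560, G561, Y562, K564, T565, C566), with `X` as a placeholder for the two positions (557 and 563) that are not named. The names `percent_identity`, `identical_positions`, and `E2_554_566_MASKED` are hypothetical, and a gapless, equal-length alignment is assumed.

```python
PEP42 = "CTVALPGGYVRVC"
# Positions 554-566 of HCV E2; 'X' marks residues not named in the text (placeholder).
E2_554_566_MASKED = "CTWXNSSGYXKTC"


def identical_positions(seq_a, seq_b, mask="X"):
    """Return 1-based alignment positions where both sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a == b and a != mask]


def percent_identity(seq_a, seq_b, mask="X"):
    """Percentage of aligned positions with identical residues (masked positions never match)."""
    return 100.0 * len(identical_positions(seq_a, seq_b, mask)) / len(seq_a)


matches = identical_positions(PEP42, E2_554_566_MASKED)
print(matches)                                    # positions in Pep42 numbering
print([553 + p for p in matches])                 # same positions in E2 numbering
print(round(percent_identity(PEP42, E2_554_566_MASKED), 1))
```

Under these assumptions, five of the thirteen positions are identical (the two terminal cysteines, T555, G561, and Y562), which corresponds to the reported identity.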
The presence of a glycine residue at position 8 of Pep42 (G561 in HCV E2) is crucial for β-turn formation. Also, the presence of the terminal cysteine residues (C554 and C566) is required for a disulfide bond to be formed. Kyte & Doolittle hydropathy parameters (Kyte & Doolittle, 1982) of Pep42 and HCV E2 C554-C566 are shown in the bar graph of Figure 2B. The HCV E2 C554-C566 region is less hydrophobic than the Pep42 peptide, especially at the center of the peptide (the turn region).
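A per-residue Kyte & Doolittle profile of the kind plotted in Figure 2B can be sketched as follows. This is a minimal illustration rather than the ProtScale computation itself: it uses the standard Kyte & Doolittle hydropathy values, applies no sliding-window averaging (the window settings used on the server are not stated here), and the function names `hydropathy_profile` and `mean_hydropathy` are hypothetical. Only Pep42 is evaluated, because its full sequence is given in the text.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Return the per-residue hydropathy values (no window smoothing)."""
    return [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]


def mean_hydropathy(sequence):
    """Return the average hydropathy of the sequence (more positive = more hydrophobic)."""
    profile = hydropathy_profile(sequence)
    return sum(profile) / len(profile)


PEP42 = "CTVALPGGYVRVC"
for position, (aa, value) in enumerate(zip(PEP42, hydropathy_profile(PEP42)), start=1):
    print(position, aa, value)
print(round(mean_hydropathy(PEP42), 2))
```

Applying the same functions to the full 4WEB C554-C566 sequence would give the second series of the bar graph. Its polar residues (for example N558, S559, and S560) carry negative values on this scale, in line with the lower hydrophobicity noted above.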

Protein/peptide docking
PyMOL software is used to prepare the peptides (Pep42 and the HCV E2 C554-C566 region) and the proteins (HSPA5 and HCV E2). Missing H-atoms are added, and water molecules and ions are removed from the PDB files. A redocking experiment is performed as a control to test the performance of the docking software. The RMSD of the re-docked complexes shows good agreement with the experimental structures (1.97 Å and 1.51 Å for HpepDock and HADDOCK, respectively).
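The RMSD used to judge the redocking control can be written out explicitly. The sketch below is a generic illustration, not the exact protocol of this study: it assumes that the two coordinate sets list the same atoms in the same order and are already expressed in a common frame (for instance, after the receptors have been superposed), so no fitting is performed. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) points.

    No superposition is done; the inputs are assumed to share a reference frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example: the "re-docked" pose is the reference pose shifted by 2 angstroms along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
redocked = [(x, y, z + 2.0) for x, y, z in reference]

print(rmsd(reference, redocked))
```

For a real redocking test, the two lists would hold the ligand (peptide or partner protein) atom coordinates from the experimental complex and from the top-ranked docked pose.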
The Pep42 peptide and the HCV E2 C554-C566 region (PDB ID: 4WEB) are docked, using the HpepDock web server, into the substrate-binding domain β of HSPA5 (PDB ID: 5E84). The binding site of HSPA5 is defined as I426, T428, V429, V432, T434, F451, S452, V457, and I459. The entire peptide (Pep42 or the HCV E2 C554-C566 region) is defined as the binding site and treated as rigid, so that its structural fold is not lost.
The PLIP web server is used to analyze the interactions established upon docking. Four hydrogen bonds and four hydrophobic contacts are formed between HSPA5 and the Pep42 peptide, with C1, T2, V3, A4, L5, and Y9 as the interacting residues of Pep42. Seven hydrogen bonds and two hydrophobic contacts are formed in the case of the HCV E2 C554-C566 peptide, with T555, W556, N558, and S559 as the HCV E2 interacting residues. The mode of interaction of the HCV E2 peptide is thus less hydrophobic, but more H-bonds are established. Consistently, the HpepDock binding score for the HCV E2 C554-C566 peptide is more negative (-111.2), suggesting a better predicted binding affinity to HSPA5 than that of the Pep42 peptide (-72.4). Figure 3 shows the interactions established after docking of the HCV E2 C554-C566 peptide (red cartoon) into HSPA5 SBDβ (green cartoon). The interactions are maintained by H-bonding and hydrophobic contacts between HSPA5 (green sticks) and the peptide (red sticks).
Figure 3: Peptide-HSPA5 docking experiment using HpepDock. The HSPA5 solved structure (5E84), represented as a green cartoon, is docked with the HCV E2 C554-C566 peptide (red cartoon). Residues from HSPA5 SBDβ that are involved in the interaction with the peptide are labeled and represented as green sticks, while the interacting residues of the peptide are represented as red sticks.
For the docking of full-length HCV E2 to HSPA5 with HADDOCK, a total of 167 docked structures are grouped into five clusters according to their HADDOCK docking scores. The clusters differ in the number of docked structures they contain, from 6 (cluster 5) up to 119 (cluster 1). The HADDOCK score of the clusters varies from -107.5 ± 3.1 (the best-scored cluster, cluster 1) to -74.7 ± 11.6 (cluster 3), while the score for the HSPA5/Pep42 complex is -66.4 ± 8.0.
Using PLIP software, we checked the docking mode of the top-ranked cluster (cluster 1). Table 1 lists the interactions formed between HCV E2 and HSPA5 SBDβ in the best four docking structures of cluster 1. H-bonding and hydrophobic interactions are the two types of interaction established between HCV E2 and HSPA5. The most stable H-bonds are those formed between S560 and K564 of HCV E2 and E427, G430, and T458 of the HSPA5 SBDβ. For the hydrophobic interactions, Y562 of HCV E2 contacts the hydrophobic patch of HSPA5 SBDβ at residues I426, V429, F451, and V457. This is the same hydrophobic patch that recognizes unfolded proteins (Ibrahim et al., 2019; Martin et al., 2010). Figures 4A and 4B show the interactions established in a representative docking complex from cluster 1. The surface representation in Figure 4A shows how HSPA5 (green surface) recognizes HCV E2 (blue surface), while the C554-C566 region of E2 is shown as a red cartoon for clarity. The C554-C566 part of HCV E2 (red cartoon) is embedded between the SBDβ loops. A cartoon representation of the same complex, prepared with PyMOL using the same coloring scheme, is depicted in Figure 4B. The interacting amino acids are represented as red (HCV E2) and green (HSPA5 SBDβ) sticks and labeled with their three-letter codes. For HCV E2, the residues S560, Y562, and K564 (red labeled sticks) are the central interacting residues within the C554-C566 region. These residues form H-bonds and hydrophobic contacts with I426, E427, F451, and T458 (green labeled sticks) of the HSPA5 SBDβ. The present study thus predicts how binding can be established between HCV E2 and the cell-surface-exposed chaperone protein HSPA5. In addition, the N-terminus of HCV E2 lies near the binding site (yellow stick in the center of the enlarged panel of Figure 4B).
The N-terminus of a protein is a determinant of its in vivo half-life, and masking of the viral protein N-terminus by the chaperone HSPA5 may increase the half-life of the viral protein (i.e., reduce its degradation rate) (A. Varshavsky, 1997; Alexander Varshavsky, 2019). A molecular dynamics study is suggested as future work to quantify the dynamics at the binding site with and without binding inhibitors (Abdo A. Elfiky, 2020).

Conclusion
Chronic HCV infection is a widespread disease that can cause cirrhosis and hepatocellular carcinoma. The viral protein E2 is the central host-cell recognition protein that facilitates viral entry into the liver cell. At the same time, HSPA5 overexpression is reported to increase the viral load. The present approach suggests an HSPA5/HCV E2 binding site based on molecular docking and points to HSPA5 as an under-explored target against HCV infection. A molecular dynamics study is planned as future work to quantify the dynamics at the binding site with and without binding inhibitors. Further experimental work is required to validate the proposed binding site and to test more HSPA5 inhibitors against HCV infection.