Virtual screening of inhibitors against spike glycoprotein of SARS-CoV-2: a drug repurposing approach

The novel coronavirus (SARS-CoV-2) is a human pathogen recently emerged in China, causing a global pandemic of severe respiratory illness (COVID19). SARS-CoV-2 makes entry in to human cells through its spike (S) protein that binds to cell surface receptors. Widespread of SARS-CoV-2 has been attributed to high affinity of S protein to its receptor. A homology model of the receptor binding domain of SARS-CoV-2 S protein (RBD) was built. RBDreceptor docking and published molecular dynamics data were used to map the key RBD-receptor interaction hotspot (RBDhp) on the RBD. Primary virtual screening was carried out against RBDhp using more than 3300 compounds approved by U.S Food and Drug Administration (FDA) and other authorities for human use. Compounds that bind to hpRBD with a binding energy ≤ 6.5 kcal/mol were subjected to secondary screening using a recently published cryo EM (2.9 Å) structure of RBD. A cardiac glycoside (dgitoxin), two anthracyclines (zorubicin and aclarubicin), a tetracycline derivative (rolitetracycline), a cephalosporin (cefoperazone) and a food dye (E-155) were predicted to be most potent inhibitors of RBD – receptor interaction. An anti-asthmatic drug (zafirlukast) and several other drugs (itrazol, fazadinium, troglitazone, gliquidone, Idarubicin, Oxacillin) were found to be high affinity binders that may have a potential to inhibit RBD – receptor interaction.


Introduction
The outbreak of coronavirus disease (COVID- 19) which was first reported from Wuhan, China has been spreading around the world [1]. World Health organization has declared the SARS-CoV-2 as a Public Health Emergency of International Concern (PHEIC) on the 30 th of January 2020 by considering its rapid transmission from human to human and spread into many of the continents [2][3]. It has now been recognized by WHO as a global pandemic making it the first pandemic caused by a coronavirus [4]. Currently the number of confirmed infections around the world stands at 167 511while a total of 6606 lives have been claimed since its initial outbreak in December 2019 [4].
Currently there are more than 350 genome sequences of SARS-CoV-2 that have been shared on the online platform Global Initiative on Sharing All Influenza Data (GISAID). The 30,000 base-pair genome of SARS-CoV 2 shows that the virus is adequately divergent from SARS-CoV; a human-infecting beta-coronavirus that caused an epidemic in 2003 [5]. Further, based on the genome variations located in open reading frame 8 (for which the function is yet unknown), Tang and others have identified two viral linages which are named as L and S which are likely to be evolved in early stages of outbreak in china [6]. Like other viruses, SARS-CoV-2 evolves over time introducing random mutations. While the mutation rate for SARS-CoV-2 has not been calculated accurately it has been suggested to have moderate to high mutation rate when compared to other members of single-stranded RNA viruses [7]. SARS-CoV-2 binds to the human angiotensin converting enzyme 2 (ACE-2) receptor through densely glycosylated spike (S) protein as the initiation step of the entry mechanism to human cells [8][9]. The S protein of SARS-CoV-2 binds ACE-2 through its receptor binding domains (RBD) and the RBDup conformation of the S protein is a prerequisite for the formation of RBD-ACE-2 complex [9]. Divergence of SARS-CoV-2 RBD amino acid sequence from other corona viruses and RBD-ACE-2 binding experiments has given implications on relatively high affinity viral binding to host cell receptors [9].
High affinity RBD-ACE-2 binding has been suggested among other factors for wide spared of SARS-CoV-2. Although highly important drug target, unavailability of a high-resolution crystal structure and solvent accessible binding surface has made it a tedious target for virtual screening and drug design [10]. Cryo EM structure of SARS-CoV-2 S protein (pre fusion down conformation) has been published at 3.5 Å resolution [9]. Availability of several high resolution X-ray crystal structures of SARS CoV S protein (UP conformation) in complexed with ACE-2, structural rigidity and high degree of sequence similarity of RBDs warrants us to generate reasonably accurate homology models for drug screening. In the current study more than 3000 compounds approved by various regulatory authorities including U.S. Food and Drug Administration (FDA) were screened against an optimized homology model of the SARS-CoV-2 S protein RBD. After publishing our initial work in the first version of the preprint [11] a 2.9 Å cryo EM structure (PDB 6m17) of RBD-ACE-2 complex have been published [12]. High resolution of 6m17 and law temperate factor of the RBD-ACE-2 interface warranted carrying out a second round of screening using the new cryo EM structure.

Molecular modeling
Spike protein was selected for virtual screening instead of selecting ACE -2 receptor since compounds that block ACE -2 receptor is known to have modulatory effect on blood pressureand several other cardiovascular system related side effects [13]. Cryo EM map of SARS-CoV-2 S protein has been published in receptor unbound pre fusion conformation at a resolution of 3.5 A o [9]. The Cryo EM structure reflects the high degree of structural homology between SARS-CoV-2 RBD and SARS-CoV RBD having a RMSD value of 3.0 Å [9]. Low resolution and the conformational considerations of Cryo EM structure published by Wrapp et al, makes it an unreliable structure for virtual screening [9]. Therefore, a homology model was generated using high-resolution crystal structure of the SARS -CoV (PDB ID:2AJF) in ACE-2 bound up confirmation. Energy minimized homology model used for the screening showed a Cα RMSD of 1.47 Å betweenthe RBD domains of SARS-CoV-2 and the SARS-CoV.

Identification of Ligand Binding Site
Analysis of interaction surface of RBD-ACE2 in chimera showed strong H bonding at TYR 505 which is well exposed indicating an initial contact point with ACE-2. Further polar interactions were observed at THR 500, ASN 501, GLY 502, TYR 449, GLN 493 and GLN 498. This approach gives interactions in consistence with pervious protein -protein docking data [14] and data arising from MD experiments [15]. Further surface analysis of the structure reveled Arg 439 > ASN 439 mutation on one side of the structure that is protruding (SpkP) from the rest of the S protein ( Figure 1). This mutation seems to significantly weaken the hydrogen bond formed by residue number 439. Therefore, the key interactions were restricted to one side of SpkP. (Figure 1). For THR 500, ASN 501, GLN 498,observations were in consistence with the RBD-ACE-2 interactions observed in the most recent publication of cryo EM structure 6m17 [12]. A comparison of binding residues is given in  (RBD-HM, template-2AJF). b. Cryo EM structure of RBD at 2.9 Å resolution (6m17). Red color resides are the previously identified key binding residues [11,15]. Magenta color residues are the resides that form polar RBD-ACE-2 interactions observable in 6m17. Red dotted circle shows the open hydrophobic pocket of BRD-HM which is closed in 6m17.

Virtual Screening
Protein ligand interaction analysis was carried out for hits (binding energy ≤ -6.5 kcal/mol) resulting from secondary screening. According to our previous experience with protein-protein interaction inhibitors, a two hit hypothesis (targeting at least two key resides that form polar proteinsprotein interactions) was employed to select potent inhibitors of S protein and ACE-2. This approach when coupled with high energy binding, provides a better approach for inhibiting relatively strong protein-protein interactions that span through large surface area. Detailed list of interactions formed by the hits identified in the current study are given in table 1.
Digitoxin, (ZINC accession -8101077) a FDA approved cardiac glycoside formed hydrogen bonds with four residues critical in forming RBD-ACE-2 interaction (table 1, figure 3a). This lowest energy pose had a binding energy value of -7.6 kcal/mol indicating its potential as a RBD-ACE-2 interaction inhibitor. Digitoxin is used in the treatment and management of cardiac congestion, arrhythmias and heart failure (used in place of digoxin) [16]. Use of digitoxin is decreasing owing to the availability of better replacement drugs and some toxic properties (anorexia, nausea and vomiting) [16][17].
The first and second binding poses and fifth binding pose of zorubicin (ZINC accession -03831623) exhibited interesting interactions (table 1, figure 3 b-d). In particular, the pose one was observed to form hydrogen bonds with Gln 498 and Thr 500 that have been experimentally elucidated to form most important strong polar interactions with ACE-2 [12].
Thus the compound picks two most critical residues through hydrogen bonds while forming hydrophobic interactions with Gln493 and Tyr 505. Zorubicin is classified in the zinc library as a world drug (not approved by FDA) and it is approved in several European countries including France for the treatment of acute leukemia [18].
Aclarubicin (ZINC accession -08101054) is an anthracycline drug (world drug not approved by FDA) that is similar to zorubicin. It is used with cytarabine in Japan, China and India for the treatment of different acute myeloid leukemia conditions [19]. Hydrogen bonds were observed at Gln 498 and Try 453, Gln493 and Tyr 449 in the lowest energy binding pose of aclarubicin. Thus we provide strong evidence that the aclarubicin is a potent candidate for the inhibition of RBD -ACE-2 interaction. (Table 1). However anthracycline drugs such as zorubicin and aclarubicin are known to cause cardiac toxicity, inhibition of DNA replication and repair when used for a longer period of time [18,20] Rolitetracycline (ZINC accession -3831437) is a FDA approved broad-spectrum tetracycline antibiotic particularly used for parenteral administration in cases requiring elevated concentrations. At the therapeutic concentrations rolitetracycline does not cause toxic effects. [21]. The first and the second lower energy binding poses of rolitetracycline interacted with at least two key resides forming hydrogen bonds (table 1, figure 3 e). In both poses, strong hydrogen bonds were observed with Tyr 453. Therefor rolitetracycline is a potent candidate for the inhibition RBD -ACE-2 interaction. E-115 (Brown HT) is a common food dye (bis-azo dye) authorized as a food additive in the European Union (EU) [22]. Lowest energy poses showed hydrogen bonding with key residues Gln 498 and Asn 501 while the second lowest energy binding pose showed hydrogen bonding with key residues Thr 500 and Asn 501. Thus the current strudy provide evidence for E 155 as a potent candidate for the inhibition RBD -ACE-2 interaction. (Table 1,  Interestingly a commonly used anti asthmatic drug (zafirlukast) was predicted to bind with high affinity (-7.4 kcal/mol). Zafirlukast forms hydrogen bond with only one key residue (applies to all poses) while in all the poses zafirlukast stand in the binding site giving steric hindrance to key residues (not shown). Therefore, considering the high affinity and its common use as an asthmatic drug, zafirlukast may be tested on COVID 19 patients without waiting for in vitro data.
Further, an azole antifungal (itrazol) a phenylimidazole muscle relaxant (fazadinium) a 1benzopyran antidiabetic drug (Troglitazone), an antidiabetic sulfonylurea drug (gliquidone) an anthracycline antineoplastic drug (Idarubicin) and a beta-lactam antibiotic (Oxacillin) were found to be high affinity binders that may have the potential to inhibit RBDreceptor interaction (table 1).  Swiss Model [24] was used to generate homology models. For virtual screening, autodock vina program [25] in the academic distribution of PyRx [26] was run in the local server computer (Intel Xeon CPU E5-2420 v2, 2.2 GHz,12 core processor, 32 GB RAM, Ubuntu OS 16.04 LTS). For protein structure manipulation, comparison and binding pose analysis, Chimera v 1.14 [15] was run on a personal computer (Intel® Core™ i7-3540M processor ,4 GB ram, windows 10). Protein Ligand Interaction Profiler (PLIP) web server [27] was used for detailed analysis of protein-ligand complexes.

Molecular modeling
Homology models of the RBD was constructed using the crystal structure (

Secondary Screening
After carrying out the primary screening, ligands having binding energy values less than -6.5 kcal/mol were used for secondary screening against 6m17. For the secondary screening the RBD of 6m17 chain E was retained and other polypeptide chains and water molecules were removed. Screening was carried out using the same method as described in section 2.2.
Binding poses having binding energy vale ≤ -6.5 kcal/mol were characterized using view dock function in chimera and PLIP web server to identify the profile of protein -ligand interactions. Compounds that fit in to the binding site and form polar interactions with at least two critical residues were considered as potent inhibitors of RBD-ACE-2 interaction.

Conclusion
In conclusion, a cardiac glycoside (dgitoxin), Four antibiotics; two anthracyclines (zorubicin and aclarubicin), a tetracycline derivative (rolitetracycline) and a cephalosporin (cefoperazone), a food dye (E-155) were predicted to be potent inhibitors of RBDreceptor interaction based on the selection criteria used. A commonly used anti-asthmatic drug (zafirlukast) and several other drugs (itrazol, fazadinium, troglitazone, gliquidone, idarubicin, oxacillin) were found to be high affinity binders that may have potential to inhibit RBDreceptor interaction. Results of present study suggest the potential of these compounds as prophylactic medication or use in preventive countermeasures.

Compliance and ethics
The authors declare that they have no conflict of interest.

Acknowledgement
Mr. Kanchana Senanayake. Assistant network manager of the Institute of Biochemistry, Molecular Biology and Biotechnology for providing computing facility