Virtual Screening based prediction of potential drugs for COVID-19

SARS-CoV-2 is a betacoronavirus that was first identified during the Wuhan COVID-19 epidemic in 2019. It was listed as a potential global health threat by WHO due to high mortality, high basic reproduction number and lack of clinically approved drugs and vaccines for COVID-19. The genomic sequence of the virus responsible for COVID-19, as well as the experimentally determined three dimensional structure of the Main protease (Mpro) are available. The reported structure of the target Mpro was utilized in this study to identify potential drugs for COVID-19 using molecular docking based virtual screening. The results of this study confirm earlier preliminary reports based on studies of homologs that some of the drugs approved for treatment of other viral infections also have the potential for treatment of COVID-19. Anti-viral drugs, approved for human therapy, that target Mpro of COVID-19 were ranked for potential effectiveness against COVID-19 and novel candidates for drug repurposing were identified. In addition, the molecular docking method, utilized in this study, identified potential beneficial off target effects of some drugs that are currently in clinical trials.


Introduction
COVID-19 is a highly infectious disease associated with high mortality Zhao et al., 2020), and there are no approved drugs or vaccines for this disease (Li & De Clercq, 2020). World Health Organization has declared COVID-19 as a public health emergency of international concern (Lai et al., 2020). SARS-CoV-2, the virus responsible for COVID-19, is a betacoronavirus . The previous name for this virus was 2019-nCoV (ICTV (International Committee on Taxonomy of Viruses, 2020). The genome of SARS-CoV-2 has been sequenced Chan et al., 2020;Zhu et al., 2020). The genomic sequence of SARS-CoV-2 has 96% identity to the bat-coronavirus and 79.6% sequence identity to SARS-CoV (Zhou et al., 2020). Although there are no approved drugs or vaccines for COVID-19, a number of clinical trials are in progress (Lu, 2020). Lopinavir and Ritonavir, combined with Chinese herbal medicines, were used in preliminary clinical studies .
Computational methods can be utilized for design and engineering of drugs (Marshall, 1987;Leelananda & Lindert, 2016;Kuntz, 1992) ; (Macalino et al., 2015;Talluri, 2019). The low time requirements of computational methods are conducive for high throughput screening of available drugs to identify potential drugs for novel diseases as well as to predict the adverse effects of novel drugs (Jones et al., 1997;Friesner et al., 2004;Irwin & Shoichet, 2016;Mohanasundaram & Sekhar, 2018). Development of novel drugs is a time consuming process and generally several years of work are required for clinical approval (Stromgaard et al., 2017). Drug repositioning, also known as repurposing, is an effective strategy to combat novel diseases caused by infectious agents that spread rapidly (Ashburn & Thor, 2004;Li et al., 2016;Chopra & Samudrala, 2016). Drugs that have been approved for some disease, are safe for human use , and only their effectiveness against the disease of interest needs to be established (Hurle et al., 2013). In life threatening cases, where there is no alternative medicine or vaccine, such a drug repurposing strategy is particularly attractive. However, clinical trials are necessary to ensure that such treatment is better than a placebo (Ding, 2016;Novac, 2013).
An in silico screening of herbal medicines for treatment of COVID-19 has also been reported . Although several clinical trials are in progress to assess the potential of putative therapeutic agents, very limited data is available publicly regarding the in vitro and in vivo activities of the drugs that are currently in clinical trials for treatment of COVID-19 (Lu, 2020). It has been reported that Chloroquine phosphate shows anti COVID-19 activity (Gao et al., 2020a). Several clinical trials are assessing the potential of protease inhibitors such as Lopinavir and Ritonavir that have been approved for treatment of other viral infections. Lopinavir and Ritonavir were identified in earlier studies to target the Mpro of SARS virus. The protein sequences of COVID-19 Main protease (2019-nCoV Mpro) and SARS-CoV Mpro are 96% identical. In several early studies, the similarities in the sequence of a potential target for COVID-19 to that of the SARS Mpro were utilized to build a model for the structure of SARS-CoV-2 Mpro. Homology based models were utilized to screen a library of compounds to predict potential drugs for COVID-19 (Nguyen et al., 2020;Xu et al., 2020;Liu & Wang, 2020) . The sequence similarity of the SARS-CoV-2 Mpro to the SARS Mpro is sufficiently high to build a good model for the structure of SARS-CoV-2 Mpro. However, the predictions of virtual screening studies and binding energy calculations are generally more accurate if a high resolution experimental structure of the target is available. The recent availability of the the high resolution experimental structure of the target (Jin et al., 2020), the main protease of SARS-CoV-2, was utilized in the current study as the target for molecular docking based virtual screening. The predictions of this study will provide information that can be utilized for choice of candidate drugs for in vitro, in vivo and clinical trials.

Target preparation:
The structure of the target protein SARS-CoV-2 Mpro, 6lu7.pdb (Liu et al., 2020;(Jin et al., 2020), was obtained from RCSB Protein Data Bank (Burley et al., 2017). Hydrogens for pH 7.0 and gasteiger charges were added to the protein and a pdbqt format file was generated by using Open Babel (obabel;O'Boyle et al., 2011) . Autodock Vina requires receptor structure in pdbqt format. SMINA can utilize input receptor files in both pdb and pdbqt formats. Avogadro (Hanwell et al., 2012), Pymol (DeLano, 2002 and Rasmol (Sayle & Milner-White, 1995) , were used for visualization. Active site residues were identified by their proximity to the ligand.

Ligand preparation:
The list and structures of approved drugs were obtained from SuperDRUG2 database. SuperDRUG2 maintains a database of drugs approved for clinical use (Siramshetty et al., 2018;Goede et al., 2005) . The version of the database downloaded on 18 February, 2020 contained 3639 approved drugs. For each molecule in the 3D conformer list obtained from SuperDrug2, the first conformer was utilized in this study. Gasteiger charges were added, and structural optimization was carried out with MMFF94 forcefield using steepest descents followed by conjugate gradient minimization. Subsequently, the structures were converted into pdbqt format, for use in docking calculations with Vina. These structures were further converted to mol2 format for calculations with SMINA. Avogadro and Pymol were utilized for visualization of the ligands.
Data related to Clinical Trials was downloaded from clinicaltrials.gov on 6 th March, 2020. Drug like small molecules in interventional category were selected. Some of these investigational molecules have been approved for other indications, whereas others are not. The investigational molecules that were approved for other indications were listed in SuperDRUG2, whereas others were downloaded from Pubchem (Hähnke et al., 2018) or Drugbank (Wishart et al., 2018). These molecules were in 2D SDF format. Low energy conformations were generated and then hydrogens and gasteiger charges were added using obabel. Obabel was used for geometry optimization was carried out with MMFF94 forcefield. However, computational studies were not undertaken for Carrimycin, currently in Phase 4 of clinical trials, due to lack of structural data.
Docking: Virtual screening workflow was implemented using obabel, Autodock Vina (Trott & Olson, 2010), SMINA (Koes et al., 2013) and customized Python and shell scripts. Pymol and Rasmol were utilized for visualization of the docked results. Autodock Vina was used for docking calculations that did not involve flexible active sites residues. The exhaustiveness parameter that controls the extent of the search was chosen as 8, and 9 modes were generated for each ligand.
SMINA, a derivative of Vina, was used for docking calculations that involved flexible active site residues. The flex_hydrogens option was used in SMINA to enable comparability of Vina and SMINA results. The same set of parameters were utilized for Vina and SMINA, with the following exceptions. The exhaustiveness parameter was set to 16 for docking calculations involving flexible residues in the active site. Residues 41 and 145 of SARS-CoV-2 Mpro were designated as flexible for calculations reported in Table 1,Table 2 and Table 3.
Ubuntu 14.04 operating system (64-bit) installed on a home built computer, equipped with Pentium Gold G5400 3.7GHz processor and 16GB memory, was utilized for all computational work.

Results
The inhibitor N3 was bound to Mpro in the structure of the complex (6lu7.pdb) determined from Xray crystallographic data. The binding energy and binding modes of N3 were predicted with Autodock Vina and SMINA. There were minor differences in the binding energies and binding modes determined by the two methods, due to small differences in conformational search algorithms and inherent randomness of conformational search. The second lowest energy binding mode predicted by Vina and the lowest energy binding mode predicted by SMINA were found to match the experimental binding mode. The energies of the first two binding modes reported by Vina were identical, but the first one was significantly different from the native binding mode, while the second had an excellent match with the native binding mode. The superposition of the predicted binding modes and the experimental result are displayed in Figure 1. These two methods may be deemed to yield similar results for the purpose of this study.
The list of approved drugs listed in SuperDrug2 database was used to identify drugs that have the potential to bind to Mpro of SARS-CoV-2 by using Vina. For most drugs the difference between the binding energy calculated with Vina and with flexible docking with SMINA were less than 0.5 kcal/mol. The drug with the best binding in the database was Beclabuvir, with a predicted binding energy of -10.4 kcal/mol (Vina) and a predicted binding energy of -10.0 kcal/mol for flexible docking with SMINA. Although the binding mode of Beclabuvir is similar to that of N3 in the experimentally determined structure of the complex of SARS-CoV-2 Mpro and N3, there are some significant differences in the sites of interaction (Figure 2).
Saquinavir, an antiviral protease inhibitor, was predicted to have a binding energy of -9.3 kcal/mol with Vina and -9.2 kcal/mol with SMINA. The predicted binding mode of Saquinavir displays a high degree of similarity to that of the ligand N3 in the experimentally determined structure of the complex of SARS-CoV-2 Mpro and N3 (Figure 3).
The predicted binding energies for selected antiviral drugs that have potential for therapy for COVID-19 are displayed in Table 1. The mean difference between the two methods for predicting binding energies is 0.33 kcal/mol and the maximum difference in energy is 0.9 kcal/mol, for the drugs in Table 1.
The list of highest ranked approved drugs, based on Vina scores, are displayed in Table 2. The corresponding binding energies for flexible docking, estimated by using SMINA are also shown in Table 2. The mean difference is 0.42 kcal/mol and the maximum difference in energy is 1.0 kcal/mol between the two methods for the binding energies displayed in Table 2. These data indicate that Nilotinib, Tadalafil, Lifitegrast, Digitoxin, Digoxin and Tirilazad are also potential drug candidates for COVID-19 therapy. Although, it is likely that some of these are false positives, in vivo results show that Digitoxin and Digoxin, the compounds having the best SMINA scores in this list, are effective inhibitors of MERS-CoV in Vero cells with a high therapeutic index .
Binding energies and binding constants for small molecule drugs in Clinical Trials for therapy of COVID-19 were calculated by using Vina and SMINA (Table 3). The difference between the estimated binding energies is less than 1 kcal/mol for binding energies estimated with a rigid receptor (Vina) and for those estimated with limited flexibility at the active site (SMINA). All protease inhibitors that are currently in Phase 4 or Phase 3 of clinical trials have predicted dissociation constants that are equal to or better than 22 M. Surprisingly, two compounds (Methyl prednisolone and Remdesivir), whose mode of action is not recognized as protease inhibition, also have dissociation constants in the M range. These predicted interactions, if confirmed, indicate that these compounds have multiple modes of action, and that the off target effects are beneficial for therapeutic applications. Methyprednisolone, currently in Phase 4 clinical trials for therapy of COVID-19, is a glucocorticoid that is approved for disorders that require inhibition of proinflammatory signals. Molecular Docking based binding energy (Table 3) indicates that Methylprednisolone may have an unexpected ability to bind and inhibit Mpro of COVID-19, enhancing its effectiveness. Utilization of Thalidomide, an immunomodulatory and antiinflammatory agent with Methylprednisolone has been used for successful treatment of one case of COVID-19 .

Discussion
The primary limitations of the docking based virtual screening strategy are the high false positive rates and low correlation coefficients between estimated binding energy and experimental measures of activity reported in some earlier studies (Leach et al., 2006;Wang et al., 2016). Even if the same scoring function and target receptor structure are used, differences in predicted binding constants can arise due to differences in charges assigned to the receptor, the method used for relaxation of the receptor, flexibility of the receptor, charge type added to the ligand, differences in use of united atoms, size and centering of the grid used to calculate the molecular potential, the exhaustiveness of the search as well as due to the randomness of the search procedure (Gaillard, 2018;Masters et al., 2020). Therefore, it is not surprising that there are differences in the predicted binding constants reported by different groups. Despite these limitations, the estimates of binding energy provide valuable information that can inform and guide further studies (Kontoyianni, 2017;Li et al., 2019), such as, Molecular Dynamics based estimates of free energies, in vitro studies and in vivo studies.
Lopinavir, Ritonavir and Nelfinavir have been reported in earlier studies as potential drug candidates that target Mpro of SARS-CoV-2. The results of this virtual high throughput screening study, based on the reported structure of Mpro obtained from X-ray crystallographic data, are consonant with the earlier predictions. The binding energies for Lopinavir and Ritonavir (-7.4 to -7.7 kcal/mol), predicted in this study, are consistent with preliminary clinical data indicating effectiveness for these drugs . Potentially more effective drugs, with lower binding energy, have been identified in this study. Beclabuvir, with a predicted binding energy of -10.4 kcal/mol (Vina), was identified as the drug with the best binding energy (Table 2). However, based on similarity of binding mode, Saquinavir with a predicted binding energy of -9.3 kcal/mol is the best candidate drug (Table 1 and Figure 3).
Earlier computational studies for prediction of drugs for COVID-19 utilized structural information regarding the target from comparative modeling (Elfiky, 2020;Lin et al., 2020b;Xu et al., 2020). Studies based on the modeled structure of SARS-CoV-2 Mpro, molecular docking and free energy calculations, predicted Nelfinavir to be potential drug for COVID-19 . Although results of the current study confirm the potential of Nelfinavir to bind and inhibit SARS-CoV-2 Mpro, it is ranked lower than several other clinically approved antiviral protease inhibitors (Table  1).
A Drug Target Interaction Deep Learning Model has been used to predict that Atazanavir and Efivirenz have better binding that Ritonavir (Beck et al., 2020). However, the binding constants estimated with AutoDock Vina, whose scoring function has been validated in multiple studies, predict that both Atazanavir (-7.2 kcal/mol) and Efavirenz (-6.4 kcal/mol) are weaker binders than Ritonavir (-7.5 kcal/mol).
Machine intelligence based Generative Network Complex has been used to design drugs based on the experimental structure of SARS-CoV-2 Mpro with best predicted binding energy of -10.56 kcal/mol (Gao et al., 2020b). The binding energies for the GNC based design were estimated by a method that differs from out method. However, the binding energies estimated by their method for Lopinavir and Ritonavir differ from our estimates by less than 1.0 kcal/mol. Gao et al., have also reported the use of high throughput virtual scanning for a limited set of compounds and have reported an estimated binding constant of -10.02 kcal/mol for CHEMBL222234 . The GNC based method is also a promising approach, however, designed molecules require a longer experimental testing phase before they can be used in clincal studies, compared to drugs predicted using the approach described in this study which uses clinically approved drugs.
Among the class of protease inhibitors that have been approved for human clinical indication and currently in clinical trials for treatment of COVID-19, Darunavir, an HIV protease inhibitor was predicted to have the best binding energy (-8.7 kcal/mol, Table 3). The Virtual Screening of all drugs approved for clinical use identified Saquinavir, also an HIV protease inhibitor, with predicted binding energy of -9.3 kcal/mol, indicating a potential for increased efficacy.
The results of Molecular Docking for compounds currently in clinical trials, predict that Danoprevir, currently in Phase 4 clinical trials for treatment of COVID-19, has the best binding energy for inhibition of Mpro of SARS-CoV-2 (-9.3 kcal/mol, Table 3). Danoprevir is an inhibitor of NS3 protease used for treatment of Hepatitis C. Danoprevir (included in Ganovo) has been approved by China's FDA for treatment of Hepatitis C (Markham & Keam, 2018). The screening of all drugs in the approved category list in SuperDRUG2 database identified Beclabuvir, an inhibitor of Hepatitis C NS5B protease, as the drug having the best binding energy (-10 kcal/mol) for interaction with Mpro of SARS-CoV-2, indicating a potential for higher efficacy. Beclabuvir has been approved as part of a fixed dose combination in Japan for treatment of Hepatitis C virus genotype 1 (Garimella et al., 2018). However, at present, neither Beclabuvir nor Danoprevir are listed at the FDA server at fda.gov for list of approved drugs.

Conclusion
Saquinavir and Beclabuvir were identified as the best candidates for COVID-19 therapy based on virtual high throughput screening of clinically approved drugs and the structure of SARS-CoV-2 Mpro determined from X-ray diffraction data. The results of this study also rationalize the limited data regarding effectiveness of drugs for COVID-19 therapy, and provide information that can be utilized for choice of candidate drugs for in vitro studies and in vivo studies. The predicted binding and ranking of drugs will also be useful to interpret the results of ongoing clinical trials that are testing existing drugs for effectiveness against COVID-19. Tables   Table 1. Predicted binding energies of selected protease inhibitors approved as antiviral drugs (kcal/mol). SMINA calculations were carried out with flexible residues at the active site.  Table 2. Predicted binding energies of highest scoring approved drugs from SuperDrug2 database (kcal/mol). SMINA calculations were carried out with flexible residues at the active site. Table 3. Estimated binding energies (kcal/mol) and binding constants for small molecule drugs in clinical trials for therapy of COVID-19. SMINA calculations were carried out with flexible residues at the active site. Figure 1a : Superposition of experimentally determined binding mode of ligand N3 and binding mode predicted with Vina. Green is experimentally determined ligand and blue is predicted binding mode.

Figures
Figure 1b : Superposition of experimentally determined binding mode of ligand N3 and binding mode predicted with SMINA. Green is experimentally determined ligand and blue is predicted binding mode. Figure 2 : Superposition of experimentally determined binding mode of ligand N3 and binding mode of Beclabuvir predicted with SMINA. Green is experimentally determined ligand (N3) and blue is predicted binding mode of Beclabuvir.