Virtual High Throughput Screening based prediction of potential drugs for COVID-19

SARS-CoV-2 is a betacoronavirus that was first identified during the Wuhan COVID-19 epidemic in 2019. It was listed as a potential global health threat by WHO due to high mortality, high basic reproduction number and lack of clinically approved drugs and vaccines for COVID-19. The genomic sequence of the virus responsible for COVID-19, as well as the experimentally determined three dimensional structure of the Main protease (Mpro) are available. The reported structure of the target Mpro was utilized in this study to identify potential drugs for COVID-19 using virtual high throughput screening. The results of this study confirm earlier preliminary reports based on studies of homologs that some of the drugs approved for treatment of other viral infections also have the potential for treatment of COVID-19. Approved anti-viral drugs that target proteases were ranked for potential effectiveness against COVID-19 and novel candidates for drug repurposing were identified.


Introduction
COVID-19 is a highly infectious disease (Zhao et al., 2020) associated with high mortality (Hui et al., 2020;Ruy and Chun, 2020), and there are no approved drugs or vaccines for this disease Tian et al., 2020). World Health Organization has declared COVID-19 as a public health emergency of international concern (Eurosurveillance Editorial, 2020). SARS-CoV-2, the virus responsible for COVID-19, is a betacoronavirus . The previous name for this virus was 2019-nCoV. The genome of SARS-CoV-2 has been sequenced Chan et al., 2020). The genomic sequence of SARS-CoV-2 has 96% similarity to the bat-coronavirus and 76.5% identity to the SARS-CoV (Chen, 2020). Although there are no approved drugs or vaccines for COVID-19, a number of clinical trials are in progress . Lopinavir and Ritonavir, combined with Chinese herbal medicines, were used in preliminary clinical studies (Wang et al., 2020).
Computational methods can be utilized for design and engineering of drugs (Marshall, 1987;Kuntz, 1992;Macalino et al., 2015;Talluri, 2019). The low time requirements of computational methods are conducive for high throughput screening of available drugs to identify potential drugs for novel diseases as well as to predict the adverse effects of novel drugs (Jones et al., 1997;Freisner et al., 2004;Mohanasundaram and Talluri, 2018). Development of novel drugs is a time consuming process and generally several years of work are required for clinical approval (Stromgaard et al., 2017). Drug repositioning, also known as repurposing, is an effective strategy to combat novel diseases caused by infectious agents that spread rapidly (Ashburn and Thor, 2004;Li et al, 2016). Drugs that have been approved for some disease, are safe for human use, and only their effectiveness against the disease of interest needs to be established (Hurle et al., 2013). In life threatening cases, where there is no alternative medicine or vaccine, such a drug repurposing strategy is particularly attractive. However, clinical trials are necessary to ensure that such treatment is better than a placebo (Novac, 2013).
Lopinavir and Ritonavir were identified in earlier studies to target the Mpro of SARS virus. The protein sequences of COVID-19 Main protease (2019-nCoV Mpro) and SARS-CoV Mpro are 96% identical . In several early studies, the similarities in the sequence of a potential target for COVID-19 to that of the SARS Mpro were utilized to build a model for the structure of SARS-CoV-2 Mpro . Homology based models were utilized to screen a library of compounds to predict that Nelfinavir, an approved antiviral protease inhibitor, is a potential drug for COVID-19 (Xu et al., 2019). The sequence similarity of the SARS-CoV-2 Mpro to the SARS Mpro is sufficiently high to build a good model for the structure of SARS-CoV-2 Mpro (Li et al., 2016;. However, the predictions of virtual screening studies and binding energy calculations are generally more accurate if a high resolution experimental structure of the target is available. The recent availability of the the high resolution experimental structure of the target, the main protease of SARS-CoV-2, was utilized in this study as the target for virtual high throughput screening.
The predictions of this study have a higher probability to identify drugs that will be effective against COVID-19 and provide information that can be utilized for choice of candidate drugs for in vitro, in vivo and clinical trials.

Methods
Target preparation: The structure of the target protein SARS-CoV-2 Mpro, 6lu7.pdb , was obtained from RCSB Protein Data Bank (Burley et al., 2019). Hydrogens for pH 7.0 and gasteiger charges were added to the protein and a pdbqt format file was generated by using obabel (Babel, 2011). Autodock Vina requires receptor structure in pdbqt format. SMINA can utilize input receptor files in both pdb and pdbqt formats. Avogadro (Hanwell et al., 2012), Pymol (DeLano, 2002 and Rasmol (Sayre and Milner-White, 1995), were used for visualization. Active site residues were identified by their proximity to the ligand.
Ligand preparation: The list and structures of approved drugs were obtained from SuperDRUG2 database. SuperDRUG2 maintains a database of drugs approved for clinical use (Sriramshetty et al., 2018). The version of the database downloaded on 18 February, 2020 contained 3639 approved drugs. For each molecule in the 3D conformer list obtained from SuperDrug2, the first conformer was utilized in this study. Gasteiger charges were added, and structural optimization was carried out with MMFF94 forcefield using steepest descents followed by conjugate gradient minimization. Subsequently, the structures were converted into pdbqt format, for use in docking calculations with Vina. These structures were further converted to mol2 format for calculations with SMINA. Avogadro and Pymol were utilized for visualization of the ligands.
Docking: Virtual screening workflow was implemented using obabel, Autodock Vina (Trott and Olson, 2020), SMINA (Koes et al., 2013) and customized Python and shell scripts. Pymol and Rasmol were utilized for visualization of the docked results. Autodock Vina was used for docking calculations that did not involve flexible active sites residues. The exhaustiveness parameter that controls the extent of the search was chosen as 8, and 9 modes were generated for each ligand.
SMINA, a derivative of Vina, was used for docking calculations that involved flexible active site residues. The flex_hydrogens option was used in SMINA to enable comparability of Vina and SMINA results. The same set of parameters were utilized for Vina and SMINA, with the following exceptions. The exhaustiveness parameter was set to 16 for docking calculations involving flexible residues in the active site. Residues 41 and 145 of SARS-CoV-2 Mpro were designated as flexible for calculations reported in Table 1 and Table 2.
Ubuntu 14.04 operating system (64-bit) installed on a home built computer, equipped with Pentium Gold G5400 3.7GHz processor and 16GB memory, was utilized for all computational work.

Results
The inhibitor N3 was bound to Mpro in the structure of the complex (6lu7.pdb) determined from Xray crystallographic data. The binding energy and binding modes of N3 were predicted with Autodock Vina and SMINA. There were minor differences in the binding energies and binding modes determined by the two methods, due to small differences in conformational search algorithms and inherent randomness of conformational search. The second lowest energy binding mode predicted by Vina and the lowest energy binding mode predicted by SMINA were found to match the experimental binding mode. The energies of the first two binding modes reported by Vina were identical, but the first one was significantly different from the native binding mode, while the second had an excellent match with the native binding mode. The superposition of the predicted binding modes and the experimental result are displayed in Figure 1. These two methods may be deemed to yield similar results for the purpose of this study.
The list of approved drugs listed in SuperDrug2 database was used to identify drugs that have the potential to bind to Mpro of SARS-CoV-2 by using Vina. For most drugs the difference between the binding energy calculated with Vina and with flexible docking with SMINA were less than 0.5 kcal/mol. The drug with the best binding in the database was Beclabuvir, with a predicted binding energy of -10.4 kcal/mol (Vina) and a predicted binding energy of -10.0 kcal/mol for flexible docking with SMINA. Although the binding mode of Beclabuvir is similar to that of N3 in the experimentally determined structure of the complex of SARS-CoV-2 Mpro and N3, there are some significant differences in the sites of interaction ( Figure 2).
Saquinavir, an antiviral protease inhibitor, was predicted to have a binding energy of -9.3 kcal/mol with Vina and -9.2 kcal/mol with SMINA. The predicted binding mode of Saquinavir displays a high degree of similarity to that of the ligand N3 in the experimentally determined structure of the complex of SARS-CoV-2 Mpro and N3 (Figure 3).
The predicted binding energies for selected approved antiviral drugs that are under consideration for therapy for COVID-19 are displayed in Table 1. The mean difference between the two methods for predicting binding energies is 0.33 kcal/mol and the maximum difference in energy is 0.9 kcal/mol, for the drugs in Table 1.
The list of highest ranked approved drugs, based on Vina scores, are displayed in Table 2. The corresponding binding energies for flexible docking, estimated by using SMINA are also shown in Table 2. The mean difference is 0.42 kcal/mol and the maximum difference in energy is 1.0 kcal/mol between the two methods for the binding energies displayed in Table 2. These data indicate that Nilotinib, Tadalafil, Lifitegrast, Digitoxin, Digoxin and Tirilazad are also potential drug candidates for COVID-19 therapy.

Discussion
It has been reported that Chloroquine phosphate shows anti COVID-19 activity (Gao et al., 2020), however, the results of this study indicate that Chloroquine has a low binding energy (-5.9 kcal/mol), hence it mechanism of action is not likely to involve inhibition of the SARS-CoV-2 Mpro.
Lopinavir, Ritonavir and Nelfinavir have been reported in earlier studies as potential drug candidates that target Mpro of SARS-CoV-2. The results of this virtual high throughput screening study, based on the reported structure of Mpro obtained from X-ray crystallographic data, are consonant with the earlier predictions. The medium range binding constants for Lopinavir and Ritonavir (-7.4 to -7.7 kcal/mol), predicted in this study, are consistent with preliminary clinical data indicating limited effectiveness for these drugs (Wang et al., 2020). Potentially more effective drugs, with lower binding energy, have been identified in this study. Beclabuvir, with a predicted binding energy of -10.4 kcal/mol (Vina), was identified as the drug with the best binding energy (Table 2). However, based on similarity of binding mode, Saquinavir with a predicted binding energy of -9.3 kcal/mol is the best candidate drug (Table 1 and Figure 3).
Earlier computational studies for prediction of drugs for COVID-19 utilized comparative modeling Lin et al., 2020;Nguyen et al., 2020;Xu et al., 2020). Studies based on the modeled structure of SARS-CoV-2 Mpro, molecular docking and free energy calculations, predicted Nelfinavir to be potential drug for COVID-19 (Xu et al., 2020). Although results of the current study confirm the potential of Nelfinavir to bind and inhibit SARS-CoV-2 Mpro, it is ranked lower than several other clinically approved antiviral protease inhibitors (Table  1).
A Drug Target Interaction Deep Learning Model has been used to predict that Atazanavir and Efivirenz have better binding that Ritonavir (Beck et al., 2020). However, the binding constants estimated with AutoDock Vina, whose scoring function has been validated in multiple studies, predict that both Atazanavir (-7.2 kcal/mol) and Efavirenz (-6.4 kcal/mol) are weaker binders than Ritonavir (-7.5 kcal/mol).
Machine intelligence based Generative Network Complex has been used to design drugs based on the experimental structure of SARS-CoV-2 Mpro with best predicted binding energy of -10.56 kcal/mol (Gao et al., 2020). The binding energies for the GNC based design were estimated by a method that differs from out method. However, the binding energies estimated by their method for Lopinavir and Ritonavir differ from our estimates by less than 1.0 kcal/mol. Gao et al., have also reported the use of high throughput virtual scanning for a limited set of compounds and have reported an estimated binding constant of -10.02 kcal/mol for CHEMBL222234 . The GNC based method is also a promising approach, however, designed molecules require a longer experimental testing phase before they can be used in clincal studies, compared to drugs predicted using the approach described in this study which uses clinically approved drugs.
The primary limitations of the docking based virtual screening strategy are the high false positive rates and low correlation coefficients between estimated binding energy and experimental measures of activity reported in some earlier studies (Leach et al., 2006;Gallard, 2018). Despite these limitations, the estimates of binding energy provide valuable information that can inform and guide further studies (Tanchuck et al., 2016;Masters et al., 2020), such as, Molecular Dynamics based estimates of free energies, in vitro studies and in vivo studies.

Conclusion
Saquinavir and Beclabuvir were identified as the best candidates for COVID-19 therapy based on virtual high throughput screening of clinically approved drugs and the structure of SARS-CoV-2 Mpro determined from X-ray diffraction data. The results of this study also rationalize the limited data regarding effectiveness of drugs for COVID-19 therapy, and provide information that can be utilized for choice of candidate drugs for in vitro studies and in vivo studies. The predicted binding and ranking of drugs will also be useful to interpret the results of ongoing clinical trials that are testing existing drugs for effectiveness against COVID-19.   Figure 1a : Superposition of experimentally determined binding mode of ligand N3 and binding mode predicted with Vina. Green is experimentally determined ligand and blue is predicted binding mode.