Potential therapeutic agents for COVID-19 based on the analysis of protease and RNA polymerase docking

The 2019 outbreak of novel coronavirus (COVID-19) infections has created a dire need for potential therapeutic agents. In this study, we used molecular docking to repurpose HIV protease inhibitors and nucleoside analogues for COVID-19, with evaluations based on docking scores calculated by AutoDock Vina and RosettaCommons. Our results suggest that Indinavir and Remdesivir possess the best docking scores, and comparison of the docking sites of the two drugs reveals a near-perfect dock in the overlapping region of the protein pockets. After further investigation of the functional regions inferred from the proteins of SARS coronavirus, we discovered that Indinavir does not dock on any active sites of the protease, which raises concern regarding the efficacy of Indinavir. Likewise, the docking site of Remdesivir is not compatible with any known functional regions, including template binding motifs, polymerization motifs and nucleoside triphosphate (NTP) binding motifs. However, when we tested the active form of Remdesivir (CHEMBL2016761), the docking site revealed a perfect dock in the overlapping region of the NTP binding motif. This result suggests that Remdesivir could be a potential therapeutic agent. Clinical trials must still be done to confirm the curative effect of these drugs.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 29 February 2020 © 2020 by the author(s). Distributed under a Creative Commons CC BY license. doi:10.20944/preprints202002.0242.v2


Introduction
In the concluding weeks of 2019, an outbreak of novel coronavirus (COVID-19) infections occurred in Wuhan, China. As of February 25, 2020, more than 80,000 cases and 2,700 deaths had been reported. With no proven antiviral agent available, medical professionals have resorted to supportive care to contain the infection. However, current research suggests that certain drugs with appropriate virus-restraining mechanisms can yield promising results. At the Rajavithi Hospital in Thailand, the infectious disease team used a combination of Oseltamivir (an anti-influenza agent) and Lopinavir/Ritonavir (an anti-HIV agent) to successfully improve the condition of severely ill patients. Lopinavir and Ritonavir are both HIV protease inhibitors that suppress the cleaving of a polyprotein into multiple functional proteins 1. Likewise, various clinical trials are now under way on nucleoside analogue drugs such as Remdesivir, an antiviral drug shown to be effective against a wide range of RNA viruses in vitro 2. This study aims to determine whether the protease of COVID-19 can be a target protein for Lopinavir and Ritonavir, and attempts to identify other HIV protease inhibitors with even stronger binding affinities. Additionally, we tested a set of RNA virus agents for potential binding with the RNA-dependent RNA polymerase (RdRp) of COVID-19.

Methods
For our target receptors, we chose the 3-chymotrypsin-like protease (3CL-protease), the main protease used to cleave polyproteins into replication-related proteins, and RdRp, the main protein for RNA replication. The 3CL-protease structure (PDB ID: 6LU7) 3,4 was obtained from the RCSB Protein Data Bank 3; the structure was released on February 5th, 2020 4. Because the structure of the COVID-19 polymerase is currently unavailable, we built an approximate model by homology modeling from a patient sequence, BetaCoV/Taiwan/2/2020|EPI_ISL_406031, obtained from the Global Initiative on Sharing All Influenza Data (GISAID) 5,6. Our current model is based on the template of the SARS coronavirus polymerase (PDB ID: 6NUR) 7 and was built using SWISS-MODEL 8,9,10,11,12. We obtained a root-mean-square deviation (RMSD) of 0.073 angstroms between the 793 aligned atom pairs, indicating that these two proteins share very similar structures (see Supplement Figures 1 and 2).
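The RMSD quoted above can be illustrated with a minimal sketch. It assumes that the two structures have already been superimposed and that the aligned atom pairs are supplied in matching order. The function name `rmsd` and the coordinate-tuple input are illustrative choices and are not part of SWISS-MODEL or any other tool used in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates, paired by position.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances over all aligned atom pairs.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

A value close to zero, such as the one reported for the 793 aligned atom pairs, means that paired atoms nearly coincide after superposition.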
To simulate the binding affinity between protein and ligands, we used two docking tools, AutoDock Vina (version 1.1.2) 17 and RosettaCommons (version 3.11) 18-20. Since AutoDock Vina only takes the PDBQT format (Protein Data Bank with partial charge (Q) and atom type (T)) as input, we used OpenBabel (version 3.0.0) 21 to convert SDF files to PDBQT. The entire protein was taken as the search space. We also used RosettaCommons to dock the selected drugs to their corresponding protease and polymerase under the standard default settings. This process was repeated 100 times for each ligand, and the final mean affinity score was taken. From these runs, we built a heat map of the residues within 5 Å of the binding sites to represent the binding frequency of each ligand.
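The averaging and heat-map steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it assumes each AutoDock Vina output file reports its poses in `REMARK VINA RESULT:` lines with the top-ranked pose first, and it interprets the 5 Å criterion as "any receptor atom of the residue lies within 5 Å of any ligand atom". The function names and the input layout (residue label plus coordinate tuple) are hypothetical.

```python
import math
from statistics import mean


def best_vina_score(pdbqt_text):
    """Affinity (kcal/mol) of the first pose listed in Vina output text."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, RMSD l.b., RMSD u.b.
            return float(line.split()[3])
    raise ValueError("no 'REMARK VINA RESULT:' line found")


def mean_affinity(run_outputs):
    """Mean top-pose affinity over repeated docking runs of one ligand."""
    return mean(best_vina_score(text) for text in run_outputs)


def contact_frequency(receptor_atoms, poses, cutoff=5.0):
    """Fraction of poses in which each residue is within `cutoff` of the ligand.

    receptor_atoms: list of (residue_label, (x, y, z)), one entry per atom.
    poses: list of ligand poses, each a list of (x, y, z) atom coordinates.
    Residues that are never contacted are left out of the result.
    """
    counts = {}
    for pose in poses:
        touched = set()
        for residue, r_xyz in receptor_atoms:
            if residue in touched:
                continue  # residue already counted for this pose
            if any(math.dist(r_xyz, l_xyz) <= cutoff for l_xyz in pose):
                touched.add(residue)
        for residue in touched:
            counts[residue] = counts.get(residue, 0) + 1
    return {residue: n / len(poses) for residue, n in counts.items()}
```

The per-residue fractions returned by `contact_frequency` are the kind of values that a heat map of binding frequency would display.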
We expect a valid drug to dock at either a protein pocket or a functional region. Therefore, we used the CASTp 22 tool to predict potential pockets of our target proteins, and confirmed whether the highest-frequency binding sites of the heat map were located in a protein pocket. Additionally, owing to the high similarity between SARS and COVID-19, the catalytic sites of the SARS protease obtained from the UniProt database 23 and the functional motifs of the SARS polymerase obtained from a previous study 24 were used to label the corresponding regions of COVID-19. Finally, we superimposed the heat map, pocket sites and functional regions to visualize the binding poses.

Table 1 lists the results of the 10 ligands docked with the 3CL-protease. These scores, which are the original raw outputs from both docking tools, represent the relative binding affinity. For AutoDock Vina, Indinavir has the best docking score, even outperforming Lopinavir and Ritonavir, the two drugs currently in use in Thailand. As for the results of RosettaCommons, all ten drugs displayed similar scores, though Amprenavir, Atazanavir, and Darunavir performed slightly better. We therefore chose Indinavir, the top-scoring ligand by AutoDock Vina, for further investigation.

The visualization of Indinavir docking is shown in Figure 1. Figure 1a shows the overlap of the ligand heat map and the protein pocket. The purple region shows that the binding sites are mostly located in the protein pocket. However, the active sites [His41, Cys145] of the 3CL-protease do not overlap with the binding site (Figure 1b), which may be a concern for inhibition. Guangdi Li and Erik De Clercq also noted that the C2-symmetric site, which is the optimized fitting site of HIV protease inhibitors, is absent in the 3CL-protease 25. Lastly, HIV protease belongs to the aspartic protease family, while the 3CL-protease belongs to the cysteine protease family 25. These inconsistencies may introduce unwanted noise into our docking results.
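The superimposition of heat map, pocket and functional regions described above amounts to comparing sets of residues. The sketch below shows one way to express that comparison. The catalytic residues 41 and 145 are taken from the text; the contact frequencies, the pocket residues and the 0.5 threshold are hypothetical placeholders, and the function names are illustrative only.

```python
def frequent_residues(freq, threshold=0.5):
    """Residues whose contact frequency reaches the threshold."""
    return {res for res, f in freq.items() if f >= threshold}


def overlap_report(binding, pocket, functional):
    """Relate binding residues to pocket residues and functional residues."""
    binding = set(binding)
    in_pocket = binding & set(pocket)
    return {
        # Share of binding residues that fall inside the predicted pocket.
        "pocket_fraction": len(in_pocket) / len(binding) if binding else 0.0,
        # Binding residues that coincide with catalytic or motif residues.
        "functional_hits": sorted(binding & set(functional)),
    }


if __name__ == "__main__":
    CATALYTIC = {41, 145}  # His41 and Cys145, as named in the text
    toy_freq = {41: 0.1, 145: 0.05, 166: 0.8, 189: 0.7, 300: 0.6}  # hypothetical
    toy_pocket = {41, 145, 166, 189}  # hypothetical pocket residues
    print(overlap_report(frequent_residues(toy_freq), toy_pocket, CATALYTIC))
```

An empty `functional_hits` list combined with a high `pocket_fraction` would correspond to the situation reported for Indinavir: binding in the pocket but away from the catalytic residues.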

The docking scores of the four ligands to RdRp using AutoDock Vina and RosettaCommons are listed in Table 2. Remdesivir has the best docking performance by both AutoDock Vina and RosettaCommons, which suggests that it has the most stable docked structure among the drug candidates. Although Remdesivir has the lowest (most favorable) docking score, it is a nucleoside analogue prodrug, so we had to further examine its different forms 26. We applied AutoDock Vina and RosettaCommons to three forms of Remdesivir: Remdesivir itself (nucleoside analogue monophosphate prodrug with protecting group), GS441524 (nucleoside analogue) and CHEMBL2016761 (nucleoside analogue triphosphate). As listed in Table 3, these results show that only CHEMBL2016761, the active form, which acts by blocking viral replication, still has a good docking score. We therefore conclude that the form of a drug affects the docking score estimate and that, if available, the active form should be used for more accurate estimation.

We examined the docking results for the structural complex between CHEMBL2016761, the active form of Remdesivir, and RdRp by looking at the contact frequencies between the two, shown in Figure 2a. We observed that the docking orientation with the highest docking consensus lies in the binding pocket of the RdRp. After further investigation of the functional regions inferred from the proteins of SARS coronavirus, which include the template binding motif, polymerization motifs and the NTP binding motif 24 (see Supplement Table 1), we discovered that the ligand does not dock on the template binding motifs or the polymerization motifs but docks perfectly in the overlapping region of the NTP binding motif (Figure 2b). This result agrees with the previous study published in Cell Research 27, which suggests that Remdesivir is highly effective in controlling COVID-19 infection in vitro.
Although our docking results agree with in vitro studies, the association between these docking results and the effectiveness of treating COVID-19 still needs further examination. We urge facilities with the appropriate equipment to continue this study in vitro.
In conclusion, our findings reveal that Indinavir and Remdesivir possess relatively low docking scores, as well as docking sites that strongly overlap with the protein pockets. We infer that this means better structural stability of their protein-ligand complexes. However, the docking site of Indinavir is not located in the catalytic sites of the 3CL-protease, which may raise concerns regarding its inhibitory ability. As for Remdesivir, the docking site of CHEMBL2016761 is perfectly located in the NTP binding motif, which is expected to block the replication of the RNA sequence. Because both drugs have been used in clinical practice with limited toxicity, we recommend that they be taken into consideration in the treatment of COVID-19.

Supplement figure caption (beginning lost in extraction): ... (colored in dark blue) and the SARS polymerase structure (RdRp, PDB ID: 6NUR, colored in orange). The RMSD between the 793 aligned atom pairs is 0.073 angstroms, which indicates that these two proteins share very similar structures.