Pharmacoinformatics and molecular dynamic simulation studies reveal potential inhibitors of SARS-CoV-2 main protease 3CL

SARS-CoV-2 was confirmed as the cause of the regional outbreak of coronavirus disease 2019 (COVID-19) in Wuhan, China. The 3C-like protease (3CLpro), an essential enzyme for viral replication, is a valid target to combat SARS-CoV and MERS-CoV. In this research, an integrated library consisting of 1000 compounds from the Asinex Focused Covalent (AFCL) library and 16 FDA-approved protease inhibitors was screened against SARS-CoV-2 3CLpro. Top compounds with significant docking scores that made stable interactions with the catalytic dyad residues were obtained. The screening resulted in the identification of compound 621 from the AFCL library, as well as Paritaprevir and Simeprevir from the FDA-approved protease inhibitors, as potential inhibitors of SARS-CoV-2 3CLpro. The mechanism and dynamic stability of binding between the identified compounds and SARS-CoV-2 3CLpro were characterized using 50-nanosecond (ns) molecular dynamics (MD) simulations. The identified compounds are potential inhibitors worthy of further development as SARS-CoV-2 3CLpro inhibitors/drugs. Importantly, the identified FDA-approved therapeutics could be ready for clinical trials to treat infected patients and help curb COVID-19.


Introduction
In the last two decades, several pathogens have spilled over and caused outbreaks. Among them, the emergence and re-emergence of coronavirus-related epidemics have widely spread fatal respiratory illness 1.
Coronaviruses are enveloped RNA viruses that are distributed broadly among humans, other mammals, and birds and that cause respiratory, enteric, hepatic, and neurologic diseases 2. As of February 19th, 2020, the death toll had reached 2009, with 74,284 laboratory-confirmed cases and 5248 suspected cases. SARS-CoV-2 had also spread to over 20 countries.
Recent studies showed that SARS-CoV-2 belongs to the betacoronavirus family and is closely related to SARS-CoV 9. Similar to other betacoronaviruses, SARS-CoV-2 produces an 800-kDa polypeptide upon transcription of its genome 5. This polypeptide is proteolytically cleaved to generate various proteins 5,9. The proteolytic processing is mediated by the papain-like protease (PLpro) and the 3-chymotrypsin-like protease (3CLpro). 3CLpro cleaves the polyprotein at 11 distinct sites to generate many of the non-structural proteins that are important in viral replication. Thus, this protease plays a critical role in the replication of the virus 10,11. Structure-based activity studies and various high-throughput studies have identified distinct inhibitors of SARS-CoV and MERS-CoV 3CLpro. Thus, it is essential to identify novel inhibitors of SARS- Traditional methods for the identification of inhibitors are expensive and time-consuming.
Therefore, the use of in silico techniques for the identification of inhibitors has gained importance in recent years 12,13. The available small-molecule databases could be utilised for either ligand-based or structure-based molecular modelling and the effective identification of inhibitors 14. Moreover, 3CLpro is highly conserved across coronaviruses; it is therefore a potential target for the identification of compounds that could have broad-spectrum antiviral activity 14. In this contribution, a combined approach of virtual screening, molecular docking and molecular dynamics simulation was utilized to explore potential inhibitors of the SARS-CoV-2 3CLpro enzyme as anti-SARS-CoV-2 drugs.

Sequence and Structural Alignment Analysis
A multiple sequence and structure alignment analysis was carried out to identify the evolutionarily conserved functional residues among SARS-CoV-2, SARS-CoV and MERS-CoV, which could be further targeted in the discovery of drug hits. Sequences and 3D structures of SARS-CoV-2 (PDB ID: 6LU7), SARS-CoV (PDB ID: 2A5I) and MERS-CoV (PDB ID: 5WKK) 3CLpro were retrieved from the Protein Data Bank (PDB). The 3CLpro sequences were aligned using MEGA v6.0 15. To ensure the broad-spectrum relevance of these protein targets, conserved functional residues within the active pockets were also analyzed through structural alignment. Structural alignment/superposition analysis was done using the PyMOL tool 16.
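Conservation between two aligned sequences is often summarised as percent identity. The following is a minimal sketch of that calculation, assuming the sequences have already been aligned (for example, exported from an alignment tool) and that `-` marks a gap. The function name and the toy fragments are hypothetical illustrations, not the 3CLpro sequences or the procedure used in MEGA.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
seq_a = "ACDEFG-HIK"
seq_b = "ACDQFGWHLK"
identity = percent_identity(seq_a, seq_b)
print(f"identity over ungapped columns: {identity:.1f}%")
```

Other conventions exist (for example, dividing by the full alignment length or by the shorter sequence), so the denominator should be stated whenever identity values are compared.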

Chemical libraries preparation
Two chemical libraries were obtained: the commercially available Asinex Focused Covalent (AFCL) library, which consists of 1000 molecules, was retrieved from (http://www.asinex.com/), and FDA-approved protease inhibitors, including 16 anti-HIV and anti-hepatitis C antiviral agents, were downloaded individually from PubChem (https://pubchem.ncbi.nlm.nih.gov/) in SDF format.
Discovery Studio Visualizer 17 was used to combine both libraries into one SDF file.
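Because every molecule record in an SDF file ends with a `$$$$` line, libraries can also be combined by plain concatenation. The sketch below is a scripted alternative to the Discovery Studio step, not the procedure used here; the function name `merge_sdf` and any file names passed to it are hypothetical.

```python
from pathlib import Path


def merge_sdf(paths, out_path):
    """Concatenate SDF files into one file; return the number of records written."""
    records = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            lines = Path(path).read_text(encoding="utf-8").splitlines()
            # Drop trailing blank lines so records stay contiguous.
            while lines and not lines[-1].strip():
                lines.pop()
            if not lines:
                continue
            # Each SDF record must end with "$$$$"; add it if the last one lacks it.
            if lines[-1].strip() != "$$$$":
                lines.append("$$$$")
            records += sum(1 for line in lines if line.strip() == "$$$$")
            out.write("\n".join(lines) + "\n")
    return records
```

Called as `merge_sdf(["afcl.sdf", "fda_protease_inhibitors.sdf"], "combined.sdf")` (hypothetical file names), the return value gives a quick check that the combined file holds the expected number of compounds.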

Structure-based virtual screening
The chemical libraries were screened against the 3CLpro active site within the SARS-CoV-2 structure (PDB ID: 6LU7) using AutoDock Vina in the PyRx program 18. The chemical compounds were initially imported into the Open Babel tool in PyRx for energy minimization 19. Open Babel was also used to convert the compounds' SDF files into PDBQT files. The grid box, which represents the docking search area, was set to cover the active site of 3CLpro. Compounds were ranked based on their docking scores in kcal/mol. The AutoDock Tools 1.5.6 program 20 was used to convert the PyRx output files into PDB files. The molecular interactions and binding modes of the top poses were determined using the Discovery Studio Visualizer 17 and PyMOL 16 programs.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 23 February 2020 | doi:10.20944/preprints202002.0308.v1

Molecular docking
The docking was performed for candidate compounds against the SARS-CoV-2 3CLpro structure using the AutoDock Vina 1.1.2 program 21. Discovery Studio Visualizer was used initially to prepare the protein PDB file. AutoDock Tools 1.5.6 was used to add the polar hydrogens to the protein structure and to convert the PDB files into PDBQT. The same program was also used to define the three-dimensional grid box for the docking simulation: a box of size 24 x 22 x 26 was centered at the coordinates (-1.549, 2.454, 7.117) to cover the active site along with the essential residues within the binding pocket. The Discovery Studio Visualizer and PyMOL 1.3 programs were used for data analysis.
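The grid box described above maps directly onto the search-space keys of an AutoDock Vina configuration file. The following sketch writes such a configuration from the stated center and size. The receptor and ligand file names are hypothetical, the `exhaustiveness` value is Vina's default rather than a setting reported here, and the box size is assumed to be in the units Vina expects (Å), which the text does not state.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8):
    """Return the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Box center and size as given in the text; file names are placeholders.
config_text = vina_config(
    receptor="3clpro.pdbqt",
    ligand="candidate.pdbqt",
    center=(-1.549, 2.454, 7.117),
    size=(24, 22, 26),
)
print(config_text)
```

Saving this text to a file and passing it to Vina with `--config` keeps the search space identical across all candidate ligands, which makes the resulting scores comparable.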
The predicted inhibitory constant (pKi) was calculated using the following equation 22
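A common way to turn a docking score into a predicted inhibition constant, which may differ from the equation cited above, is the thermodynamic relation ΔG = RT ln Ki. The sketch below applies it under explicit assumptions: the docking score is treated as a binding free energy in kcal/mol, the temperature is 298.15 K, and the example score of -9.0 kcal/mol is hypothetical rather than a result from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def predicted_ki(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Predicted Ki in mol/L from a binding free energy, via dG = RT ln(Ki)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def predicted_pki(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """pKi = -log10(Ki)."""
    return -math.log10(predicted_ki(delta_g_kcal, temperature_k))


score = -9.0  # hypothetical docking score in kcal/mol
ki = predicted_ki(score)
pki = predicted_pki(score)
print(f"Ki = {ki:.2e} M, pKi = {pki:.2f}")
```

More negative scores give smaller Ki values. Since docking scores are only rough free-energy estimates, values obtained this way are best read as a ranking aid rather than as measured affinities.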

Molecular dynamic (MD) simulation
The structures of SARS-CoV-2 3CLpro and the candidate molecules were prepared for MD simulation using Chimera 1.14 25. The MD simulations of the 3CLpro-inhibitor complexes were carried out for 50 ns using the GROMACS 2018.1 package 26 with the OPLS-AA/L force field. The parameters of the candidate inhibitors were generated by the SwissParam online server (http://www.swissparam.ch/) 27.
The simulation started by solvating the 3CLpro-inhibitor complexes in a triclinic box using the TIP3P water model. Counter-ions were added to neutralize the system. Periodic boundary conditions were used. The system was energy-minimized using the steepest descent algorithm with a maximum step size of 0.01 nm and a tolerance of 1000 kJ/mol/nm. The system was then equilibrated using the NVT and NPT ensembles for 100 ps. Finally, a 50 ns production MD run was performed for the system. A time step of 2 fs was used, and the trajectories were saved every 2 ps. The results for the 3CLpro-inhibitor complexes were analyzed for root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and bond potential energy (BPE).
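The run lengths above translate into integer step counts once the time step is fixed. The sketch below does that arithmetic and formats the result as GROMACS `.mdp` options. It assumes that the 100 ps equilibration applies to each ensemble separately (the text is ambiguous on this point), and the option set shown is a small illustrative subset, not the full parameter files used in the study.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def n_steps(duration_fs: int, dt_fs: int) -> int:
    """Number of integration steps needed to cover a duration with a given time step."""
    if duration_fs % dt_fs:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // dt_fs


dt_fs = 2  # 2 fs time step
production_steps = n_steps(50 * FS_PER_NS, dt_fs)      # 50 ns production run
equilibration_steps = n_steps(100 * FS_PER_PS, dt_fs)  # 100 ps (assumed per ensemble)
save_interval = n_steps(2 * FS_PER_PS, dt_fs)          # write coordinates every 2 ps
frames = production_steps // save_interval             # saved frames after the initial one

# Illustrative subset of production-run options in GROMACS .mdp syntax.
mdp = {
    "integrator": "md",
    "dt": dt_fs / FS_PER_PS,  # GROMACS expects the time step in ps
    "nsteps": production_steps,
    "nstxout-compressed": save_interval,
    "pbc": "xyz",
}
mdp_text = "\n".join(f"{key} = {value}" for key, value in mdp.items()) + "\n"
print(mdp_text)
```

Working in integer femtoseconds avoids floating-point rounding when dividing nanosecond durations by a 0.002 ps time step.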

Analysis of 3CLpro conservation among coronaviruses
The sequence alignment showed that the 3CL

Structure-based virtual screening
The generated structure was used to perform structure-based virtual screening against an integrated library of 1016 compounds, comprising 1000 covalent protease inhibitors and 16 FDA-approved protease inhibitors. This combined library was chosen to make use of both de novo drug design and drug repurposing strategies. Interestingly, Paritaprevir and Simeprevir are FDA-approved acylsulfonamide drugs that act against hepatitis C by targeting the NS3/4A protease 28. To our knowledge, no clinical data were available on the use of these two compounds to treat SARS-CoV-2, MERS-CoV or SARS-CoV. Pro168

Molecular interaction and binding mode
In order to understand the mechanism of interaction of these compounds with SARS-CoV-2

Radius of gyration (Rg)
The radius of gyration (Rg) of the protein is associated with its size and compactness. The Rg values of the three complexes were 2.16 nm at the initial state. The Rg values of the protein in complex with Paritaprevir and Simeprevir stabilized after an initial increase at 5 ns, suggesting that the systems had reached an equilibrium state. On the other hand, the Rg value for compound 621 decreased from 10 ns to 40 ns and then increased slightly up to 50 ns. This indicates that the binding of 621 to the protein stabilized its secondary structure (Figure 5).
[Figure 5: Rg of the protease in complex with the three candidate compounds.]
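For reference, the compactness measure discussed here is the mass-weighted root mean square distance of the atoms from their center of mass. The sketch below is a minimal pure-Python version of that definition for a single frame. The coordinates and masses are a toy example, not data from the simulations, and in practice the values would come from a trajectory analysis tool.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame of (x, y, z) coordinates."""
    if len(coords) != len(masses) or not coords:
        raise ValueError("coords and masses must be non-empty and of equal length")
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    com = [
        sum(m * c[i] for c, m in zip(coords, masses)) / total_mass
        for i in range(3)
    ]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted_sq = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for c, m in zip(coords, masses)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy frame: four equal-mass points at the corners of a 2 nm square.
coords = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
masses = [12.0] * 4
rg = radius_of_gyration(coords, masses)
print(f"Rg = {rg:.3f} nm")
```

Evaluating this function frame by frame yields the Rg-versus-time curve. A plateau suggests a stable overall fold, while a sustained drift points to compaction or expansion.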

Conclusion
In the current study, pharmacoinformatics and molecular dynamics approaches were utilized to identify inhibitors of SARS-CoV-2 3CLpro as treatments for the new outbreak of coronavirus disease 2019. Three compounds (compound 621, Paritaprevir and Simeprevir) were identified as potential inhibitors of the SARS-CoV-2 3CLpro enzyme, with predicted inhibitory constants in the low micromolar concentration range. The binding affinity, mechanism and stability of binding of these compounds to SARS-CoV-2 3CLpro were supported by molecular docking and molecular dynamics simulation. Compound 621 could be used as a seed for the de novo design of potential inhibitors targeting the 3CLpro enzyme of SARS-CoV-2 as well as MERS-CoV and SARS-CoV. The clinical agents Paritaprevir and Simeprevir may also help expedite the drug discovery process and could be tested in clinical trials as treatments for coronavirus disease 2019.