Fragment-based virtual screening to discover potential Plas- modium PI4KIIIβ ligands

Plasmodium species that cause malaria, a disease responsible for about half a million deaths per annum despite concerted efforts to combat it. The causative agent depends on type III beta phosphatidylinositol 4-kinase (PPI4K) during the development of merozoite. PPI4K is the only clinically validated Plasmodium kinase so far and its inhibitors are effective both in vitro and in vivo. In this work, a small library of ~22 000 fragments was virtually screened using PPI4K homology model to discover potential ligands of the enzyme. 16 virtual hits were selected based on ≤ -9.0 kcal/mol binding energy cut off and were subjected to similarity and substructure searching after they had passed PAINS screening. The derivatives obtained showed improved binding energies, which ranged from -10.00 to -13.80 kcal/mol. Moreover, the topmost ranking compound 31, with interesting drug-like quality was stable within the protein’s binding cavity during the 10 ns molecular dynamics simulation period. In addition, analysis of its binding pose revealed some unique binding interactions with PPI4K active site residues as the basis for the observed improved binding affinity. Overall, compound 31 appears to be a viable starting point for the development of PPI4K inhibitors with antimalarial activity.


Introduction
Malaria is a fatal human parasitic disease caused by five species of Plasmodium, with P. falciparum as the deadliest species, and was responsible for more than 220 million clinical cases and about half a million deaths in 2018 alone [1][2][3]. Children under the age of five and pregnant women are most vulnerable and above 90% of the cases occurred in the sub-Saharan African region alone according to World Health Organization (WHO) 2019 report [4][5]. There has been progress in controlling the disease through drug therapies, the use of insecticides, and design of vaccines. However, it has persisted because of a plethora of reasons spanning from non-adherence to prescription by patients, emergence of Plasmodium drug-resistant strains to resistance to commonly used insecticides among the mosquito vector population [6][7][8]. Undoubtedly, the need for an effective antimalarial drug is urgent and many researchers tend to agree that targeting an enzyme that is vital in all Plasmodium life cycle will do the job. An example of such a target is type III beta phosphatidylinositol 4-kinase (PI4K) [9][10][11]. PI4KIIIβ is a ubiquitous eukaryotic enzyme that plays an essential role in merozoite development through the regulation of intracellular signaling, cytokinesis, and membrane trafficking by catalyzing the conversion of phosphatidylinositol (PI) to phosphatidylinositol-4-phosphate (PI4P) [9][10][11]. Sternberg and Roepe [12] study identified PPI4K as an operational enzyme in all Plasmodium life-cycle stages and further provided genetic, chemical, and biochemical evidence to show that PPI4K is a suitable target for antimalarial drug and its inhibition can cure, prevent and block the transmission of the disease. In addition, the fact several scientists have virtually screened many small-molecule libraries [13][14] and chemical series such as aminopyridine/pyrazine series [15], imidazopyridazine series [16], and naphthyridine series [17] to discover PPI4K inhibitors underpins the relevance of this enzyme in antimalaria drug search. However, no study to the best of our knowledge has applied a fragment-based screening strategy to discover PPI4K ligands. Over the past two decades fragment-based drug discovery has demonstrated to be one of the most prominent alternatives to high throughput screening utilized in the identification of lead compounds in drug discovery. It begins with finding low fragments or low molecular weight compounds that bind with weak affinity to the target of interest. The fragments that form high quality interactions are maximized to lead compounds with high affinity and selectivity [18].
In the present study, a PPI4K homology model was constructed using PI4K of human crystal structure as a template, validated a molecular docking protocol using a series of PI4K inhibitors from PubMed, and applied it to screen a small library of fragment molecules. Pools of compounds from fragment hits were re-docked to investigate improved binding affinity and molecular dynamics simulation was applied to evaluate binding stability. The result of this study is capable of leading to new PPI4K inhibitor templates for designing novel antimalarial drugs.

2.1.Homology modelling
Since there was no X-ray crystal structure of Plasmodium PI4K (PPI4K), a model of it was constructed using its amino acid sequence which was taken from the NCBI website [19]. During BLAST, the target sequence was used to search RCSB Protein Data Bank to identify the template structure (PDB ID: 6GL3) [20]. The Modellerver 9.21 [21] was used to build the homology model and subsequently the selection of the homology model for this study was based on the assessment of the Modeller objective function and Discrete Optimization Protein Energy. PROCHECK [22] and Discovery Studio [23] programs were used to evaluate the quality of the best model.

Preparation of molecules and virtual screening
All the compounds used in this study (fragment and fragment-derivatives from the ZINC database [24]; inhibitors and non-inhibitors of PI4K from ChEMBL library) were energy minimized with Openbabel software.
The homology modeled PPI4K structure was prepared for molecular simulation purposes following the previous protocol described by Ibezim et al, 2017 [25]. Protonate 3D module and molecular mechanics MMFF94 force field [26] implemented in Molecular Operating Environment (MOE) [27] were respectively used to protonate the protein residues (with the pH set at 7.0 ± 2.0) and optimally orient the protein atoms to the lowest energy level (potential energy gradient of 10-15 kcal/mol).
To conduct molecular docking, first, the modelled protein was aligned to the template protein-ligand complex (PDB ID: 6GL3) to locate the binding site. Afterwards, a grid box measuring 42 x 22 x 20 points with 0.375 Å point spacing was generated around the protein active site and then both the inhibitors and non-inhibitors were docked to the grid using the AutoDock suite as described in our earlier work [28]. The AUC plot was generated from an R program. The validated virtual screening protocol was used to virtually screen the study compounds.
An online platform was used to filter the fragment hits against PAINS [29] while MACSS structural keys were used to assess their structural novelty. Candidates which passed both assessments were submitted for similarity and substructure analysis among compounds in the ZINC database.

Molecular dynamics simulation
The parameter file of ligand was generated by Automated Topology Builder (ATB) [30] whereas the molecular dynamics simulation was carried out by Gromacs ver4.5.5 using GROMOS96 (ffG53a6) force-field [31]. The protein-ligand complex was submerged into solvated cubic box simple point charge (SPC) comprising an explicit water model and sufficient sodium and chloride ions which serve to neutralize the system. Steepest-descent integrator and conjugate gradient algorithm were applied to perform energy minimization. The system was agitated for a period of 10 ns during which solvent molecules and ions were allowed to equilibrate around the solute molecule at 300 k and 1 bar while all the non-hydrogen atoms were subjected to position restraining force.

Construction of PPI4K homology model
The Plasmodium PI4K homology model was built using the protein sequence (sequence ID: KNG744841.1) with Modeler 9.21. Blast similarity search identified the human PI4K with an access ID: 6GL3 ( Figure S1), as the suitable template for constructing the model since there, exists a sequence identity of 44.36% between them and the template structure (6GL3) has an inhibitor bound within the ATP-site which stretches beyond the ribose pocket to create ATP binding pocket with large conformation in the P-loops to accommodate molecules of varying structural sizes. After several models were built by the modeler program, the best model was selected based on the DOPE scoring function. PROCHECK identified that 77.0%, 19.70%, 2.60%, and 0.70% of the protein model residues are respectively located in the core, allowed, the general and disallowed region as shown by the Ramachandran plot in Figure 1Since the stereo-chemical quality of the homology model was satisfactory, it was then used in this study.

Evaluation of virtual screening workflow
To ensure our virtual screening protocol is reliable, its ability to discriminate between PI4K actives and inactives were tested using a dataset of compounds extracted from the ChEMBL database in which compounds with IC50 < 10 µM are grouped as PI4K inhibitors and those with IC50 > 10 µM are grouped as non-inhibitors. The ratio of actives to inactives in the validation set was 1:5 (that is 30 out of 150 molecules). Figure 2 shows the enrichment curve and explains that the enrichment is better than random since according to the area under the ROC (AUR), the protocol demonstrated 82.53% overall prediction accuracy. Therefore, it can retrieve potential PI4K actives from a database of diverse molecules in a virtual screening scheme.

Fragment-based virtual screening
A subset of 21 844 unique fragments was extracted from the ZINC database which is composed of 727 842 purchasable compounds and a total of over 34 million molecules. These fragments were virtually screened to identify new PI4K ligands. They showed binding energies which spanned from positive values to as low as -9.31 kcal/mol. Using binding energy ≤ -9.00 kcal/mol as cut-off, 16 best-ranked fragments were selected and submitted for pan-assay interference compounds (PAINS) filtration to identify frequent hits which often are promiscuous binders. Two molecules were caught by the filter while the remaining 14 fragments scaled through. To focus our study on molecules with unusual structural scaffolds different from already known PI4K inhibitors, Tanimoto coefficient calculation using MACCS structural fingerprints at the overlap of 60% to known PI4K ligands curated from the ChEMBL database was performed and resulted in seven fragments with 2 uncommon scaffolds ( Figure 3). Through visual inspection, the two scaffolds were identified as isoquinoline and indazolone. To explore ligand space of the uncommon scaffolds, they were used to conduct substructure and similarity search on the >34 000 000 molecules of the ZINC library. This led to 131 molecules that were docked into the PPI4K binding site. Although most of the derived molecules exhibited greater binding affinities than their starting materials, it appeared the protein preferred interaction with the indeno-isoquinolinedione and its derivatives since most of them ranked ≤ -11.00 kcal/mol (Table 1).
There was no clear structure-activity relationship observed among the indeno-isoquinolinediones, however, it seems the N-2 position does not allow flexible moieties since all the derivatives (compounds 5, 6, 10, 11, 13, 14, 22) with a flexible group at that position have relatively low scores, ≥ -10.00 kcal/mol. On the other hand, the presence of rigid group(s) and moieties capable of making electrostatic interaction at the N-2 and para positions of the two benzene rings tends to enhance binding affinity with PI4K.
That is probably the reason compound 29 has enhanced binding energy over those of 4 and 7 (Table 1). Of all the derivatives, compound 31 exhibited the highest binding affinity for PPI4K (-13.80 kcal/mol). In addition, results of the computed molecular descriptors (Table 2) suggest that this compound possesses attractive physicochemical properties. Hence, it is an appropriate candidate in this series worthy of further attention for PI4K inhibitor development. Using MD simulations for a period of 10 ns, we investigated the stability of 31 in the PI4K binding pocket and the results showed that the compound maintained stable binding interaction with the protein's residues throughout the agitation time. According to Figure 4A and 4B respectively for the entire simulation period, compound 31 did not deviate more than ~2.0 Å away from the protein and the binding energy of the protein-ligand complex remained approximately the same. The pose of 31 ( Figure 5A) retrieved after 9 ns of simulation revealed the compound fits tightly into PI4K binding pocket and Figure 5B showed the major intermolecular interactions which informed the observed stable binding of 31 to PI4K. Hydrogen bond was observed between compound 31 hydroxyl group and L15 backbone while carbonyl groups of the ligand made contact with the NH groups of W17 and N98 side chains. Overall, it appears polar bond is the main force behind the intercourse between the protein and ligand, even though the presence of tyrosine and tryptophan residues in the binding site suggests the pi-pi bond might have played a role.

Conclusions
A homology model of PPI4K based on the human PI4K template was built and a fragment-based virtual screen was performed to identify novel ligands of the protein. Two scaffolds uncommon among known PI4K ligands were identified amidst the fragment virtual hits. However, only the derivatives of one of them (indo) exhibited an interesting greater binding affinity. Importantly, the best-scored derivative has attractive physical features required for a drug candidate and was found to make a stable binding interaction with the PPI4K active site residues.