Phytochemicals as therapeutics against COVID-19: An in-silico study

Since December 2019, the worldwide spread of COVID-19 has brought much of the world to a standstill, affecting daily life as well as the economy. Under these conditions, it is imperative to develop a cure as soon as possible. Because of the adverse side effects of some existing conventional drugs, researchers around the world are screening natural antiviral phytochemicals as potential therapeutic agents against COVID-19. This paper reviews the interactions of specific phytochemicals with the receptor binding domain (RBD) of the Spike glycoprotein of SARS-CoV-2 and suggests their possible therapeutic applications. The literature search drew on the wide array of in-silico studies conducted using broad spectrum phytochemicals against SARS-CoV-2 and other viruses. Initially, 26 phytochemicals specifically targeting the S protein and its interactions with host receptors were shortlisted. To validate the previously published results, molecular docking was performed using AutoDock Vina, and 6 high-potential phytochemicals were identified for therapeutic use based on their binding energies. The availability of these compounds, their mode of action, toxicity data and cost-effectiveness were also taken into consideration. This review thus identifies 6 phytochemicals that can be used as potential treatments for COVID-19 based on their availability, toxicology results and low costs of production. However, all these compounds need to be further validated by wet lab experiments and should be approved for clinical use only after appropriate trials.


INTRODUCTION
Coronaviruses are a diverse group of enveloped positive strand viruses infecting many different animals, and they can cause mild to severe respiratory infections in humans [1-3]. Three epidemics caused by the coronavirus family, namely Severe Acute Respiratory Syndrome (SARS) [1], Middle East respiratory syndrome (MERS) [4], and COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [2,3,5], have affected large numbers of people throughout the world. All 3 of these coronaviruses are zoonotic in nature and cause respiratory tract infections which can sometimes be fatal [6]. SARS-CoV-2, however, with its high transmissibility, was thrust into the spotlight by causing a worldwide pandemic [2,6]. Towards the end of 2019, pneumonia of unknown aetiology appeared in Wuhan, China [2,3,5-11]. A cluster of patients exhibited symptoms of viral pneumonia, including fever, difficulty in breathing, and bilateral lung infiltration of unknown cause, which resembled the symptoms of patients with SARS and MERS [3,9-13]. The patients were connected with the Huanan seafood market of Wuhan, which sells not only seafood but also live animals, including poultry and wildlife [2,3,5]. Bronchoalveolar lavage fluids were collected from the patients [3,5]. Nucleic acids were extracted from this fluid and polymerase chain reaction (PCR) was performed, with nucleic acid from an uninfected person as a negative control [3,5,13-15]. Following RT-PCR and gene sequencing, scientists concluded that a novel coronavirus (2019-nCoV, also known as SARS-CoV-2) had emerged [3,5,13,14]. On January 30, 2020 the World Health Organization (WHO) declared the outbreak a "Public health emergency of international concern", and subsequently, on March 11, it declared COVID-19 (Coronavirus Disease 2019) a global pandemic [6]. From the day SARS-CoV-2 was first encountered to late November 2020, the infection spread at an alarming rate.
More than 208,736,618 cases and more than 4,384,159 deaths have been recorded worldwide till now (Figure 1).
Another dangerous aspect of this virus is the rapid and severe progression of clinical manifestations in individuals with underlying health conditions or comorbidities, which often leads to multiorgan failure and even death [2,5-10].
These severe clinical outcomes are treatable only by symptomatic treatment combined with conventional antiviral drugs, and even then the prognosis is poor in most critical cases [16]. Moreover, owing to the increased risk of opportunistic secondary infections caused by antibiotic-resistant pathogens, along with the high production cost and risk of adverse side effects of conventional drugs, scientists worldwide are on the lookout for a safer alternative. Plant Secondary Metabolites (PSMs), more commonly referred to as phytochemicals, are chemical compounds produced by plants as a means of defense against herbivores and harmful pathogens and have been found to possess broad spectrum antiviral properties [17]. These phytochemicals have been used for centuries in Chinese and Indian traditional medicine to treat various diseases and are well known for their antiviral effects [18]. In addition, since they are products of natural origin, they are easier to produce than conventional drugs and show few adverse side effects. Hence, phytochemicals can be proposed as viable alternatives in the treatment of COVID-19.
The pleomorphic SARS-CoV-2 is roughly spherical in shape, with crown-like projections on the surface which are referred to as S glycoproteins [19]. It has a positive single stranded RNA genome, roughly 30 kb in length, and shares about 88% similarity with SARS-CoV and about 67% similarity with MERS-CoV genomes [20]. It is, however, most closely related to Bat coronavirus RaTG13 (sequence similarity almost 96%), suggesting that the virus first originated in bats and then passed on to humans after mutating in an intermediate host, most likely pangolins [20-23]. The SARS-CoV-2 reference genome (NC_045512.2) encodes two polyproteins, ORF1a and ORF1ab (the largest open reading frame in the genome), and 4 structural proteins: Spike (S), Envelope (E), Membrane (M) and Nucleocapsid (N). It also encodes 6 accessory proteins: ORF3a, ORF6, ORF7a, ORF7b, ORF8 and ORF10. The Spike protein (S) is a trimeric glycoprotein required for viral attachment to and entry into host cells; the E and M proteins are necessary for viral assembly, organization and host-viral interactions; and the N protein is usually associated with the RNA genome of SARS-CoV-2 [24-26].
The life cycle of SARS-CoV-2 starts with its entry into the human respiratory tract, usually through the nose and mouth and sometimes through the eyes. On reaching the lower respiratory tract, the virus attaches, via the Receptor Binding Domain (RBD) of the S protein, to the Angiotensin-Converting Enzyme 2 (ACE2) receptors expressed on the surfaces of epithelial cells lining the respiratory tract. This binding is also facilitated by the cellular Transmembrane serine protease 2 (TMPRSS2) co-receptor [27]. In addition, Furin, a host cell protease, cleaves the conserved Furin cleavage site between the S1 and S2 domains of the S protein after its initial attachment to ACE2 receptors, a process critical for membrane fusion and viral entry [26]. After entering the cell via receptor mediated endocytosis, the virus releases its genetic material into the cytoplasm and hijacks the host protein synthesis machinery to translate ORF1a and ORF1ab. This process generates two polyproteins, pp1a and pp1ab, which are cleaved by 3CL proteases to form 16 non-structural proteins (NSPs) that play important roles in replication and assembly [27]. One of these, NSP12, is the RNA dependent RNA Polymerase (RdRP), which transcribes negative-sense (-) strands of the RNA genome and then uses this (-) strand as a template to synthesize hundreds of + stranded RNA genomes and sub-genomic mRNAs (Figure 2) [11,25]. The latter are used for synthesizing various structural and accessory proteins, which are transported from the cytoplasm to the Endoplasmic Reticulum Golgi Intermediate Compartment (ERGIC) [11]. Simultaneously, the newly synthesized + stranded RNA genomes are targeted to the ERGIC, where virion assembly takes place. After assembly, the newly synthesized virions are transported to the cell membrane in vesicles and leave the cell by exocytosis to infect nearby host cells [11].
The increased pressure of viral load in the ER ultimately kills the host cells and causes the immune system to release inflammatory cytokines, leading to accumulation of mucus in the respiratory tract and, in severe cases, death [11,19,28-30].
RECEPTOR BINDING COMPARISON BETWEEN SARS-COV AND SARS-COV-2
SARS-CoV and SARS-CoV-2 both belong to the genus Betacoronavirus [31]. The first report of SARS-CoV came in 2002 in Guangdong, China. It had a fatality rate of 10% [32]. SARS-CoV-2, as named by the Coronaviridae Study Group, was reported from Wuhan, China around December 2019 [32]. The name corona was suggested because its spike proteins resemble a crown. SARS-CoV and SARS-CoV-2 have a whole genome sequence similarity of 88%, while the Receptor Binding Domain (RBD) has a similarity of 73%-78% [31,33].
The most prominent difference between SARS-CoV and SARS-CoV-2 is the differing affinity of their RBDs for ACE2. It has been reported that 14 amino acid residues of the SARS-CoV S-protein RBD interact with human ACE2: Tyr436, Tyr440, Tyr442, Leu443, Leu472, Asn473, Tyr475, Asn479, Gly482, Tyr484, Thr486, Thr487, Gly488 and Tyr491. In SARS-CoV-2, however, only 8 of the 14 are conserved (Tyr449, Tyr453, Asn487, Tyr489, Gly496, Thr500, Gly502, and Tyr505), while substitutions have occurred at Leu455, Phe456, Phe486, Gln493, Gln498 and Asn501 [34]. The SARS-CoV-2 RBD has a higher affinity than the SARS-CoV RBD for the ACE2 receptor [33,34]. Surprisingly, however, the binding affinity of the entire S protein of SARS-CoV-2 for the ACE2 receptor is comparable to or lower than that of the entire S protein of SARS-CoV [33]. This anomaly is mainly due to the fact that the RBD alternates between a "standing up" and a "lying down" position, as shown by various experiments using flow cytometry and cryo-EM [34].
Entry of both SARS-CoV and SARS-CoV-2 into the host cell involves two proteases, Transmembrane protease Serine 2 (TMPRSS2) and Cathepsins [34]. In addition to these, however, SARS-CoV-2 contains a furin-like cleavage site which proves to be very important for its entry into lung cells [34]. The serine protease cleavage site is largely responsible for thrombotic complications, as one of the serine proteases is activated by thrombin [34]. This may lead to various complications which ultimately result in plasma leakage and alveolar obstruction [34].
Researchers proposed that the V404 to K417 mutation resulted in stronger binding interactions by creating a salt bridge between K417 and D30 [35], whereas the R426 to N439 mutation weakened the interaction with E329 [35]. However, the literature survey concluded that the former mutation produces strong electrostatic interactions which suppress the effect of the latter [35].

USE OF PHYTOCHEMICALS AGAINST SARS-COV-2 AND THEIR ADVANTAGES AGAINST CONVENTIONAL DRUGS
ACE2 is the membrane-bound receptor that serves as the entry site of SARS-CoV-2 in the human host. ACE2 shows catalytic activity in many tissues besides the lungs, such as the heart, kidney and intestine (Figure 3). The presence of ACE2 in such tissue confirms the risk of heart attack,  [36].
In SARS-CoV-2 infection, the S-protein receptor binding domain binds to the ACE2 receptor with the help of TMPRSS2 (Transmembrane protease serine 2), which primes the formation of the S-protein-ACE2 complex. The S-protein is composed of a few peptide subunits and mediates viral entry via its ectodomain, which consists of 3 S1 subunit heads responsible for receptor binding and a trimeric S2 subunit that facilitates membrane fusion [36].
Recently, in the wake of the pandemic, several previously used broad spectrum drugs have been repurposed for treating complications associated with COVID-19. However, their efficacy remains in question due to their non-specific mode of action against SARS-like coronaviruses, unintended adverse side effects and the increased resistance of microbial pathogens to allopathic drugs [37]. Thus, efforts are being made to return to natural products, particularly Plant Secondary Metabolites (PSMs), which were used extensively for treating many kinds of human diseases before the advent of modern medicine. Almost every culture around the globe has used phytochemicals as traditional medicines, as is evident from historical records. PSMs are chemical compounds produced by plants as a natural defense mechanism against herbivores and microbes. They are a source of natural antiviral compounds that could be a viable alternative to conventional drugs, as they are mostly safer and more cost effective [37]. In addition, they are available from natural crude extracts of plants and have been shown to specifically target proteins of SARS-like coronaviruses, inhibiting entry into host cells, interfering with intermediate metabolic pathways or inhibiting DNA/RNA synthesis [37]. For example, several phytochemicals interfere with the cell entry mechanisms of SARS-CoV-2 by binding to the receptor binding motifs of the S-protein.
In this paper, 26 phytochemicals have been screened to determine their feasibility as potential inhibitors of the viral proteins. Some of them bind to the S-protein and a few interact with the ACE2 receptor (Table 1).

MOLECULAR DOCKING OF PHYTOCHEMICALS AGAINST S-PROTEIN AND ACE2 RECEPTOR
Numerous in-silico studies have already been conducted to predict the binding affinity of various phytochemicals to different SARS-CoV-2 proteins such as the S-protein, Main Protease (MPro) and RNA dependent RNA Polymerase (RdRP), and to the human ACE2 receptor [38,39]. For this study, primarily the phytochemicals showing interactions with the S protein were selected. A literature search was performed on PUBMED CENTRAL (PMC) and Google Scholar to screen for phytochemicals showing the highest binding affinities to the S-protein or the S-protein-ACE2 RBD [38-47]. Keywords such as "SARS COV 2", "PSM" and "Phytochemicals" were used, and the search covered publications up to 20 November 2020. Additional details such as the sources, availability, costs and modes of action were noted along with the respective binding energies wherever possible. In order to validate and standardize those results, molecular docking was repeated using the 26 shortlisted phytochemicals (Table 2), and the six showing the highest potential were identified (Table 3).
Table 3 PSMs showing highest potential as therapeutics. Binding energy = -7.8 kcal/mol; Chain A (S-protein) = Arg-393, Asn-394; binds to the same pocket as the EGCG ligand (Fig. 4f).
Interestingly, all six of these phytochemicals have previously been reported to have anti-SARS-CoV-2 activities, as validated by the literature survey. Hesperidin has been shown to be a potent inhibitor of immune responses stimulated by pro-inflammatory cytokines such as IFN-δ, IL-2 and IL-1β, which are primarily responsible for the "Cytokine storm" seen in critical COVID-19 patients [48,49]. EGCG, a polyphenol found in green tea, has been reported to inhibit SARS-CoV-2 M-Protease enzyme activity as well as the replication of HCoV-OC43 and HCoV-229E coronaviruses in-vitro [50]. Likewise, Rosmanol, obtained from Rosmarinus officinalis, shows potential inhibitory activity against M-Protease, as reported in the in-silico studies performed by Kundu et al. [51].
Luteolin, a flavonoid, shows immunomodulatory effects in mouse models by inhibiting the NF-κB pathway and decreasing TNF-α, IL-6 and IL-1β levels [52]. In addition, it has been shown to decrease the number of immune cells such as CD19+ B cells and CD4+ T cells, thus lowering the risk of cytokine storm in the lungs of an inflamed-airway mouse model. In-vitro studies conducted by Pasquereau et al. [53] using Vero cell lines infected with SARS-CoV-2 showed Resveratrol to be the most potent inhibitor of coronavirus replication among similar compounds. Quercetin, another flavonoid known for its broad-spectrum antiviral properties, has been shown by network pharmacology and molecular docking studies to be a promising treatment for COVID-19 associated tissue injury [54]. Furthermore, coadministration of Vitamin C and Quercetin has been reported to significantly improve COVID-19 prognosis in highly susceptible populations.

Protein-ligand Docking
Molecular docking was performed with the 26 plant secondary metabolites in order to determine their affinity for the receptor binding domain (RBD) of the S-protein, which interacts with the ACE2 receptor present on the host cell.
To perform the protein-ligand docking process, the crystal structure of the SARS-CoV-2 spike receptor-binding domain bound with ACE2 (PDB ID: 6M0J) [55] was used as the receptor and the 26 plant secondary metabolites (PSMs) were the ligand molecules. The protein and ligand preparations were done using AutoDock Tools (version 1.5.6). The downloaded Protein Data Bank structure 6M0J contained Chain A of the Spike Glycoprotein RBD and Chain E of the ACE2 receptor domain. First, all water molecules and heteroatoms were deleted and Kollman charges were added across the residues of the receptor protein. The ligands were then prepared by adding Gasteiger charges, after which the torsion roots were detected and the number of torsions was checked.
Fig. 4 The results of the docking performed via AutoDock Vina. a: Hesperidin binds in the pocket formed between the RBD of the S-protein and ACE2, interacting with His-34 and Lys-353 of the S-protein and Arg-403 and Arg-408 of ACE2, making it the most promising candidate against COVID-19. b: EGCG, which has the second highest binding energy, sits inside a pocket of the RBD interface of the S-protein and interacts with 3 residues, viz. Arg-393, Asn-394 and Asp-350. c: Rosmanol binds to a different pocket inside the S-protein and interacts with 2 residues, viz. Ser-47 and Asn-51. d: Luteolin binds to the same pocket formed at the S-protein and ACE2 interface and interacts with the same residues as Hesperidin. e: Resveratrol binds to a pocket inside chain A of the S-protein and interacts with Tyr-196, Asn-210 and Glu-208. f: Quercetin binds to the same pocket as EGCG, with comparatively lower binding energy.
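The first receptor-preparation step described above (deleting water molecules and heteroatoms) can be illustrated with a minimal Python sketch. This is not the AutoDock Tools procedure used in the study; it is a plain-text filter that assumes a standard fixed-column PDB file, and the function name `strip_waters_and_heteroatoms` is hypothetical. Charge assignment and conversion to PDBQT would still have to be done in AutoDock Tools.

```python
def strip_waters_and_heteroatoms(pdb_text: str) -> str:
    """Return PDB text that keeps only polymer atoms.

    In the PDB format, waters (HOH) and other heteroatoms such as ions
    and bound small molecules are stored as HETATM records, so keeping
    only ATOM, TER and END records removes both in one pass.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record in {"ATOM", "TER", "END"}:
            kept.append(line)
    return "\n".join(kept) + "\n"
```

The filtered text could then be written to a new file and loaded into AutoDock Tools for the Kollman-charge step.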
In the next step, the grid parameters were set for blind docking: the grid was set to cover the whole protein so that the best binding site for each ligand could be found. Finally, docking was done using AutoDock Vina [56], which uses a united-atom scoring function to find the best fit for the ligand. For the docking process the energy range was set to 4 kcal/mol and the exhaustiveness, which controls the effort spent on the search, was set to 8 (the Vina default).
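The blind-docking setup described above can be sketched in Python as follows. The sketch computes a search box that encloses every atom of the receptor and writes an AutoDock Vina configuration using the exhaustiveness (8) and energy range (4) reported in the text. The 5 Å padding, the function names and any file names passed in are assumptions made for illustration; the study does not report its grid dimensions.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def bounding_box(pdb_lines: Iterable[str], padding: float = 5.0) -> Tuple[Vec3, Vec3]:
    """Return (center, size) of a box enclosing all atoms, in angstroms.

    Coordinates are read from the fixed PDB columns 31-54.
    The padding added on every side is an assumed value.
    """
    xs, ys, zs = [], [], []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            xs.append(float(line[30:38]))
            ys.append(float(line[38:46]))
            zs.append(float(line[46:54]))
    if not xs:
        raise ValueError("no atom records found")
    center = tuple((max(v) + min(v)) / 2 for v in (xs, ys, zs))
    size = tuple(max(v) - min(v) + 2 * padding for v in (xs, ys, zs))
    return center, size


def vina_config(receptor: str, ligand: str, center: Vec3, size: Vec3,
                exhaustiveness: int = 8, energy_range: int = 4) -> str:
    """Build the text of a Vina configuration file for one ligand."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c:.3f}")
        lines.append(f"size_{axis} = {s:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    lines.append(f"energy_range = {energy_range}")
    return "\n".join(lines) + "\n"
```

A configuration written this way would typically be passed to the command-line program as `vina --config <file>`, once per ligand.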
The output files were analyzed using the PyMOL 3.7 software, and the results of the docking process are presented in Table 2. The most promising PSMs are shown along with their interaction points with the protein interfaces (Table 3).
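Ranking ligands by binding energy, as done for Table 2, can be sketched as below. The sketch assumes the docked poses are in Vina's PDBQT output format, in which each pose carries a `REMARK VINA RESULT:` line whose first number is the predicted affinity in kcal/mol. The function names and the ligand labels a caller would supply are hypothetical; this is not the authors' analysis script.

```python
import re
from typing import Dict, List, Tuple

# First number after "REMARK VINA RESULT:" is the affinity in kcal/mol.
_RESULT = re.compile(r"^REMARK VINA RESULT:\s+(-?\d+(?:\.\d+)?)", re.MULTILINE)


def best_affinity(pdbqt_text: str) -> float:
    """Return the most negative (strongest) affinity among all poses."""
    values = [float(v) for v in _RESULT.findall(pdbqt_text)]
    if not values:
        raise ValueError("no 'REMARK VINA RESULT' lines found")
    return min(values)


def rank_ligands(outputs: Dict[str, str]) -> List[Tuple[str, float]]:
    """Sort ligands from strongest to weakest predicted binding.

    `outputs` maps a ligand label to the text of its Vina output file.
    """
    scored = [(name, best_affinity(text)) for name, text in outputs.items()]
    return sorted(scored, key=lambda item: item[1])
```

Because more negative energies indicate stronger predicted binding, the first entries of the returned list correspond to the top candidates.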
The results of the docking performed via AutoDock Vina showed that Hesperidin and EGCG have the strongest binding interactions, followed by Rosmanol. Hesperidin interacts with both Chain A of the S protein and Chain E of ACE2, binding inside the pocket at the S protein and ACE2 binding interface (Figure 4a). Similar interactions were found for Luteolin (Figure 4d); however, some differences weakened its binding energy by 0.5 kcal/mol. EGCG and Quercetin both bind in the pocket inside the RBD interface of the viral S protein, but an additional Asp350 interaction makes the binding of the former more favourable (Figure 4b, Figure 4f).
Rosmanol binds to a pocket of the S protein (Figure 4c), causing an alteration in its structure, and Resveratrol binds inside the RBD interface of the viral S protein with a binding energy of -7.8 kcal/mol (Figure 4e). The literature showed no evidence of Catechin and Cordioside binding to the viral S protein; however, binding energy data from our docking studies suggest that they do bind to it.

CONCLUSION
In the worldwide quest to develop a long-term cure for the disease, with conventional medications either failing or producing unforeseen side effects, our focus has shifted to developing a phytochemical-based treatment which could be used to prevent infection.
This paper reviews the anti-SARS-CoV-2 activities of phytochemicals, starting with the epidemiology and pathogenesis of COVID-19, receptor-binding comparisons between the Spike proteins of SARS-CoV and SARS-CoV-2 and lastly selection of six candidate phytochemicals on the basis of binding affinities to the viral S-glycoprotein, bioavailability, market costs and modes of interaction with the viral proteins. These six phytochemicals, namely Hesperidin, Epigallocatechin Gallate (EGCG), Rosmanol, Luteolin, Resveratrol and Quercetin were shortlisted by reviewing and docking a total of 26 phytochemicals as reported in the literature for antiviral activities.
This paper is an in-silico based review which takes into consideration only the binding energies to predict the suitability of the six shortlisted phytochemicals as potential drug candidates against COVID-19. In future, our goal is to corroborate these findings through wet lab experiments that test their efficacy in both in-vitro and in-vivo setups. Apart from this, concerns remain regarding delivery methods; our proposal is therefore to administer these phytochemical formulations as a spray, which would allow quicker and more effective action. We hope that this review will contribute to the existing body of information on SARS-CoV-2 and on the use of these phytochemicals as a remedy against this dreadful virus, which has taken the lives of millions.