Structural proteomics-driven targeted design of favipiravir- binding site in the RdRp of SARS-CoV-2 unravels susceptible hotspots and resistance mutations

Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, Yokohama, Kanagawa 230-0045, Japan Centre of Excellence in Integrated Omics and Computational Biology, Utkal University, Bhubaneswar 751004, Odisha, India Post Graduate Department of Biotechnology, Utkal University, Bhubaneswar 751004, Odisha, India Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33620, United States Molecular and Structural Biophysics Laboratory, Department of Biochemistry, NorthEastern Hill University, Shillong793022, India


Introduction
incorporated into the nascent viral RNA by viral RdRp resulting in chain termination or accumulation of deleterious mutations [12][13][14]. Recently, a cryo-EM structure of the SARS-CoV-2 RdRp in complex with favipiravir-RTP and template primer doublestranded RNA has been determined (PDB ID: 7AAP) [15]. Favipiravir binds to the active site of the viral RdRp, being mistaken as a purine nucleotide, transforming the binding pocket into a catalytically non-productive conformation. It stacks onto the 3' nucleotide of the primer strand and forms a non-canonical base pair with the template RNA strand using its amide group (Figure 1a-1c). Therefore, favipiravir functions as a chain terminator, inhibiting viral RdRp by the termination of chain elongation at the site of incorporation.
Since its emergence in December 2019, SARS-CoV-2 has undergone more than 50,000 mutations and recombination events, as compared to the Wuhan reference genome (NC_045512.2) (http://cov-glue.cvr.gla.ac.uk/#/home). Until December 2020, at least 198 sites in the viral genome have already undergone recurrent, independent mutations [16]. Although the rate is lower than that of influenza and human immunodeficiency virus (HIV), SARS-CoV-2 is presently accumulating at least two mutations per month in its genome [17]. The continuous emerging mutations are a cause of concern, as they could hinder the development of long-lasting and effective therapeutics. These concerns are supported by the observations of the natural favipiravir resistance emergence in Chikungunya virus [18] and enterovirus 71 [19], and development of the favipiravir resistance by the pandemic H1N1 influenza A virus in the laboratory studies [20]. Chances of the emergence of favipiravir resistant SARS-CoV-2 strains increase due to its intensive use as the COVID-19 pandemic progresses. We recently identified potential residues that could contribute to remdesivir and molnupiravir resistance in SARS-CoV-2 [21], as well as identified potential signatures of SARS-CoV-2 main protease adaptation against boceprevir and telapravir [22]. We believe the knowledge of potential mutants is essential and can help design new and more effective drugs and guide the clinicians to properly manage and administer the treatment regimen.
In this study, we used a high-throughput interface-design strategy to predict potential mutation hotspots and resistance sites in the favipiravir binding site in the RdRp of SARS-CoV-2.

Identification favipiravir and nsp12 interacting residues
The cryo-EM structure of SARS-CoV-2 RdRp (nsp7-nsp8-nsp12) in complex with the template:primer dsRNA and favipiravir-RTP (PDB code: 7AAP) was used to determine the interactions between nsp12 and favipiravir. This analysis resulted in 72 nsp12interacting residues with favipiravir. The complex structure was subsequently processed using the Molecular Operating Environment (MOE 2018.01, v2021.03).

MOE-based resistance scan design methodology
To identify potential single point mutations of nsp12 that could develop resistance against favipiravir, MOE's resistance scan methodology was used (MOE 2018.01, v2021.03) [23].
The prepared complex structure was used as an input, and 41 nsp12 residues that interacted with favipiravir were designed with naturally sampled SNPs, whereas the remaining residues that interacted with RNA were not selected for the design. During the design, the template:primer dsRNA was retained in the complex structure. The rotamer explorer option of the ensemble protein design protocol was enabled due to the flexibility of the designed site. As discussed in our recent report [22], various physico-chemical parameters were set accordingly, and a total of 350 designs were generated. Parameters such as affinity and stability of the designed complexes were examined. The relative binding affinity of the mutation to the wild-type protein (dAffinity) was computed as described in our recent report [22]. This methodology served as an indirect computational fitness test model, sampling only those mutations that were not deleterious and were more likely to develop naturally.

Rosetta-based structural refinement and ligand-based interface design
The complex structure prepared using MOE was next used for Rosetta-based design [24][25][26][27]. First, parameters for favipiravir compatible with Rosetta forcefield were generated by adding appropriate charges and following Rosetta's appropriate parameter generation protocol. Second, the complex was energy minimized using Rosetta relax [25]. Five models were generated, and the lowest energy complex structure of the favipiravir-bound complex was used for the targeted interface design experiment. The Rosetta macromolecular modeling suite was used to redesign nsp12 with backbone flexibility [24].
In this step, 41 residues of nsp12 that constituted the favipiravir-interacting site were designed with the naturally sampled SNPs, whereas the remaining residues that interact with RNA and all other residues of nsp7, nsp8, and nsp12 were allowed to only repack without design. In the design experiments, a modified RosettaScript was used to sample the favipiravir-interacting residues of nsp12, which considered minor movements of the backbone of nsp12 to prevent steric clashes with ligands after introducing mutations [26].
The Rosetta all-atom forcefield was used to sample the designs. A total of 100,000 designs of favipiravir-bound complexes were generated, and the Rosetta total score, RMSD from the initial structure, Rosetta interface delta (denoting the binding affinities between the designed nsp12 and favipiravir), and the percent sequence identities from the initial nsp12 sequence were analyzed to understand the mutational landscapes and effect of mutations on the physicochemical characteristics of nsp12 and the overall complex.

Validation of the design protocol
The accuracy and predicting ability of our MOE-based and Rosetta-based interface design protocol were validated on a well-studied remdesivir-resistant mutant of nsp12, named V557L of SARS-CoV [28]. As described in our recent study [22], first, the MOE resistance scan methodology was used, where Val557 was designed with A, D, E, G, I, L, M, and F, and the affinities between designed nsp12 and remdesivir were calculated.
Second, the Rosetta interface design protocol was used, where Val557 was designed with naturally sampled SNP residues, although the remaining nsp12 residues were only repacked without design. In this run, 5000 designs were generated and analyzed for the designed mutations with their respective total scores. Finally, the resistance mutations from our design calculations were compared with the known SARS-CoV-2 nsp12 sequences available in the CoV-GLUE database [29]. The frequency of the mutations retrieved from CoV-GLUE was compared with the favipiravir binding site of nsp12 to determine the accuracy of our design methodology.

Mutational landscapes of the top-ranked affinity-attenuating and affinity-enhancing designs
WebLogo was used to obtain and plot the type and frequency of designed amino acid residues in the 41 favipiravir-interacting residues of nsp12 from the top-ranked affinityenhanced, and affinity-attenuated designs [30].

Calculation of intermolecular interactions between favipiravir-bound nsp12 designs
The intermolecular interactions between top-ranked affinity-attenuating and affinityenhancing favipiravir-nsp12 designs were identified using the Arpeggio web-server [31].

Analysis of nsp12-favipiravir interacting residues and selection of nsp12 residues for designing
The cryo-EM structure of SARS-CoV-2 RdRp in complex with favipiravir-RTP revealed the interacting residues between nsp12 and favipiravir ( Figure 1a) [32]. A total of 72 residues of nsp12 were found to be interacting with favipiravir, of which those that interacted with the template:primer dsRNA were identified as essential for the catalytic activity of RdRp. Overall, 41 residues of nsp12 were identified and subjected to design; these were primarily located within 6 Å distance of favipiravir. Certain critical residues, such as Asn691, Lys545, and Ser814, formed hydrogen bonds with favipiravir.

Identification of resistance mutations from MOE-based design experiments
Next, we designed 41 shortlisted residues of nsp12 using resistance mutation scan methodology employed in Molecular Operating Environment (MOE), where each residue was mutated with only a single nucleotide polymorphism (SNP) of the wild-type sequence.
As a result, the mutations were confined to SNPs to mimic the variations that may occur naturally during the evolution of the virus. Table 1 shows a list of SNPs sampled and designed for 41 favipiravir-interacting residues of nsp12. A total of 350 single point mutants potentially associated with the resistance development were generated, and the affinity between the designed proteins and favipiravir was calculated. The relative binding affinity of the mutation in comparison with the wild-type protein (dAffinity) was calculated.
Here, a positive value indicates that the mutation had a lower affinity to favipiravir, suggesting that such mutant could become resistant to the ligand.  Figures 2 and 3. Furthermore, a stringent dAffinity cutoff revealed that 13 mutants had dAffinities higher than 0.2 kcal/mol ( Figure 4). These mutants with the highest dAffinity values can be significant for the development of the resistance against favipiravir. Notably, most of these 13 mutants originated from mutations in four nsp12 residues, namely His439, Cys622, Asp623, and Thr680 ( Figure   4). Furthermore, certain residues, such as Asp452, Tyr456, Met542, Tyr546, Ala554, Trp617, Tyr619, Cys622, Arg624, Gly679, and Trp800 were susceptible to developing resistance during evolution if an immune and drug response was mounted on SARS-CoV-2, as revealed by their affinity profiles (Figures 2-4). Using this alternative method, we then carried out a computational fitness test to sample only those mutations that were not lethal to the virus and that more likely to evolve naturally over time to assist RdRp in maintaining structural and functional integrity.

Rosetta-based design of favipiravir-binding nsp12 region and associated physicochemical features
After we identified single point mutants that has the potential to induce favipiravir resistance, Rosetta-based ligand interface-design of the favipiravir-binding site of nsp12 in the nsp7-nsp8-nsp12-dsRNA complex was performed [24]. This was conducted to obtain the hotspot residues of nsp12 and better understand the mutational landscape during the emergence of drug resistance. Although 72 residues of nsp12 were found to interact with favipiravir, 41 were selected to design with backbone flexibility, as they did not interact with the template:primer dsRNA and were not involved in the catalytic activity.
In this step, each of the 41 residues was mutated based on the corresponding SNPs to mimic the mutations that are more likely to occur naturally during the evolution of the protein (Table 1). During the design experiment, nsp12 residues other than the interfacial residues were only repacked. A total of 100,000 designs were generated, and their physicochemical parameters were analyzed. The 100,000 designs enclosing the favipiravir-binding site of nsp12 were categorized into affinity-enhancing and affinityattenuating designs based on their binding affinities and control values.
First, 100,000 designs were examined for Rosetta total scores versus their root mean square deviations (RMSDs). The Rosetta total score is the sum of individual energy terms, including physical forces and other statistical terms, whereas the RMSD is computed between the designed structures with respect to the wild-type nsp7-nsp8-nsp12-template:primer dsRNA complex structure with favipiravir-RTP. It was observed that nearly all designs retained RMSDs below 1.5 Å, and more than half of them retained RMSDs below 1 Å, suggesting they did not considerably change from the initial complex structure during designing 41 interface residues of nsp12 and introducing mutations (Figure 5a). At this stage, we performed a control run, where 41 residues of nsp12 were only repacked without designing. This distinguished the affinity-attenuating and affinity-enhancing designs, indicating that the affinity-attenuating designs showed a minor increase in RMSDs in their overall structures and comparatively unfavorable energetics due to the introduction of missense mutations.
Second, the designs were analyzed for binding affinities between the designed nsp12 with favipiravir (represented as interface delta) versus the Rosetta total scores.
The affinity-attenuating designs had considerably lower binding affinities and lower total scores than the affinity-enhancing and control designs (Figure 5b). This suggested that unfavorable energetics and lower total scores influenced the binding affinity in the affinityattenuating designs.
Third, we computed the RMSDs versus interface delta for the 100,000 designs.
We found that several affinity-attenuating designs with lower binding affinities displayed higher RMSDs of over 1.2 Å, suggesting that the corresponding mutations destabilized the interactions (Figure 5c). However, several affinity-attenuating designs with lower binding affinities retained RMSDs as compared to the affinity-enhancing and control designs as well, indicating that residue-specific side-chain movements could have contributed to the reduced binding affinities in those cases.
Finally, the interface delta of the designs was evaluated against the percentage sequence identities of the generated designs. Interestingly, although nsp12 has 41 residues interacting with favipiravir, only a few of them were potentially hotspot residues and prone to mutations (Figure 5d). On average, the designs retained a sequence identity of 97 to 98% as compared to the wild-type structure. The affinity-attenuating designs with lower binding affinities were less conserved than their counterparts (Figure 5d), demonstrating that nsp12 developed resistance against favipiravir with only selected mutations at a few hotspot residues whenever SARS-CoV-2 encounters evolutionary and immune/drug response.
A comparison of MOE-derived single point resistance-developing mutant with that of the Rosetta-generated mutational landscape demonstrated that both methods predicted certain commonly occurring mutants, such as D623A and D623G in affinityattenuating designs (Figures 2, 3, 4, and 6). Furthermore, the Rosetta-based design revealed a few more mutation types for hotspot residues. This sequence-specific conservation and diversity of the favipiravir-bound nsp12 designs indicated that, in the future, these hotspot residues could undergo selective mutations to establish a resistance to favipiravir and similar drugs. This can facilitate the spread and survival of SARS-CoV-2 or a related virus.

Validation of design protocols
To confirm the predictive ability and accuracy of the MOE-based and Rosetta-based design methodologies, a well-defined remdesivir-resistant V557L mutant of SARS-CoV was assessed. The chikungunya virus and H1N1 influenza A virus have been shown to develop resistance against favipiravir. However, since their 3D structures are not available, we validated our findings with SARS-CoV V557L mutant, as Val557 is conserved in the nsp12 of SARS-CoV-2. To evaluate whether our MOE and Rosetta design methodologies could rank L557 among the low-affinity designs, we scanned residues in the 557 th position with A, D, E, G, I, L, M, and F residues and realized that V557L was ranked as the low-affinity mutant in the design calculations (Supplementary Figure 2). These observations validated the capability of our design methodologies to score and rank-order of the affinity-attenuating designs that can develop favipiravir resistance in SARS-CoV-2. However, because V557L is a remdesivir-resistant mutation of SARS-CoV, the SARS-CoV-2 may not mutate Val557 residue to acquire favipiravir resistance. Instead, it could use any other hotspot residues to develop resistance. Finally, mutations at the favipiravir binding site of RdRp were obtained from the CoV-GLUE database, and their frequency of occurrence was determined. We observed that out of 134 mutations, 63 mutations were already predicted as resistant, as shown in our MOEbased design calculations, thus attaining ~47% correlation match with the sequencing data ( Figure 7). Notably, many high-frequency mutations such as A554S, A554T, A554V, W617C, Y619C, and H810Y were already found to be favipiravir resistant in our design calculations ( Figure 7). However, it should be noted that the frequencies of the mutations recorded in the CoV-GLUE database may not directly correlate with the calculated dAffinities. As the pandemic advances and more sequencing data becomes available, it is possible that the other sampled mutations from our designs may evolve and correlate with the new sequences. In conclusion, the control experiments validated the design methodology in scoring and rank-ordering the affinity-attenuating designs, leading to the emergence of potential mutations for favipiravir resistance in SARS-CoV-2.

Intermolecular interactions between favipiravir and nsp12 in the top-ranked affinity-attenuating and affinity-enhancing designs
We examined the intermolecular interactions between favipiravir and nsp12 in the topscored affinity-attenuating and affinity-enhancing designs. The top-scored affinityenhancing design formed 237 interactions, while the affinity-attenuating design formed 221 interactions ( Table 2). The van der Waals, proximal, and metal-complex interactions played a significant role in decreasing the affinity between favipiravir and nsp12 in the affinity-attenuating designs (Figures 8a, 8b). A ligand interaction diagram from the affinityattenuating and affinity-enhancing designs showed that the lack of a hydrogen bond between Asn691-favipiravir and metal-ion bonds significantly contributed to the lower binding affinity in the affinity-attenuating designs (Figures 8c, 8d). These findings suggest that a combination of different strategies encompassing multiple vaccines and antiviral therapeutics will be required to effectively manage COVID-19.

Discussion
SARS-CoV-2 jumped to humans in late 2019 via a possible transmission through bats, civets, or pangolin. Therefore, SARS-CoV-2 may not be completely adapted to its human host due to its recent association with this host. However, since its emergence, the SARS-CoV-2 has undergone a considerable number of mutations and recombination events. Although the majority of mutations detected were neutral, these are gradually accumulating, leading to increased genomic diversity. Although the current rate of sequence variations among SARS-CoV-2 isolates is modest, this RNA virus has ample opportunity to generate multiple variants because of the rapid rate of mutations and recombination events over the course of the pandemic (https://www.gisaid.org/). All this increases the probability of the emergence of more infectious strains in the future [33].
According to the population genetics theory, the neutral and beneficial mutations that affect viral fitness can reach higher frequencies. The development and spread of a mutation are governed by several factors, including population growth, founder effects, range expansion, and random genetic drift, as well as by a potential positive selection pressure to confer enhanced transmissibility or drug and immune resistance. Therefore, SARS-CoV-2 may diverge into phenotypically distinct lineages as it establishes itself as an endemic human pathogen. The non-synonymous D614G mutation in the viral Sprotein was the first mutation detected in early March 2020 that rapidly dominated by July 2020 [34]. Although the D614G mutant isolate is not linked to increased mortality or clinical severity, it is more transmissible [35,36] and enhances the viral replication rate [37,38]. This was followed by the discovery of several mutations in the S-protein RBD in mink and humans associated with mink farms [39]. Favipiravir and remdesivir are the key drugs currently used to treat patients with COVID-19 in several countries [42]. They are being tested in clinical trials globally for treating SARS-CoV-2 infection. Favipiravir is considerably cheaper than remdesivir and is, therefore, more affordable for developing and poorer countries. Like other drug therapies, a major challenge with the widespread use of remdesivir and favipiravir could be the potential development of resistance [21]. These drugs are nucleoside analogs and target the RdRp of SARS-CoV-2 to stop viral replication and transcription. The viral RdRp has a high mutation rate, which could facilitate antigenic drift in response to the host immune pressure and allow the emergence of resistance against the antivirals [43].
Although no natural favipiravir-resistant mutants have been found in influenza viruses [14,[44][45][46][47][48], an in vitro study has reported that the K229R mutation in the influenza virus RdRp induces favipiravir resistance [20]. Although the K229R mutation reduces the polymerase activity, the fitness cost of this mutation is compensated by a P653L mutation in RdRp [20]. Therefore, P653L mutation can restore RdRp activity while maintaining favipiravir resistance. A combination of K229R and P653L mutations resulted in a virus strain that was 30-fold less susceptible to favipiravir than the wild-type virus while maintaining the replication kinetics. Notably, significant favipiravir resistance was reported from experimental evolution studies in the enterovirus 71 [19] and Chikungunya virus [18].
Overall, these data indicate a potentially universal mechanism for favipiravir resistance functioning under evolutionary and survival pressure. The present study will help better understand the structural dynamics of susceptible hotspots and resistance mutations and their possible effects on the efficacy of favipiravir. The validation will require close monitoring of SARS-CoV-2 evolution using multiple high-throughput genome analyses and SNP evaluation in the coming days of the pandemic.

Limitations of the study
This study had a few limitations. First, computationally predicted resistance mutations of nsp12 against favipiravir should be validated using biochemical experiments to quantify the gain/loss and strength of interactions between nsp12 designs and favipiravir to get a complete understanding of the evolution of favipiravir-resistant mutations. Second, as the Rosetta-based interface design protocol aims to design mutants with improved binding affinity to favipiravir, a more appropriate design methodology and scoring function is required to scan for resistance mutations (designs that reduce binding affinity to a ligand).
However, to ensure that our design methodology correctly predicted the affinityattenuating designs representing the resistance mutations, we conducted adequate control experiments on a SARS-CoV remdesivir-resistance mutant. Using the publicly available SARS-CoV-2 nsp12 sequences, we also ensured that the most plausible single point mutations characterizing the favipiravir resistance mutants are identified from our design experiments. In addition, the MOE-based resistance design approach was employed to ensure that the most plausible single point mutants characterizing the resistance mutations are identified and predicted quantitatively.

Conclusions
Pandemics have been affecting humanity for centuries. One positive aspect of the COVID-19 pandemic has been the unprecedented availability of scientific and technological advances that allowed the rapid development of therapeutic strategies. The SARS-CoV-2 has acquired only modest genetic variations to date with no specific mutation contributing to drug resistance. However, it is impossible to rule out the future development of resistance against the currently used drug regimens. Considering the critical importance of defining possible signatures for adaptation in SARS-CoV-2 against the current drugs, we conclude that our work will provide grounds for a better understanding of the favipiravir resistance and COVID-19 management and may offer structural strategies for the development of more effective therapeutics. coordination, and drafted the manuscript. All authors read and approved the final manuscript.

CONFLICT OF INTEREST
The authors declare no conflict of interest.    affinity-attenuating designs are displayed. The X-axis represents the residue index, and the Y-axis shows the sequence conservation of an amino acid at that position. The height of symbols within the stack indicates the relative frequency of a particular amino acid at that position. The red arrow shows residues that experienced diverse sequence variations between the two groups with a relatively higher number of sampled amino acids, whereas the blue arrow shows residues that exhibited diverse mutations between the two groups but with fewer amino acids sampled.