SARS-CoV-2 prion-like domains in spike proteins enable higher affinity to ACE2

Currently, the world is struggling with the coronavirus disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Prion-like domains (PrDs) are critical for virulence and the development of therapeutic targets; however, the prion-like domains in the SARS-CoV-2 proteome have not been analyzed. In this in silico study, using the PLAAC algorithm, we identified the presence of prion-like domains in the SARS-CoV-2 spike protein. Compared with other viruses, a striking difference was observed in the distribution of prion-like domains in the spike protein, since SARS-CoV-2 was the only coronavirus with a prion-like domain found in the receptor-binding domain of the S1 region of the spike protein. The presence and unique distribution of prion-like domains in the SARS-CoV-2 receptor-binding domain of the spike protein are particularly interesting because, although the SARS-CoV-2 and SARS-CoV S proteins share the same host cell receptor, angiotensin-converting enzyme 2 (ACE2), SARS-CoV-2 demonstrates a 10- to 20-fold higher affinity for ACE2. Finally, we identified prion-like domains in the α1 helix of the ACE2 receptor that interact with the viral receptor-binding domain of SARS-CoV-2. Taken together, the present findings indicate that the PrDs identified in the SARS-CoV-2 receptor-binding domain (RBD) and in the ACE2 region that interacts with the RBD have important functional roles in viral adhesion and entry.


Introduction
The world is struggling with the pandemic caused by a novel coronavirus (now named severe acute respiratory syndrome coronavirus 2, or SARS-CoV-2, causing the disease COVID-19) that has expanded from Wuhan throughout China (1). By March 30, 2020, the virus had caused over 775,000 confirmed cases worldwide and contributed to over 37,000 deaths (https://www.worldometers.info/coronavirus/). SARS-CoV-2 is a new member of the Betacoronavirus (β-CoV) genus of large, enveloped single-stranded RNA viruses (2). This genus not only includes viruses that cause deadly human infections such as severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), but also encompasses viruses that cause non-life-threatening common colds, including human coronavirus OC43 (HCoV-OC43) and human coronavirus HKU1 (HCoV-HKU1) (3). Although these viruses predominantly infect lung epithelial cells, the clinical severity and pathogenesis of the infections they cause vary between different coronaviruses (4). While severe pneumonia and pulmonary fibrosis are fundamental to the pathogenesis of COVID-19, SARS, and MERS, these symptoms are not typical of infections caused by HCoV-OC43 and HCoV-HKU1 (5,6).
Like other β-CoVs, the genome of the novel SARS-CoV-2 virus encodes structural proteins required for the efficient formation of infectious virions; these include the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins (7).
The key determinant of the host specificity of β-CoVs is the surface-located S protein, which plays critical roles in infection by mediating viral attachment to host cell surface receptors and facilitating viral entry (8). The S protein consists of two large regions: N-terminal S1 and C-terminal S2 (9). S1, which includes the receptor-binding domain (RBD), is responsible for recognizing host-cell receptors and has higher sequence variability than S2 (S1 shares around 70% identity with that of other human β-CoVs). The membrane-embedded S2 region, which is responsible for fusion, is more highly conserved than S1 (8,9). In SARS-CoV-2, the RBD in S1 allows the virus to bind directly to the peptidase domain of the host angiotensin-converting enzyme 2 (ACE2), mediating virus entry into sensitive cells (10). Notably, compared to SARS-CoV, SARS-CoV-2 has a higher binding affinity to ACE2 (which is the common receptor for both SARS-CoV-2 and SARS-CoV), with a broader interaction with ACE2 expressed not only in the lungs but also in the kidney, testis, and heart (10,11).
We recently identified, for the first time, viral prion-like domains (PrDs), which we suggest are novel regulators of virion assembly with a role in virus-host cell interactions (12,13). These findings are in line with previous studies showing that, in addition to the pathological role of prions in humans, where they are implicated in Alzheimer's and Parkinson's diseases, diabetes, and many other pathologies, protein misfolding plays important physiological roles in eukaryotes and prokaryotes (14-17).
Though the detailed molecular mechanisms underlying prion formation remain elusive, glutamine (Q)- and asparagine (N)-rich regions characterized by altered hydrophobicity and net sequence charge are known to drive prion formation. This is the basis for a number of algorithms for identifying candidate prionogenic domains (18,19). One such algorithm is prion-like amino acid composition (PLAAC) analysis, which evaluates prion-like domains using a hidden Markov model (HMM) (20).
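The compositional idea behind such algorithms can be illustrated with a minimal sketch that scans a sequence for its most Q/N-rich window. This is not the PLAAC algorithm, which relies on an HMM; the function names, the window length, and the toy sequence are assumptions chosen only for illustration.

```python
def qn_fraction_windows(sequence, window=20):
    """Return (start_index, Q/N fraction) for every full-length window.

    Illustrative only: a plain sliding-window count of glutamine (Q) and
    asparagine (N), not the HMM-based scoring used by PLAAC.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    seq = sequence.upper()
    results = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        qn_count = sum(1 for residue in segment if residue in "QN")
        results.append((start, qn_count / window))
    return results


def richest_window(sequence, window=20):
    """Return the (start_index, fraction) of the most Q/N-rich window, or None."""
    windows = qn_fraction_windows(sequence, window)
    if not windows:
        return None  # sequence shorter than the window
    return max(windows, key=lambda item: item[1])


if __name__ == "__main__":
    toy_sequence = "AAAAQNQNQNAAAA"  # hypothetical peptide, not a viral sequence
    print(richest_window(toy_sequence, window=6))
```

A raw Q/N fraction ignores hydrophobicity and net charge, which is why dedicated tools score candidate regions against trained prion and background models instead.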
In this study, we performed, for the first time, a detailed analysis of the prion-like domains in the spike protein of SARS-CoV-2 and compared SARS-CoV-2 with other human-pathogenic β-CoVs.
Our findings can contribute to a better understanding of the pathogenicity of SARS-CoV-2 and will help to uncover new targets for the development of drugs and vaccines based on the prionogenic properties of particular viral protein regions.

Results
Using the prion-prediction PLAAC algorithm, we analyzed structural proteins derived from the UniProtKB and NCBI databases and identified PrDs in the S proteins of all β-CoVs analyzed in this study (Supplementary Figure S1). The LLR scores of the PrDs of the S proteins were practically identical across the studied β-CoVs, ranging from 4.431 to 4.991 (Supplementary Figure S2).
Notably, with more precise mapping of PrDs within these proteins, we found a striking difference in their localization, with SARS-CoV-2 being the only virus with PrDs identified within the RBD of the S protein (Table 1).
Although SARS-CoV-2 and SARS-CoV (the most closely related human-pathogenic β-CoVs) share the same host-cell receptor, ACE2, SARS-CoV-2 binds to it more tightly (10); we therefore hypothesized that the presence of PrDs in the RBD of SARS-CoV-2 might explain this phenomenon. Consistent with this hypothesis, we found that SARS-CoV-2, along with other residue substitutions, has five substituted amino acids in the RBD compared to SARS- and K343 of a non-PrD of ACE2. Notably, K417 and Y453 were the only residues of the SARS-CoV-2 RBD that were outside the viral PrD and bound to a non-PrD of ACE2 (Figure 2B).

Discussion
This study is the most complete evaluation of PrDs in the S protein of SARS-CoV-2. The results highlight some previously unknown, unique characteristics of SARS-CoV-2 that may play important roles in the pathogenesis and inform the development of new therapeutic strategies.
In this study, we used a high threshold of the PLAAC score for protein identification: only proteins with a high probability of prionogenic properties were included in the analysis. We found that all β-CoVs analyzed contain PrDs in their S proteins. However, SARS-CoV-2 is the only β-CoV that has a PrD in the RBD of the S protein, the region that binds to the ACE2 receptor employed for host cell entry. Furthermore, we discovered specific amino acids (Q474, N481, Q493, Q498 and N501) that enable the prionogenicity of the SARS-CoV-2 RBD and are not found in the RBD of SARS-CoV; of these, Q474, Q498 and N501 directly contact ACE2.
From these analyses, we conclude that the presence of these intrinsically disordered regions in the SARS-CoV-2 RBD might be the reason for its optimized binding to the human ACE2 receptor in comparison to the RBD of SARS-CoV, since the distinguishing characteristic of PrDs is their ability to rapidly shift between multiple conformations due to residue hydrophobicity and net sequence charge (18,22).
Notably, since five of the seven amino-acid interactions that occur between the RBD of SARS-CoV-2 and ACE2 are within their PrDs, it is also interesting to consider whether prion-prion interactions between the virus and the human receptor participate in COVID-19 and whether they contribute to the higher binding affinity. Since other β-CoVs were shown to lack PrDs in the RBD, the presence of PrDs appears to be beneficial, but not necessary, for receptor-mediated virion attachment to the host cell. One of the critical goals of our previous studies was to show that PrDs identified in viruses may have important functional roles in virulence and are particularly associated with viral adhesion and entry.
This study provides proof of this concept, showing that the presence of PrDs in the RBD of SARS-CoV-2 enhances viral binding to its host receptor compared to that of SARS-CoV, which lacks PrDs in its RBD structure. Further analyses of these PrD-containing proteins in SARS-CoV-2 may improve our understanding of COVID-19 and provide new insights into its pathophysiology and novel targets for developing therapies.

Protein Sequences
To identify the PrDs present in viral proteomes, protein sequences were obtained from the UniProt Knowledge Base and National Center for Biotechnology Information (NCBI) database (23, http://www.ncbi.nlm.nih.gov/). Protein functions were manually curated using information from UniProt and NCBI databases. The structure of the RBD-ACE2 complex was established based on the data from PDB ID: 6VW1 and visualized using the YASARA software (http://www.yasara.com) (24,25).

Identification of PrDs in viral proteomes
The presence of PrDs in β-CoV proteomes was determined using the PLAAC algorithm; the output probabilities for the PrDs were constructed based on amino-acid frequencies and similarities with PrDs in Saccharomyces cerevisiae. We used a cutoff of 3.0 for the log-likelihood ratio (LLR) and alpha = 1.0, representing S. cerevisiae background scanning, to identify the PrDs. Prion-like domain amino acid positions were determined based on the PLAAC algorithm program analysis or manually.
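How an LLR cutoff and a background-mixing parameter can work together is sketched below. This is a simplified compositional score, not the PLAAC HMM: the linear interpolation in `mix_background`, the use of natural logarithms, the function names, and the two-letter toy frequency tables are all assumptions made for illustration and are not values used in this study.

```python
import math


def mix_background(yeast_bg, species_bg, alpha):
    """Interpolate background frequencies (assumed linear mixing).

    With alpha = 1.0 the result equals the S. cerevisiae background;
    with alpha = 0.0 it equals the species-specific background.
    """
    return {aa: alpha * yeast_bg[aa] + (1.0 - alpha) * species_bg[aa]
            for aa in yeast_bg}


def window_llr(segment, prion_freqs, background_freqs):
    """Sum of per-residue log-likelihood ratios (natural log, an assumption)."""
    return sum(math.log(prion_freqs[aa] / background_freqs[aa])
               for aa in segment)


def passes_cutoff(llr, cutoff=3.0):
    """True if the score reaches the LLR cutoff (3.0 in the text above)."""
    return llr >= cutoff


if __name__ == "__main__":
    # Hypothetical two-letter alphabet and frequencies, for illustration only.
    prion = {"Q": 0.8, "A": 0.2}
    yeast = {"Q": 0.5, "A": 0.5}
    species = {"Q": 0.4, "A": 0.6}
    background = mix_background(yeast, species, alpha=1.0)
    score = window_llr("QQQQQQQQ", prion, background)
    print(round(score, 3), passes_cutoff(score))
```

In this toy setting a short Q-rich stretch stays below the cutoff while a longer one exceeds it, which mirrors why only sufficiently long and compositionally biased regions are reported as candidate PrDs.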

Statistical analysis
All statistical analyses were conducted using the Statistical package for Windows (version 5.0) (StatSoft, Inc.). Data were compared between viruses using a χ² test or Fisher's exact test. To detect differences in multiple comparisons, one-way analysis of variance (ANOVA) was fitted with the standard confidence interval of 95%. P values < 0.05 were considered statistically significant.

(B to D) Detailed analysis of the interface between the SARS-CoV-2 RBD and ACE2. The structure of the RBD-ACE2 complex was established based on the data from PDB ID: 6VW1 and visualized using the YASARA software (http://www.yasara.com). The ACE2 and RBD molecules are colored green and blue, respectively. Amino acids within the PrD of the RBD that interact with amino acid residues of ACE2 are colored purple, while those in the PrDs of ACE2 that interact with amino acid residues of the RBD are colored green; these interactions are indicated by red dashed lines. Amino acids within the non-PrD of the RBD that interact with amino acid residues of ACE2 are colored yellow, while those in the non-PrDs of ACE2 that interact with amino acid residues of the RBD are colored orange; these interactions are indicated by black dashed lines.
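For readers unfamiliar with the Fisher's exact test named under Statistical analysis, a minimal pure-Python sketch of the two-sided test on a 2×2 table follows. It is not the software routine used in this study; the function name and the example counts are hypothetical and serve only to show the calculation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(table_probability(x)
                  for x in range(lowest, highest + 1)
                  if table_probability(x) <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts, not data from this study.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

The exact test is generally preferred over the χ² test when expected cell counts are small, as is typical when only a handful of viruses are compared.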