Cross-Reactivity Between SARS-CoV-2 Proteins and Proteins in Pneumococcal Vaccines May Protect Against Symptomatic SARS-CoV-2 Disease and Death

A significant inverse correlation exists between rates of pneumococcal vaccination, at both national and local levels, and symptomatic cases of SARS-CoV-2 infection and death. No such correlation exists for BCG, Hib, diphtheria-tetanus-pertussis, measles-mumps-rubella, or poliovirus vaccinations. This paper explored the possibility that pneumococcal vaccines contain antigens that might be cross-reactive with SARS-CoV-2 antigens and that such cross-reactive antigens are absent from other vaccines. Comparison of the glycosylation structures of SARS-CoV-2 with the polysaccharide structures of pneumococcal vaccines yielded no obvious similarities. However, while pneumococcal vaccines are primarily composed of capsular polysaccharides, they also contain about three percent protein contaminants, including the pneumococcal surface proteins PsaA, PspA and probably PspC. These proteins have very high degrees of similarity, using very stringent criteria, with several SARS-CoV-2 proteins including the spike protein, membrane protein and replicase 1a. Equivalent similarities were found at statistically significantly lower rates, or were completely absent, among the proteins in diphtheria, tetanus, pertussis, measles, mumps, rubella, and poliovirus vaccines. Appropriate data were not available for testing Hib and BCG similarities. Notably, PspA and PspC are highly antigenic, and new pneumococcal vaccines based on them are currently in human clinical trials, so their effectiveness against SARS-CoV-2 disease is easily testable.


INTRODUCTION
In an attempt to explain why people in some nations and locales are apparently much more susceptible to serious disease than others, a previous study found a very significant inverse correlation between rates of pneumococcal vaccination at both national and local population levels and rates of SARS-CoV-2 infections and death (Root-Bernstein, 2020). No correlation was found for the tuberculosis vaccine BCG (Bacillus Calmette Guerin), Haemophilus influenzae type B (Hib), diphtheria-tetanus-pertussis, measles-mumps-rubella, or poliovirus vaccinations. The purpose of this paper is to provide a possible mechanism for how pneumococcal vaccines might protect against SARS-CoV-2 while the other vaccines do not.
The hypothesis tested is that antigens in pneumococcal vaccines induce antibodies protective against SARS-CoV-2 by means of cross-reactivity with similar SARS-CoV-2 antigens. There are two types of antigens that might play such a role, one being the capsular polysaccharide antigens in current pneumococcal vaccines and the other the proteins that they contain. An extensive search for polysaccharide structures shared by SARS-CoV-2 glycosylated proteins (Watanabe, et al., 2020) and S. pneumoniae serotypes (Shajahan, et al., 2020) failed to identify any, so the search then shifted to possible protein similarities.
While current pneumococcal vaccines are composed primarily of capsular polysaccharides, they also contain one or both of two types of proteins. The polysaccharide component is never pure, generally containing around three percent of the cell surface proteins to which the polysaccharides are attached (WHO, 2010; Morais, et al., 2018; Lee, et al., 2020). Proteins identified in pneumococcal vaccines include pneumococcal surface protein A (PspA) and pneumococcal surface adhesin A (PsaA) (Yu, et al., 1999; Yu, et al., 2003). Because the presence of PspA was identified only by immunological methods and PspA cross-reacts strongly with an additional pneumococcal surface protein, PspC (also known as CbpA and SpsA) (Brooks, et al., 1999; Ogunniyi, et al., 2001), it is likely that PspC is also present in capsular polysaccharide-based pneumococcal vaccines. Additionally, pneumococcal conjugate vaccines covalently attach the polysaccharides to a modified diphtheria toxin protein called Cross-Reactive Material 197 (CRM197), which is also present in Hib and meningitis vaccines (Möginger, et al., 2016).
This study reports that SARS-CoV-2 proteins contain significant regions that mimic sequences within pneumococcal surface proteins but not CRM197 or proteins present in other vaccines.

METHODS
To ascertain whether PspA, PsaA, PspC and CRM197 have regions of significant similarity to SARS-CoV-2 proteins, LALIGN (at www.expasy.org) was employed to perform pairwise protein comparisons. The parameters chosen were: 20 best alignments to show; the BLOSUM80 matrix (in order to maximize small, local similarities); E = 10; and a gap penalty of -10.0 (to maximize continuous sequence similarities). SARS-CoV-2 sequences were retrieved from https://viralzone.expasy.org/8996 as HTML files or using the accession numbers from the UniProtKB database (UniProtKB accession numbers P0DTC1-P0DTC9). Streptococcus pneumoniae PspA, PsaA and PspC sequences were retrieved by accession number (provided in the Tables below) from the UniProtKB database. Because different streptococcal serotypes have slightly different versions of these proteins, several were randomly selected for each search, and the sequence similarities displayed in FIGURE 1 are representative of the results for several serotypes.
The LALIGN results were culled by applying the criterion that any sequence similarity reported must have a region containing at least six out of ten identities, where a pair of acceptable substitutions (as determined by the BLOSUM80 matrix) counted as one identity. This criterion is based on a number of experimental studies involving the average length of peptide recognized by major histocompatibility receptors and T cell receptors (Rudensky, et al., 1991; Hemmer, et al., 2000; Ekeruche-Makinde, et al., 2013) and the degree of similarity between two antigens that is likely to induce cross-reactive immune responses (Cunningham, et al., 1989; Hemmer, et al., 2000; Root-Bernstein, 2009; Root-Bernstein and Podufaly, 2012; Root-Bernstein, 2014).
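The six-in-ten criterion can be illustrated with a short sliding-window check. This is a minimal sketch under stated assumptions, not the procedure actually used to cull the LALIGN output: the function names, the toy sequences, and the `acceptable` set of substitution pairs are all hypothetical, and the rule "a pair of acceptable substitutions counts as one identity" is interpreted here as two acceptable substitutions within a window adding one to the identity count.

```python
def window_score(seg_a, seg_b, acceptable):
    """Identities plus one credit per two acceptable substitutions."""
    identities = 0
    substitutions = 0
    for x, y in zip(seg_a, seg_b):
        if x == "-" or y == "-":
            continue  # gap positions contribute nothing
        if x == y:
            identities += 1
        elif frozenset((x, y)) in acceptable:
            substitutions += 1
    return identities + substitutions // 2


def meets_six_in_ten(aligned_a, aligned_b, acceptable=frozenset(),
                     window=10, threshold=6):
    """True if any window of the alignment reaches the threshold score."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    last_start = max(len(aligned_a) - window, 0)
    for start in range(last_start + 1):
        seg_a = aligned_a[start:start + window]
        seg_b = aligned_b[start:start + window]
        if window_score(seg_a, seg_b, acceptable) >= threshold:
            return True
    return False


# Toy aligned segments (hypothetical, not taken from the figures).
toy_a = "MKTAYIIIIG"
toy_b = "MKTAYVVVVC"
# Assumed example of an acceptable substitution pair; a real run would
# derive these from the positive-scoring entries of the scoring matrix.
toy_acceptable = frozenset({frozenset(("I", "V"))})

strict = meets_six_in_ten(toy_a, toy_b)                    # identities only
relaxed = meets_six_in_ten(toy_a, toy_b, toy_acceptable)   # with substitutions
```

In the toy example the two segments share five identities, so they fail on identities alone, but the four I/V substitutions add two further credits and bring the window to seven.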
As controls for the LALIGN results, all SARS-CoV-2 proteins were used to search for similarities to bacterial proteins used in diphtheria, pertussis, and tetanus vaccines (TABLE 1) and viral proteins incorporated into the measles, mumps, rubella and polio vaccines. The only identified proteins in Hib and meningitis vaccines are CRM197 or meningococcal outer membrane complex protein, so these were also examined for similarities to SARS-CoV-2 proteins (TABLE 1). The same criteria used above were applied to screen the results for sequences having at least six identities in a span of ten amino acids. The proteome of the Bacillus Calmette Guerin (BCG) vaccine strain (a version of Mycobacterium bovis) was not available in the UniProtKB complete proteome dataset and was therefore not subjected to similarity searching.
Only results that met the six-in-ten antigenic cross-reactivity criterion just stated, as well as an E value equal to or less than 0.1 and a Waterman-Eggert (W-E) score of 60 or above, are reported, since these are statistically unlikely to have appeared by chance.
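The three reporting thresholds can be combined into a single filter, sketched below. The `Alignment` record, its field names, and the toy hits are hypothetical illustrations and do not correspond to any LALIGN output format or to the matches in the figures; only the thresholds (E of 0.1 or less, W-E score of 60 or above, six-in-ten criterion satisfied) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    pair: str          # label for the protein pair (hypothetical)
    e_value: float     # E value reported for the local alignment
    we_score: float    # Waterman-Eggert score
    six_in_ten: bool   # whether the six-in-ten criterion was met


def is_reportable(hit, max_e=0.1, min_we=60.0):
    """Apply all three thresholds; a hit must pass every one of them."""
    return hit.six_in_ten and hit.e_value <= max_e and hit.we_score >= min_we


# Toy hits chosen so that each threshold rejects exactly one of them.
hits = [
    Alignment("toy pair 1", 0.02, 75.0, True),   # passes everything
    Alignment("toy pair 2", 0.50, 80.0, True),   # E value too high
    Alignment("toy pair 3", 0.01, 45.0, True),   # W-E score too low
    Alignment("toy pair 4", 0.05, 62.0, False),  # fails six-in-ten
]

reportable = [h.pair for h in hits if is_reportable(h)]
```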
Two statistical tests were used to evaluate the results. A Student's t-test (https://www.usablestats.com/calcs/2samplet) was used to test the significance of the difference between the number of pneumococcal protein similarity matches satisfying the criteria listed above and the number of matches from other vaccine proteins satisfying the criteria. A chi-squared test (https://www.graphpad.com/quickcalcs/chisquared2/) was used to determine the significance of the difference in the percentage of protein pairs that had at least one significant similarity as compared with the number that had no similarities (36 possibilities for 9 SARS-CoV-2 proteins versus 4 streptococcal proteins; 288 possibilities for 9 SARS-CoV-2 proteins versus the 32 bacterial and viral proteins listed in TABLE 1).

RESULTS
Results of the similarity searches that satisfy the W-E score of 60 or greater, E values of 0.1 or less, and in which at least six identical amino acids are present in a sequence of ten, are presented in FIGURES 1 and 2. FIGURE 1 displays the similarities between nine SARS-CoV-2 proteins (P0DTC1-9) and streptococcal proteins PsaA, PspA, PspC. Twenty significant similarities were observed, ten of which are indicated in the figure in bold type as sequences that repeat within a pair of proteins. Note that a significant sequence similarity was also found between SARS-CoV-2 proteins and the S. pneumoniae Gram-positive anchor protein (Q8DRK2), which serves as an anchor site for capsular polysaccharides. It is not known at this time whether this protein is among those contaminating capsular polysaccharide preparations, but because of its association with polysaccharide anchoring, it is likely to be such a contaminant of the polysaccharide material used in pneumococcal vaccines. Each of the four streptococcal proteins was tested against each of the SARS-CoV-2 proteins, yielding 36 pairwise tests. Six of these combinations yielded one or more matches that satisfied all similarity criteria employed here.
No significant similarities were found between CRM197 and any SARS-CoV-2 protein or between meningococcal outer membrane protein complex and any SARS-CoV-2 protein. FIGURE 2 displays the results for the pairwise tests of the nine SARS-CoV-2 proteins with the additional bacterial and viral proteins listed in TABLE 1 that are present in measles, mumps, rubella, polio, diphtheria, pertussis, and tetanus vaccines, for a total of 32 microbial proteins. Of these, six yielded one or more significant similarities, for a total of nine matches out of 288 possible pairwise combinations.
Statistical tests demonstrate that the results reported above are highly significant. All four of the pneumococcal proteins tested had significant similarities to at least one of the nine SARS-CoV-2 proteins. Altogether, six of the 36 possible pairings of pneumococcal and SARS-CoV-2 proteins yielded significant similarities, or 16.7 percent. In contrast, only six of the 32 other viral and bacterial vaccine proteins tested had significant matches to any of the nine SARS-CoV-2 proteins (2.1% of the 288 pairwise comparisons). The four pneumococcal proteins yielded 21 significant matches, for an average of 5.25 per protein, while the 32 other vaccine proteins yielded only nine significant matches, for an average of 0.28 per protein. The t-test comparing the number of matches from FIGURES 1 and 2 yielded a test statistic of t = 3.8081, corresponding to a P value of 0.0002. The chi-squared test comparing the 16.7% of protein comparisons that yielded at least one significant similarity from FIGURE 1 (6 categories out of 36 pairwise comparisons) with the 2.1% (6 categories out of 288 pairwise comparisons) from FIGURE 2 yielded a chi-squared value of 114.796, corresponding to a P value of 0.0001.
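For readers who wish to examine the pair counts themselves, the sketch below runs a standard Pearson chi-squared test of independence on the 2 x 2 table formed by the counts stated above (6 of 36 pneumococcal pairings and 6 of 288 control pairings with at least one significant similarity). This is an illustrative calculation under stated assumptions: it uses no continuity correction, it is not necessarily the computation performed by the online calculator cited in Methods, and its statistic need not equal the value reported there. The function name is hypothetical. Because one expected cell count is small, an exact test may be a more appropriate choice for a table of this kind.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns the statistic and the upper-tail P value. For one degree of
    freedom the tail probability is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: pneumococcal pairings, control pairings.
# Columns: at least one significant similarity, none.
chi2, p_value = chi_squared_2x2(6, 36 - 6, 6, 288 - 6)
print(f"chi-squared = {chi2:.2f}, P = {p_value:.2g}")
```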

DISCUSSION
The results of this study indicate that proteins known to contaminate pneumococcal vaccines significantly mimic SARS-CoV-2 proteins (FIGURE 1) while the CRM197 used to conjugate various polysaccharide vaccines and the proteins contained in other vaccines are statistically significantly less likely to do so (FIGURE 2). In particular, the results point to potential cross-reactivity between SARS-CoV-2 proteins and the pneumococcal proteins PspA and PsaA, which are known to contaminate polysaccharide-based pneumococcal vaccines (WHO, 2010; Morais, et al., 2018; Lee, et al., 2020), as well as PspC, which it is reasonable to assume is another such contaminant since it derives from the same outer membrane protein complex and is highly cross-reactive with the antibodies against PspA used to demonstrate the presence of PspA in vaccines (Brooks, et al., 1999; Ogunniyi, et al., 2001).
The concentration of protein contaminants in pneumococcal vaccines is sufficient to induce immunity. In Prevnar-13, for example, there are 30.8 µg of capsular polysaccharides and 34.0 µg of CRM197, for a total of 64.8 µg of antigen per dose (FDA, 2017). Protein contaminants may make up an additional 3%, or 1.94 µg, of antigenic material according to WHO guidelines and as confirmed by laboratory analysis (WHO, 2010; Morais, et al., 2018; Lee, et al., 2020). This 1.94 µg of protein is comparable to the 2.2 µg of each of twelve of the capsular polysaccharides present (plus 4.4 µg of serotype 6B) or the 2.3 µg of CRM197 conjugated to each polysaccharide type (FDA, 2017) and is therefore quite sufficient to induce an immune response, especially since PspA and PspC are strongly cross-reactive. Pneumovax-23, in contrast, has 25 µg of each capsular polysaccharide, adding up to a total of 575 µg of antigen. The three percent protein contamination allowed by WHO (WHO, 2010; Morais, et al., 2018; Lee, et al., 2020) could result in 17.25 µg of PsaA, PspA and PspC per dose, which is certainly sufficient to induce immunity. For comparison, each 0.5-mL dose of Adacel®, a diphtheria-tetanus-pertussis vaccine (Sanofi Pasteur), contains 2.5 µg detoxified pertussis toxoid, 5 µg FHA, 3 µg pertactin and 5 µg FIM acellular pertussis antigens (CDC, 2020).
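The dose arithmetic in this paragraph can be checked with a few lines of code. The sketch below is illustrative only: it sums the per-serotype amounts quoted above (twelve polysaccharides at 2.2 µg, one at 4.4 µg, 34.0 µg of CRM197, and 23 polysaccharides at 25 µg for Pneumovax-23) and applies the three percent figure as an upper bound on protein contamination. The function and variable names are hypothetical.

```python
def contaminant_protein_ug(total_antigen_ug, fraction=0.03):
    """Upper-bound protein contaminant mass, assuming a 3% allowance."""
    return total_antigen_ug * fraction


# Prevnar-13: twelve serotypes at 2.2 ug each, one at 4.4 ug, plus carrier.
prevnar_polysaccharide_ug = 12 * 2.2 + 4.4
prevnar_total_ug = prevnar_polysaccharide_ug + 34.0

# Pneumovax-23: 23 serotypes at 25 ug each, no carrier protein.
pneumovax_total_ug = 23 * 25.0

prevnar_contaminant_ug = contaminant_protein_ug(prevnar_total_ug)
pneumovax_contaminant_ug = contaminant_protein_ug(pneumovax_total_ug)

print(f"Prevnar-13: {prevnar_total_ug:.1f} ug antigen, "
      f"up to {prevnar_contaminant_ug:.2f} ug protein contaminant")
print(f"Pneumovax-23: {pneumovax_total_ug:.0f} ug antigen, "
      f"up to {pneumovax_contaminant_ug:.2f} ug protein contaminant")
```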
In addition to being present in concentrations that could induce protective immunity, the pneumococcal-SARS-CoV-2 similarities reported here satisfy multiple criteria involving sequence identities and statistical measures for predicting potential antigenic cross-reactivity, so it is possible that pneumococcal vaccination can protect individuals against SARS-CoV-2 disease. Evidence of protection against SARS-CoV-2 by T cells reactive to unidentified, cross-reactive microbes has been reported by Grifoni, et al. (2020). The study reports that 40 to 60% of people unexposed to SARS-CoV-2 had SARS-CoV-2-reactive CD4+ T cells. The assumption made by the authors is that the cross-reactivity is to coronaviruses that cause colds. However, the study also reports that this cross-reactive immunity is greatest in young people and least in older people, which is not consistent with cold virus exposures. Such waning immunity is, however, consistent with waning childhood vaccination immunity. In light of the data presented here, it is therefore possible that at least some proportion of individuals with cross-reactive immunity developed it through exposure to pneumococcal vaccinations. Such cross-reactivity would also explain the epidemiological observation that pneumococcal vaccination rates, but not vaccination rates with any other commonly used vaccine (DTP, MMR, polio, meningitis), correlate inversely with rates of serious SARS-CoV-2 disease and death (Root-Bernstein, 2020).
The observation that viral and bacterial proteins exhibit antigens similar enough to be cross-reactive may be surprising, but it is not novel. Härkönen, et al. (2000) found that rabbit antibodies to HSP65 of Mycobacterium bovis (from which BCG is derived) recognized capsid protein VP1 of coxsackievirus A9 and VP1 and/or VP2 of coxsackievirus B4. Misko, et al. (1999) demonstrated that Epstein-Barr virus mimicked a Staphylococcus aureus replication initiation protein and induced antibodies cross-reactive with it. Trama, et al. (2014) and Williams, et al. (2015) have documented antibodies against the gp41 protein of human immunodeficiency virus that cross-react with commensal bacteria in the human gut. Ross, et al. (1990) reported that sera from chickens inoculated with infectious bursal disease viruses or infectious bursal disease vaccines cross-reacted with Mycoplasma gallisepticum and Mycoplasma synoviae. And Bordenave (1973) found that antibodies against Salmonella abortusequi also recognized tobacco mosaic virus. In short, while the phenomenon may be rare (indeed, the data reported here suggest that such similarities may occur at a rate of about 1/200 pairwise combinations), bacterial antigens are known to occasionally induce antibodies that cross-react with viral antigens, and vice versa.
The rarity of antigenic mimicry between bacteria and viruses is emphasized by the finding that there are no significant similarities between SARS-CoV-2 proteins and the vaccine conjugate proteins, CRM197 and meningococcal outer membrane protein. This lack of antigenic mimicry suggests that neither protein is likely to contribute to the possible SARS-CoV-2 protection associated with conjugated pneumococcal, Hib or meningococcal vaccines. This negative result is consistent with the absence of any epidemiological association between Hib and meningococcal vaccines and SARS-CoV-2 rates of disease or death (Root-Bernstein, 2020).
The almost completely negative results reported here for antigenic mimicry between SARS-CoV-2 proteins and proteins from poliovirus, measles, mumps, diphtheria, pertussis and tetanus are also consistent with the lack of association between these vaccines and SARS-CoV-2 rates of disease or death (Root-Bernstein, 2020). Franklin, et al. (2020) report "significant" similarities between both rubella and measles proteins and SARS-CoV-2, the key results of which were independently reproduced here in FIGURE 2. However, significantly fewer similarities were found for measles and rubella proteins than for pneumococcal proteins, and epidemiological evidence does not support measles-containing vaccines (which often include rubella) as protective against SARS-CoV-2 (Root-Bernstein, 2020). The suggestion that polio vaccine be tested as a SARS-CoV-2 preventive (Chumakov and Gallo, 2020) is likewise not supported by the data presented here (FIGURE 2) or by epidemiological data (Root-Bernstein, 2020).
Tuberculosis (BCG) vaccination has also been proposed to be epidemiologically associated with protection against SARS-CoV-2, but unfortunately, because the Bacillus Calmette Guerin (BCG) proteome is not in the UniProtKB database, it was not possible to perform the type of LALIGN analysis of BCG vaccine that was performed for the other vaccines. BCG vaccination was found to be associated with SARS-CoV-2 protection in one epidemiological study (Miller, et al., 2020) but not in another (Root-Bernstein, 2020) so that the current literature is divided on the issue and further research will be needed.
The observation of an inverse association of pneumococcal vaccinations with SARS-CoV-2 rates of disease and death makes sense in terms of the particular protein contaminants of pneumococcal vaccines that are identified in this study as being potentially protective. These are PspA, PsaA and PspC, the specific regions identified in this study being known to be highly antigenic (van de Garde, et al., 2019). Moreover, these proteins are under active investigation as more effective and broadly protective pneumococcal vaccine components to replace the polysaccharide-based vaccines (Briles, et al., 2000; Ferreira, et al., 2009; Schachern, et al., 2014; Lagousi, et al., 2019). Some of these vaccine candidates are already in human trials (Lagousi, et al., 2019; Masomiam, et al., 2020). Thus, it should be possible to determine rapidly whether such pneumococcal protein-based vaccines are effective deterrents to SARS-CoV-2 disease, and these vaccines may provide needed protection until a SARS-CoV-2 vaccine is produced in sufficient quantities to be effective worldwide.

FIGURE 1: Similarities between the four known or probable pneumococcal vaccine protein contaminants (PsaA, PspA, PspC and Gram-positive anchor protein) and SARS-CoV-2 proteins. Multiple variants of each protein were examined; the results provided here are representative, but each serotype presents slightly different matches. 36 pairs of proteins were searched. Only similarities satisfying the criteria laid out in Methods are shown.
FIGURE 2: Similarities between nine SARS-CoV-2 proteins and 32 proteins from measles, mumps, rubella, polio, Hib, meningitis, diphtheria, pertussis and tetanus vaccines (TABLE 1). 288 pairwise combinations were searched. Only similarities satisfying the criteria laid out in Methods are shown.