Concerns Raised on Blood Group Determinants in Plasma Membrane Interaction of the SARS-CoV-2

The SARS-CoV-2 pandemic has resulted in the generation of evolutionarily related variants. The S-protein of the B.1.1.7 variant (deletion of His69, Val70 and Tyr144 in the N-terminal domain (NTD)) may contribute to altered infectivity. These mutations may have been presaged by animal mutations in minks housed in mink farms that, according to the present analysis by modelling of protein-ligand docking, altered a high-affinity binding site in the S-protein NTD. These mutants likely occurred only sporadically in humans. Tissue adaptations and the size of the mink population relative to the infected human population at that time may have increased the relative mutation rate. Simple, multi-threaded automated docking that is widely available assigns increased binding of the blood type II A antigen to the SARS-CoV-2 S-protein NTD of B.1.1.7, with an overall increased docking interaction of glycolipids harbouring blood group A relative to group B or H (H, p=0.04). The top-scoring glycan is identified as DSGG (also classified as sialosyl-MSGG or disialosyl-Gb5), which may compete with heparin, a molecule similar to the heparan sulfate linked to proteinaceous receptors on the tissue surface. Other glycolipids are found to interact with lower affinity, except long ligands that have suitable ligand binding poses to match the curved binding pocket.


Introduction
The cellular entry of viridae has been shown to frequently involve surface determinants of glycolipids and glycoproteins, whereas some viridae bind exclusively to proteinaceous receptors (1). Since genetic analyses have previously indicated that surface loops of coronaviridae determine tissue tropism in the animal (2), as is also apparent from simpler comparisons in tissue culture (3), the question of blood group glycolipid- or glycoprotein-determinant interaction has to be posed. Parvovirus, as one example of a DNA virus (Erythrovirus), binds to the P-antigen (Globoside (Gb) 4) and can cause a transient aplastic anemia due to the abundance of Gb4 in red blood cells (1). Polyfucosylated N-linked glycopeptides and multiple glycolipids had previously been identified in the human intestine and have, moreover, suggested a high variability of individual O-glycomes, which may indicate individual differences in virus-receptor expression (4,5,6). Although the variety of glycosphingolipids (GSL) and lipids in mammalian organisms, and in humans in particular, is very high, succinct information on individual susceptibility to disease is still scarce (7). Transmission of SARS-CoV-2, a single-stranded RNA virus, in mink farms has recently been studied (8); anthropozoonotic infection of humans has been proposed as a spill-over from minks back to the original host in this infectious cycle. Moreover, it has been proposed that mutations that arose during the propagation of SARS-CoV-2 in minks introduced novel mutants into the human population (9,10). Since the multi-organ tropism of SARS-CoV-2 has been demonstrated, it is possible that prolonged anthropozoonotic amplification of host infections could alter the host and/or organ range and tropisms in a way that may increase disease lethality (11,12).
The association of blood groups with the SARS-CoV-2 disease (COVID-19) has recently been established in meta-analyses and suggests a likely increase in prevalence in blood group A individuals as well as a linked elevated mortality (13,14). A multitude of explanations for a role of determinants of individual blood groups has been put forward, and it has been theorized that an indirect effect of blood group associated expression of clotting factors could contribute to the severity of COVID-19 (15,16). Surface determinants alone, as shown in platelet clotting in vitro, would provide the other line of thought to explain the AB0 blood group-dependent aetiology, just as the above-mentioned direct interaction of the virus with the cell surface of SARS-CoV and SARS-CoV-2 target cells could include a co-receptor next to the ACE2 protein (17,18). In the current work, a drug-docking-like approach is tested to analyse the interaction of carbohydrates from a library of GSL headgroups with the N-terminal domains (NTDs) of the SARS-CoV-2 wildtype virus (MN908947, NC045512) and the British mutant B.1.1.7 (8,19,20). The B.1.1.7 variant has recently been estimated to be associated with a 61 % increased hazard of death (21).

Material and Methods
The computational screen of carbohydrates involved the preparation of glycans from Woods at http://www.ccrc.uga.edu (with multiple conformers) or, if not available as such, preparation from pre-existing fragments of larger structures. The PyRx modelling queue Version 0.8 was used with Intel processors on Windows 7, 8 or 10 operating systems. The MarvinView Dreiding force field utilized in some previous work was not utilized in the present experimental series; however, files were processed by Chimera 1.14 (see (22)) and saved as mol2 files for import to PyRx docking. The Autodock VINA (23) implementation of PyRx from S. Dallakyan (http://pyrx.scripps.edu) was utilized with the grid size as indicated in single experiments. The algorithm installs OpenBabel (24) and the UFF (Universal Force Field) for energy minimization, using conjugate gradients with 200 steps and a cut-off for energy minimization of 0.1. Partial charges were added to receptors using PyBabel (MGL Tools; http://mgltools.scripps.edu). The authors mention the difference between this procedure and the use of OpenBabel for adding partial charges, and care should be taken especially for novel ligands that may not be recognized. No limits to torsions were imposed in the computational run. Single CPU time was up to 16 hours for the longest/branched ligands at exhaustiveness 8. The analysed data were judged for surface binding in PyRx or in Chimera by the ViewDock import function. Sqlite data were analysed using SQLite (Hipp, D. R.) and DB Browser for SQLite from http://sqlitebrowser.org. Autodock/Vina re-docking of ligands without torsional degrees of freedom was carried out to judge the top-scoring screen (exhaustiveness 3 or 6 with blood-group ligands). Re-docking of the top-scoring ligand was also followed up with the rotating side-chain function in Vina, which allowed the top scores to be validated independently and with slightly altered poses.
For this step of the project, AutoDockTools Version 1.5.6 (http://mgltools.scripps.edu) was utilized to generate separate files of flexible and fixed amino-acid residues of the model (25). Further stepwise addition of poses was obtained with the flexdistance and autobox options implemented in the SMINA program (https://sourceforge.net/projects/smina/files). Spreadsheet use and calculations were carried out in Microsoft Office 2013 Professional Plus. Further computational docking focused on the putative binding site was utilized to generate a high resolution of the docking interaction, since the method is described to not only "home in" on the best interacting binding site but also to stall on lowly evaluated interaction pockets if used in the "global" docking procedure. Therein the exhaustiveness was increased to 12. H-bonding was determined with ViewDock with tolerances of 0.4 Å, 20° (26) or 0.8 Å, 30°, similar to calculations previously applied (27). Annotation of carbohydrates was from http://www.lipidmaps.org and from literature sources cited in the Results. Chimera 1.14 was used for further calculations and Coulombic surface charge presentations using default values. Structure files were scored as likely binding site ligands in pdb-care from http://www.glycosciences.de to test for structural intactness if not visually controlled.
Figure 1: SARS-CoV-2 S-protein interaction with heparin. S-protein domains NTD (amino acids 14-291) and RBD (amino acids 334-524) were submitted for molecular ligand docking and the results overlaid on the complete S-protein structure. The side view lacking the membrane-proximal, transmembrane and cytoplasmic domains is presented on the left; the top "crown-view" is shown on the right, with heparin presented in the pose that was obtained from ClusPro docking and the lowest energy indicated. The current number of amino acids in the Swiss-Model queue prediction is indicated (green), and more SARS-CoV-2 high-resolution structures are expected to validate heparin interaction in the future. Monomers are indicated with the chain A, B or C; separate colouring is shown for the RBD and NTD backbone in the "crown-view".
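The docking settings described above (search box, exhaustiveness 8 for the screen and 12 for the focused run) map onto a plain-text configuration file of the kind read by the AutoDock Vina command-line program. The following is a minimal sketch of how such a file could be written, not the procedure used in PyRx. The file names and the box centre are hypothetical placeholders, since the centre is not stated in the text; only the box size (taken from the caption of Table 4) and the exhaustiveness values come from the manuscript.

```python
from dataclasses import dataclass


@dataclass
class DockingBox:
    """Search box for a Vina-style docking run (all values in Angstrom)."""
    center: tuple  # (x, y, z); hypothetical placeholder, not given in the text
    size: tuple    # (x, y, z) edge lengths of the box


def vina_config(receptor, ligand, box, exhaustiveness=8, out="docked.pdbqt"):
    """Return the text of a Vina configuration file for one receptor/ligand pair."""
    cx, cy, cz = box.center
    sx, sy, sz = box.size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Box size as listed for the local screen; centre and file names are assumptions.
    box = DockingBox(center=(0.0, 0.0, 0.0), size=(39.2, 26.5, 28.9))
    print(vina_config("ntd_model.pdbqt", "ligand.pdbqt", box, exhaustiveness=12))
```

Writing the settings to a file per ligand keeps each run reproducible and makes it easy to repeat the same box with a different exhaustiveness.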

Structures were downloaded from RCSB (https://www.rcsb.org) or PDBe (https://www.ebi.ac.uk/pdbe). The Swiss-Model Server on http://www.sib.swiss was applied to predict structures of the SARS-CoV-2 S-protein, including several versions of the modelling: either the automatic queue was utilized or direct selection of templates was applied to obtain the best fit of structure and template (28). BLAST (29) and HHBlits (30) were used for the homology modelling. Templates that matched the primary sequence model query (amino acids 1-291), excluding the 13 residues of the signal sequence, were used for modelling. These were represented by 7a25 A/B/C and 328 other templates for a general approach of ligand binding. The top templates corresponded to these 7a25 chains, chain A of 7cab and three chains of 7cai. Nine amino acids were subjected to loop modelling although the structure of the S-protein was nearly complete (31). Previous models were not utilized, since the 6vxx and 6vsb structures did not complete the NTD and contained some gaps (32,33). The SwissModel7C_26J matched preferentially the C chain of 7a25, with an RMSD of 0.129 Å and a QMean of -2.07. Specific models matching 7a25 A, B or C were generated to compare the ligand binding characteristics of each conformer (SwissModel7A, 7B and 7C, with QMean of -1.72, -1.64 and -2.22, respectively). Evaluation of similarity included 1705 templates. RMSDs and further characteristics found for the NTD and RBD are listed in the graphical description of models. Energy minimization of structures was carried out with a minimum of 100 steps of conjugate gradients, applying the amber ff14SB force field (34) and AM1-BCC charges. Molecular dynamics to generate random conformers in the first step was utilized with an equilibration of 5000 steps and a production phase of a further 5000 steps, and was visually controlled by the movie output. A Nosé thermostat with 298 K was applied (relaxation time 0.2).
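The RMSD values used here to compare models and templates follow the usual root-mean-square deviation over matched atom pairs. As a minimal sketch, assuming that the two coordinate sets are already superimposed and listed in the same atom order (the superposition step performed by the structure-matching tools is not reproduced here), the calculation reads:

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units) between two
    equally long lists of (x, y, z) tuples that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the matched atom pair
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

Because the deviation is averaged over all matched atoms, a few strongly displaced loop residues can dominate the value, which is one reason why the molecular dynamics conformers show larger RMSDs than the energy-minimised models.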
For the mutants generated in Modeller Version 9.12 (35,36) with a single structural template (and for the wildtype protein), the last third of the output was clustered and judged by frequency of occurrence, and the top-scoring clusters with a maximal member number were selected. Automodel was applied in the Modeller suite for this procedure, and the full-length NTD sequence 14-291 or 69Del70Del144Del of 14-291 (20) was used as input to the structural match of the above-described self-generated template (SwissModel7C_26J). The potential energy for the wildtype protein 7C_Mod-wt reached -15544.9 and for the mutant 7C_Mod-B-1-1-7 -14974.6 following the heating in the molecular dynamics, and -16429.9 and -15663.3 after the production procedure, respectively. Automodel (Modeller) and Swiss-Model (WWW) results were judged differently in energy and could not be comparatively analysed. They are indicated with RMSD values: SwissModel7C_26J vs. 7C_Mod-wt 0.190 Å, 7C_Mod-wt vs. 7C_Mod-B-1-1-7 0.341 Å, and molecular dynamics clusters (high population number) 7C_Mod-wt-MD vs. 7C_Mod-B-1-1-7-MD 2.403 Å. The SwissModel7C_26J models themselves differed by 0.084 Å RMSD from the energy-minimised form and by 1.741 Å RMSD from the molecular dynamics simulated form used for some experiments. Following the described model generation, ClusPro was used for further docking of heparin with rotating side-chains and generated the best-scoring ligand-bound poses with the SwissModel7A, 7B and 7C input files (37). Lowest energies are indicated in Figs. 1 and 2. Some genetic and epidemiological data were gleaned from www.datamonkey.org and www.nextstrain.org to confirm the spread of the wildtype and mutant SARS-CoV-2 sub-strains or clades.
Table 1: Original poses of ClusPro high affinity interactions and residues in the proximity (5 Å). S-protein domains NTD (amino acids 14-291) and RBD (amino acids 334-524) were analysed for proximity to residues within 5 Å; chains are denoted with A, B, C and coloured as shown in the molecular overview (Fig. 2).
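A proximity list of the kind summarised in Table 1 (receptor residues with any atom within 5 Å of the docked ligand) can be sketched in a few lines. This is an illustrative sketch and not the selection routine of the viewer that was used; it assumes standard fixed-column PDB ATOM/HETATM records, and the function names are hypothetical.

```python
import math


def parse_atom(line):
    """Parse one ATOM/HETATM record using the fixed PDB column layout."""
    return {
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resnum": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def residues_within(receptor_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted (chain, residue number, residue name) tuples for all
    receptor residues with at least one atom within `cutoff` of any ligand atom."""
    near = set()
    for atom in receptor_atoms:
        if any(math.dist(atom["xyz"], lig["xyz"]) <= cutoff for lig in ligand_atoms):
            near.add((atom["chain"], atom["resnum"], atom["resname"]))
    return sorted(near)
```

The all-against-all distance loop is adequate for a single domain and a single ligand; for a full trimer, a spatial grid or k-d tree would reduce the run time.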

Results
In a first approach, the SARS-CoV-2 S-protein was subjected to molecular docking of a tetrasaccharide heparin using the ClusPro queue (37) to confirm the results on the S-protein RBD (38,39,40)(see SARS and protective role of lactoferrin (41)). The trimer of the S-protein is shown in Fig. 1 to demonstrate the different binding sites within S-protein RBDs and NTDs that can be described by docking each of chain A, B and C conformers of the SwissModel 7a25 (SwissModel7A, B, C) generated by the queue on 11 February 2021 (28,31). ClusPro delivers several high scoring docking solutions some of which largely correspond to the previously described ligand binding simulations (Fig. 2, B RBD and C RBD). The Autodock re-dock energies corresponded to the -14.4 kcal/mol (B RBD) and -13.5 kcal/mol (C RBD) which could not be directly compared to the entropic energy evaluations used in the original ClusPro docking poses. Novel to this docking analysis is the pose of the heparin bound to the A conformer of the SwissModel here found interacting with the "up" conformation of the S-protein, which is slightly displaced towards helix 304-308 of the RBD A, with an increased Autodock affinity of -15.8 kcal/mol. Although elongated heparin molecules or antennae of proteoglycans could span and connect the RBD with the NTD, the data do not provide an indication for the proximity of the tetrasaccharide to both, each RBD and neighbouring NTDs. The described bridging of RBD and ACE2 wherein the hexasaccharide heparan sulfate (GlcA(2S)-GlcNS(6S))3 suggested to interact with the RBD, would connect to ACE2, could not be demonstrated, since other binding sites showed highly increased affinity relative to the proposed interaction. A summary of potentially interacting residues (proximity 5 Å) is shown in Table 1 (Swiss-Model of residues 334-524 of S-protein). 
With vastly increased ClusPro affinity, a further binding site in the NTD of each SARS-CoV-2 S-protein protomer could be demonstrated and is shown with lowest energies in Fig. 2. The lowest energy of -944.4 corresponded to the Autodock re-dock energy of -14.3 kcal/mol for the B NTD, the A NTD had a re-dock affinity of -14.3 kcal/mol and the C chain of -15.2 kcal/mol. As compared by ClusPro energies, the binding to the N-terminal domain would be highly likely, more prevalent or of higher affinity than the interaction previously described, i.e. the binding to the RBD. The conformer of SwissModel NTD C docked to heparin was studied in the later analysis with docked CARB115 library residues to demonstrate the influence of side-chain rotamers (Suppl. Fig. 1) and/or sufficiency of the procedure. Residues within 5 Å distance of docked heparin for the SwissModel NTDs A, B, C (residues 14-291) of the S-protein are shown in Table 1. Evident from analysing the preliminary data with regard to natural heparan sulfate interaction, is the slightly different pose of the B NTD ligand, which is fully covered by the S-protein loop 245-251. This terminal interaction does not correspond to the interaction of the nitrous acid depolymerized isolate of heparin and may constitute the reducing end of heparin produced in an enzymatic digest (see (42)). As a note of caution, it should be stated, that only the interaction of heparin with the RBDs is currently validated by the full structure of the 7a25 trimer, whereas several of the NTD residues indicated in Fig. 1 that were introduced by the protein modelling show heparin interaction (5 of 9 for NTD A, 7 of 9 for NTD B, 6 of 9 for NTD C).  The blood group antigens or elongated glycolipids (with Glc at the reducing end) were tested for interaction in the next step. 
The glycolipids displaying antigenic determinants (Table 2) can be grouped into lacto (type I), neolacto (type II and type III) and globo (type IV) series of glycosphingolipids (GSL). A variety of different linkages generates at least 15 different GSL headgroups that could be recognized by anti-blood group antibodies. For this approach, Autodock Vina was used with the localized binding pocket of the S-protein NTDs scrutinized in Figs. 1-5. The model used for heparin docking was further modified by the Modeller routine (35,43) to mutate the wildtype to the His69Val70Tyr144 deletion mutant B.1.1.7 (the Suppl. Table shows the additional genetic changes of the variant virus). High-energy conformers were produced by molecular dynamics in Chimera (298 K) that could likely mimic one major binding mode of the S-protein NTD to be used for the interaction analyses. Localized docking shows that the elongated blood type determinants have interaction energies (Autodock re-dock) of -15.0 to -21.6 kcal/mol (Fig. 3 A). Overall, a significantly stronger interaction of A versus H (0) blood group determinants could be determined with these procedures for the B.1.1.7 mutant S-protein NTDs, which is shown in the comparison of blood type averages in Fig. 3 B. Although the result could be considered preliminary, one of the blood type II A presenting glycolipids (No. 5) shows clearly increased affinities to the B.1.1.7 binding pocket. Regardless of whether only the minimized energy model (not shown) or the molecular dynamics (cluster) model was subjected to docking, a highly increased interaction was simulated.
Figure 3: (A) [...] are demonstrated by drug docking using a multithreaded procedure that is only partially available for glycan docking: small glycan residues have previously been tested; the procedure is here used for glycans that may exceed the computational capacity/force-field adjustments of Autodock (23) with difficult binding sites. The NTD was subjected to Autodock docking and to re-docking in refinement with the model of the SARS-CoV-2 (wildtype, B.1.1.7 mutant) S-protein NTD generated by Modeller. The molecular dynamics conformer was obtained by a standard run in Chimera with a thermostat of 298 K and clustering of conformers in the equilibrated phase. The graph shows the binding energy of re-docking of each individual glycolipid "blood type", underlaid in green (type I), blue (type II), red (type III) and ochre (type IV). The British S-protein NTD mutant (lineage B.1.1.7, in orange) and the wildtype S-protein NTD (blue) are indicated. Numbering and structural (IUPAC) formulae are shown in the accompanying table. (B) A significant difference is found with the British S-protein NTD mutant (lineage B.1.1.7, in orange) for the interaction of type A and H (0) (p=0.04, Mann-Whitney test). The wildtype S-protein NTD results are shown in blue. "Attached" molecular dynamics with fixed residues did not allow a suitable ligand binding pose to be modelled, and model molecular dynamics of the full-length trimer of SARS-CoV-2 S were not yet available from covid.molssi.org. Error bars are indicated with the confidence interval (CI) presented with α = 0.05. The significant difference between type A and H (0) was also obtained when glycolipid 11 was left out in (B); one of the duplicates (glycolipid 6) incorporated for testing exhaustiveness (A) was deleted from the results for the graph (B), and only top scores were retained.
Table 3: Carbohydrate-interaction screen of the SARS-CoV-2 S-protein NTD. Carbohydrate ligands utilized in Vina are indicated and listed with their common names. Ligands not expressed or metabolically produced in humans, or only found in very rare cell types and as human polymorphisms, are indicated (*). Formulae (IUPAC style) of scoring glycans are provided in Table 4.
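The group comparison reported for Fig. 3 B uses a Mann-Whitney test on small sets of docking energies. For samples of this size the two-sided p-value can be obtained exactly by enumerating all relabellings of the pooled values. The sketch below illustrates that calculation under the assumption of a two-sided exact test; the statistical software and settings used for the figure are not stated in the text, and no energies from the study are reproduced here.

```python
from itertools import combinations


def mann_whitney_exact(sample_a, sample_b):
    """Return (U, p) where U counts pairs with a < b (ties count 0.5) and
    p is the exact two-sided p-value from enumerating all group relabellings."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = list(sample_a) + list(sample_b)

    def u_stat(indices_a):
        in_a = set(indices_a)
        u = 0.0
        for i in in_a:
            for j in range(len(pooled)):
                if j in in_a:
                    continue
                if pooled[i] < pooled[j]:
                    u += 1.0
                elif pooled[i] == pooled[j]:
                    u += 0.5
        return u

    observed = u_stat(range(n_a))
    mean_u = n_a * n_b / 2.0          # expected U under the null hypothesis
    deviation = abs(observed - mean_u)

    extreme = total = 0
    for combo in combinations(range(len(pooled)), n_a):
        total += 1
        # Count relabellings at least as far from the null mean as observed
        if abs(u_stat(combo) - mean_u) >= deviation - 1e-12:
            extreme += 1
    return observed, extreme / total
```

With two groups of four values that are completely separated, only 2 of the 70 possible relabellings are as extreme as the observed one, so the smallest attainable two-sided p-value is 2/70 (about 0.029); this illustrates how strongly the group size limits the significance that such a comparison can reach.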
Previous analyses have suggested that the S-protein NTD may interact with ganglioside GM1, although the structure of the SARS-CoV-2 S-protein available at that time included large gaps in loops and in particular at the N-terminal region (44). In determining the different binding sites of the entire N-terminal domain, which is subject to algorithmic hindrance due to a multitude of possible interaction sites, the half molecule (NTD) exposed to the viral exterior was here used with Autodock Vina (Fig. 4). Both the elongated binding site demonstrated in Figs. 1, 2 and 5 and an N-terminal site could be shown. In Fig. 4 the top score of the carbohydrate screen, Di-Sialosyl Galactosyl Globoside (DSGG) or di-sialosyl-Gb5 (45), which interacted with an affinity of -7.8 kcal/mol, is displayed in violet and residues within 3 Å proximity are indicated. Table 3 lists the carbohydrates used in this screen. The top-scoring GalNAc-GM1b, which was found to interact at the N-terminus with a relatively high affinity of -6.6 kcal/mol, was discarded as a low-affinity ligand. In previous screens with a similar procedure, interactions of identical affinity were considered to be false positives or nearly unreliable (46,47). This was proposed for cognate or non-cognate docking poses but would be exceeded by tetrasaccharides that serially interacted with larger binding pockets. Previously identified residues (44) are shown, yet only partially overlapped with the novel binding site identified here, which apparently includes the N-terminal Gln14 itself (H-bonded). Residues overlapping with the GM1 binding site are signified in grey (Fig. 4). Here, too, three amino acids that were included from the modelling queue are within 5 Å distance, and the result should thus not be considered as final.
In the final analysis of refinement of interactions, SwissModel7C_26J was used to generate docking in local binding mode. This included the area surrounding His69, which has a deepened, curved surface morphology. Table 4 lists the top-scoring glycans of the CARB115 library that could be visualized and that placed ligands at an appropriate distance within the binding pocket. Top-scoring is Di-Sialosyl Galactosyl Globoside (DSGG) or di-sialosyl-Gb5, a globoside, which showed a high affinity of -25.4 kcal/mol (refined). Although the blood group I H (0) antigen scored with -15.5 kcal/mol (refined), the ganglioside GalNAc-GM1b interacted in this place with a refined affinity of -21.3 kcal/mol (-7.6 kcal/mol original score), exceeding the interaction energy defined in the approach above (Fig. 4). Ganglioside GM1b was found to interact with an affinity of -18.2 kcal/mol, several neolacto and lacto series GSLs scored with affinities of -14.2 kcal/mol to -25.6 kcal/mol, and the globo series GSL Gb4 (named P antigen / belonging to another "blood group system"), which is a precursor of the top-scoring DSGG, was defined in Autodock Vina with a re-dock affinity of -14.3 kcal/mol. Overall, when analysed with the hexameric heparin (gathered from 3ina), the increased energy of -29.7 kcal/mol could imply competitive interactions in the binding site of gangliosides, globosides etc. and heparins that may help to deter the virus from cell binding. The docking queue results are presented for the top-scoring DSGG in Fig. 5 with the Coulombic surface presentation of the S-protein NTD. The side-chain locations of charged residues are named and indicated (left) and demonstrate the likely large binding area that is formed in between. The large number of rotational degrees of freedom, in particular of these positively charged residues, makes the docking task computationally very demanding, and binding poses can only be approximated in the panel to the right (Fig. 5).
For this task serial docking was applied, in which rigid receptor-flexible ligand and flexible receptor-rigid ligand docking were alternated to obtain the final pose. It was seen that the ligand was moving within the pocket from left to right (Fig. 5, right panel), with side chains adapting to the new pose of similar energy (underlined). Moreover, the two terminal saccharides were rotating with respect to the five residues at the reducing, ceramide end. If interaction with the globoside were to prevail for a longer time period, it could be envisioned that conformational changes within the backbone of the SARS-CoV-2 NTD would be generated. These could be transmitted to another binding site or to the rest of the molecule. The interaction with ligands in this binding site is expected to tolerate few changes; the His69 is found as tyrosine in His69Tyr sub-strains or as the discussed deletion in the B.1.1.7 mutant (in combination with the Val70 deletion since 2/20) that was studied with blood groups in detail above (Table 2). More work is necessary to elucidate the full panel of carbohydrates and glycolipid headgroups, which vastly exceeds the computational capacities of even cluster computations or supercomputing, since several thousand ligands that harbour very high torsional degrees of freedom would have to be docked to the entire surface. The first glimpse provided here and the data from datamonkey.org as well as the nextstrain.org list of mutants suggest that the loop with the Tyr145 and Trp152 indicated in the binding site-ligand interactions is polymorphic; it includes deletions of Val143 and Val143Phe replacements as well as the insertion of 2-15 amino acids, which makes it highly unlikely that a quick computational solution to the binding task will be found.
Table 4: Carbohydrate screen for the local docking to the identified binding site, list of top scores. The box size x = 39.2, y = 26.5, z = 28.9 was used for the Autodock Vina screen; screen energies are listed (black) and refined local autodocking energies are indicated in green. These correspond to local energies obtained in SMINA. The shared terminal epitope of DSGG (Sialosyl-MSGG or Disialosyl-Gb5) found in GD1 was bound in a grossly similar configuration to the S-protein NTD, with the N-acetylated residue GalNAc within the central binding pocket and with an Autodock Vina affinity of -6.8/-18.4 kcal/mol. In this binding, the reducing end was likely not available and only partial low-affinity binding to GD1 would be expected. Categories of glycolipids are denoted with the series name and IUPAC formulae are indicated.
In the next analysis, the top-scoring ligands of the SwissModel7C_26J (Table 4) were tested for interaction with the surface pocket of the SARS-CoV-1 S (48). The structure nearly corresponded to the energy-minimized conformer with little change (RMSD 0.099 Å), and only Lys142, Glu174, and Asp204 in the putative binding site were subject to minimal side-chain rotation when energy minimized. Although the 5X4S structure contained gaps and some amino acids had not been resolved, the ligands docked to the structurally resolved surface area within the neighbourhood of these four residues. In the Autodock approach, distinctly lower binding affinities of both heparin (3ina) and DSGG are shown (Fig. 6). In comparison, Gb4 (P-Antigen) and GalNAc-GM1b also interacted more strongly with the SwissModel7C_26J than with the SARS-CoV-1 S-protein. Other ligands showed mostly comparable affinities. Finally, the recently published convalescent sera study was used to comparatively analyze the glycan binding site (49) (Suppl. Fig. 2). It appears that the major antigenic site in the NTD (S-protein) that has now been defined would extend from Tyr144 and His146 to Val143 and Leu141. Only the first two residues are exposed; the remaining amino acids that grossly alter antigenicity are located in the interior of the domain, and none of the amino acids in the binding site within direct proximity, whether through rotamers of side-chains or the side-chains themselves, alter the antigenicity.
Figure 5: Surface presentation of the SARS-CoV-2 NTD with half-side view onto putative binding sites of glycans. The surface is coloured by Coulombic electrostatic surface charges, the ligand is coloured by the indicated IUPAC code, and the major side-chain rotations in refinement are: Asn74, Trp152, Lys182, Gln183, Asn185, Arg214 and Arg246 (underlined). Energies gathered in the refined poses were increased from -7.7 kcal/mol to -10.2 kcal/mol and corresponded to the -26.1 kcal/mol and -26.0 kcal/mol obtained in the local or freely-rotating side-chain poses, respectively. Computational resources for the overall approach with no restriction to backbone movements and/or freely rotating side-chains, and with ligands docked without restricted torsional degrees of freedom, were not available. The six charged Lys and one Arg amino acids in the binding site are denoted. The likely location of ceramide is indicated. Annotations of residues, mutants and first occurrence are provided by www.datamonkey.org. Glycans are coloured in IUPAC style: yellow Gal and GalNAc, blue Glc and purple NeuAc.

Discussion
Based on two recent analyses, I would like to suggest that the putative glycan binding site established with this work on Autodock and carbohydrate ligands is not directly involved in "immune escape". This theory holds that surface residues of viral proteins evade immune recognition by mutation and structural change, and that surface patches may also be indirectly affected by altering internal residues. Two most recent studies have mapped the immune epitopes recognized by antibodies in humans. These are consistent with the assumption that monoclonal antibodies and convalescent sera against the SARS-CoV-2 Wuhan isolate bind to a surface area distinctly different from the surface patch surrounding His69 of the SARS-CoV-2 S-protein (49,50), the putative glycan binding pocket. Previous analyses in genetics have supported the role of glycans in the susceptibility of the human population to SARS-CoV-1 and -2 infection and/or severity of disease. Although different models have been suggested that could explain the relative or absolute protection of individuals with blood group H or 0, the interaction of glycans with the S-protein itself had not been demonstrated. In this approach, the SwissModel-generated conformer SwissModel7C_26J, with the highest similarity to SARS-CoV-2 S-protein structure 7a25 C, was automatically generated to maximize the fit to any structural entry available at the end of January 2021 (31). The model differed from the reported structure 7a25 by only 9 amino acids, residues that were introduced by the modelling (amino acids 71-75 and 248-251).
Since it is to be expected that SARS-CoV-2, just as many other viridae that incorporated a lectin domain during evolution, may bind to carbohydrates of distinct structure, the Autodock Vina approach was further tested for the carbohydrate interaction. The approach is criticized by some due to the lack of modelling of pi-interactions, and force field changes have been introduced in the novel modelling methods (51), wherein each carbohydrate-pi interaction may, however, contribute only 0.8-1.0 kcal/mol. In the described binding site (Figs. 4 and 5), glycans in the vicinity could (with the static structure) contribute only little. These can possibly contact the rings of Trp64, Tyr145, Phe186 and Trp258, but the glycans are, in the docking poses, positioned at or largely exceeding the dCX distance exclusion limit of 4.5 Å (52). In contrast, with blood type antigens several poses have been found that would allow some pi interactions, in particular of Gal and Fuc with Tyr145 in the wildtype S-protein, or of Fuc with Trp152 or Phe186 (according to wildtype numbering). Whereas the expected energies in scoring would thus not differ in the screening run with the general CARB115 library, it may be worthwhile and affordable to use high-precision force fields and molecular dynamics to generate a sufficient ranking of blood type antigen interactions. From visual inspection of the binding site environment, it could be inferred from the Coulombic surface colouring (Fig. 5) that non-blood group ligands would be attracted by low-affinity, transient binding events that may include charged groups of heparin, proteoglycans or sialylated molecules. Low-affinity interaction would then be followed by high-affinity induced fit. The blood groups associated with the SARS-CoV-2 infection and severity of disease could not be identified in this study and interpreted in an easy way.
However, when comparing the protein conformers of the predicted wildtype S-protein NTD with the mutant B.1.1.7, which harbours the His69Val70Tyr144 deletion, a consistent observation is the highly increased affinity of a glycolipid of the A type II antigen (No. 5). Apparently, an H (0) type III antigen interacted less with the mutant B.1.1.7 strain. The type III B antigen was included in this study to complete the series of lipidic antigens that may be produced in the human body, but it has so far been described as linked to O-glycans: the enzymatic reaction of the A- or B-transferase (AB0) may link terminal Gal as well as GalNAc residues to the type III precursor. Since the type III A GSL has been found (LipidMaps), elucidating the full sphingolipid glycome is a matter of further research. This particular GSL, however, interacted less with the S-protein NTD of the B.1.1.7 mutant clade, which allows the speculation that a large variety of changes to tropism may set in once a glycan binding site has altered in specificity, even if only single linkages are recognized differently. I would like to suggest that the terminal GalNAc of blood group A would be bound, yet the affinity of interaction does not currently allow the exact binding site geometry to be pinpointed. Only the large screen with the CARB115 library has yielded ligands of highest binding affinity, which suggest that the central His69<->Lys182 binding area is most often filled with Neu5Ac or N-acetylated glycan residues. However, the results of the previous docking study on the S-protein, which demonstrated Neu5Ac bound to the NTD (53), were found to be largely discordant with my present result (Suppl. Fig. 3). The S-protein structure used at that time included larger gaps and depended on simulation for a large fraction of residues, including the N-terminal domain.
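A comparison of this kind, ligand by ligand between the wildtype and the B.1.1.7 conformer, can be sketched as follows. This is a minimal illustration and not the procedure used in this study: the function name, the ligand names and all numbers are placeholders. It assumes only that docking scores are given in kcal/mol and that a more negative score means stronger predicted binding, as in Autodock Vina.

```python
def score_shift(wildtype, mutant):
    """Rank ligands by mutant-minus-wildtype docking score.

    Both arguments map ligand name -> score in kcal/mol. A negative shift
    means stronger predicted binding to the mutant. Ligands that are not
    present in both dictionaries are skipped.
    """
    shared = wildtype.keys() & mutant.keys()
    deltas = {name: mutant[name] - wildtype[name] for name in shared}
    return sorted(deltas.items(), key=lambda item: item[1])


# Placeholder scores for illustration only; NOT values from this study.
wildtype_scores = {"ligand_a": -7.0, "ligand_b": -6.5, "ligand_c": -8.0}
mutant_scores = {"ligand_a": -9.0, "ligand_b": -6.0, "ligand_c": -8.0}

ranked = score_shift(wildtype_scores, mutant_scores)
for name, delta in ranked:
    print(f"{name}: {delta:+.1f} kcal/mol")
```

Sorting by the shift rather than by the absolute score separates ligands that gain affinity in the mutant from those that merely score well in both conformers.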
Yet, since the structures of ABH determinants are found on N- and O-glycans as well as on glycolipids, and the type I, II and III forms are, for example, expressed in gastrointestinal tissues (4,5,6), this study could alert to a change in tissue tropism that may adapt SARS-CoV-2 to conform to the clinical view of other coronaviridae, including SARS-CoV-1 (54). Gastrointestinal symptoms had been reported more often with the earlier SARS-CoV or MERS-CoV. The ligand with the current top scoring affinity of -26.1 kcal/mol (Fig. 5), DSGG, fully fills the binding pocket and would likely contact residues in locations similar to the Asp72Asn and Ala219Ser mutations that have been defined previously in the Transmissible GastroEnteritis corona Virus (TGEV) of piglets (2). These mutations have been found to alter tissue tropism from the respiratory and gastrointestinal system towards the respiratory tract. In that work, growth of TGEV was measured in different tissues, and the tropism was defined by correlation with the mutations. Binding of the viral S-protein to the cell surface aminopeptidase N, the proteinaceous viral receptor, may be enhanced by bivalent interaction of the S-protein with the protein receptor and with glycans on the host. Expression of MSGG (Mono Sialosyl Galactosyl Globoside), a desialylated form of DSGG, and of DSGG is found in human erythrocytes and in the kidney within the distal tubule and Henle's loop (45). GSL expression can vary between tissues, and MSGG has, for example, been characterized in embryonic stem cells, dorsal root ganglia and tumour tissues. Parvovirus B19 (55), in contrast to SARS-CoV-2, causes anemia due to erythrocyte infection. This is likely due to binding of Gb4 (P antigen), Gb5 and MSGG, among others.
Although a similar binding profile, with a different binding mode, could be ascribed to SARS-CoV-2, aplastic anemia has only been observed in a single case (56,57), and co-receptors, in particular the ACE2 receptor, are clearly the major determinant of the observed respiratory tract interaction and viral uptake. Complexity increases when part or all of the initial SARS-CoV-2 interactions are relegated to the glycan shield and to glycan-glycan interactions of coronaviridae, which are essentially unexplored in simulations as well as in biochemical studies (58,59,60). Finally, when considering zoonosis and anthropozoonotic cycles of infection, it remains to be shown whether influenza viridae teach the lesson that, although lectin domains are displayed on the viral surface, glycan interactions are sometimes non-essential (61,62,63). The differences in lectin activity, if any, between the SARS-CoV-1 and SARS-CoV-2 S-proteins (Fig. 6) remain to be analysed structurally and at high resolution in the future.