The Ecology of Phage Resistance: The Key to Successful Phage Therapy?

As antibiotic resistance undermines efforts to treat bacterial infections, phage therapy is increasingly being considered as an alternative in clinical settings and agriculture. However, a major concern in using phages is that pathogens will develop resistance to the phage. Due to constant evolutionary pressure from phages, bacteria have evolved numerous mechanisms to block infection. If we determine the most common among them, we could use this knowledge to guide phage therapeutics. Here we compiled data from 88 peer-reviewed studies where phage resistance was experimentally observed and linked to a bacterial gene, then assessed these data for patterns. In total, 141 host genes were identified that block infection by one or more of 80 phages (representing five families of the Caudovirales) across 16 microbial host genera. These data suggest that bacterial phage resistance is diverse, but even well-studied systems are understudied, and there are gaping holes in our knowledge of phage resistance across lesser-studied regions of microbial and viral sequence space. Fortunately, scalable approaches are newly available that, if broadly adopted, can provide data to power ecosystem-aware models that will guide harvesting natural variation towards designing effective, broadly applicable phage therapy cocktails as an alternative to antibiotics.


Introduction
The rapid development of bacterial antibiotic resistance threatens the efficacy of antibiotic treatments, and regulatory obstacles and costs have slowed the development of new antibiotics [1]. Phages, viruses that infect bacteria, stand out as an alternative approach to combat bacterial antibiotic resistance in clinical and agricultural settings [2,3]. Phages have been utilized therapeutically to treat and prevent bacterial infections in humans, plants, and livestock [4]. In the United States, Adaptive Phage Therapeutics (APT) has a growing stock of phages against multidrug-resistant bacterial pathogens, and reports 7 successful cases using its precision-matched phage technology after the normal standard of care failed [5]. Concurrently, the FDA has approved the use of SalmoFresh, which consists of lytic phages that target Salmonella; it has also been approved to treat fish, shellfish, poultry, fresh fruits, and vegetables in Canada [6]. Though phages are being increasingly considered as an alternative, a major concern is the rise of bacterial resistance to phage. Here, we review what is known about mechanisms found to block phage infection. Our goal is to develop an understanding of possible emergent patterns that could be predictive and informative for combatting resistance in applications of phage therapy. Specifically, we compiled data from 88 peer-reviewed studies where phage resistance was both observed and experimentally linked to a bacterial gene. We then analyzed the data to determine if there are links between infection-blocking mechanisms and phage genera. With enough knowledge, we could predict the patterns of resistance and use this knowledge to develop models to improve the therapeutic use of phages.

Phage resistance genes and mechanisms identified in previous studies
In this study, we reviewed the literature to identify studies where specific genes had been shown experimentally to cause phage resistance. This led initially to the identification of 135 studies where bacterial colonies were described as resistant to phage and the host gene was possibly identified. We then evaluated these descriptions to establish a systematic definition of "resistance": any finding in which efficiency of plating, spot tests, or optical density was reduced at least 10-fold after mutation of the target gene was taken as strong experimental evidence linking that gene to blocked phage infection. Because the data presented in many studies ranged from qualitative to only relatively quantitative, we could not refine these responses below the robust 10-fold phenotypic threshold. Further, we required phage genomes to be sequenced and host ranges to be documented. Of these 135 studies, 58 failed to meet our criteria for further analysis, but 11 of these studies did have genes that were used in the analysis for other phages (Table S1 in Supplemental data 1) [14-70].
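The inclusion criteria described above can be summarized in a minimal sketch. The record fields, the function name, and the example values are hypothetical and chosen only for illustration; they are not taken from the supplemental tables, and the actual screening was done by reading each study.

```python
from dataclasses import dataclass


@dataclass
class StudyRecord:
    """Hypothetical summary of one phage-gene observation from a study."""
    fold_reduction: float          # reduction in EOP, spot test, or OD after gene mutation
    phage_genome_sequenced: bool   # criterion: phage genome must be sequenced
    host_range_documented: bool    # criterion: host range must be documented


def meets_criteria(record: StudyRecord, min_fold: float = 10.0) -> bool:
    """Return True when a record satisfies all three criteria stated in the text."""
    return (
        record.fold_reduction >= min_fold
        and record.phage_genome_sequenced
        and record.host_range_documented
    )


if __name__ == "__main__":
    # Illustrative values only (assumptions, not data from the reviewed studies).
    examples = [
        StudyRecord(fold_reduction=1000.0, phage_genome_sequenced=True, host_range_documented=True),
        StudyRecord(fold_reduction=3.0, phage_genome_sequenced=True, host_range_documented=True),
        StudyRecord(fold_reduction=50.0, phage_genome_sequenced=False, host_range_documented=True),
    ]
    for rec in examples:
        print(rec, "->", "include" if meets_criteria(rec) else "exclude")
```

Treating the 10-fold threshold as a single conservative cut-off reflects the point made above: much of the published data is too coarse to support finer gradations of resistance.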
Across the remaining 88 studies that met these criteria, 16 bacterial genera were represented (only genus-level taxonomy was evaluated since many experimentally evaluated strains lacked 'host' bacterial genomes) [71], along with 80 phages with complete genomes. Though the phage taxonomy was largely unknown, the complete genomes could be used in gene-sharing networks [72,73] to establish phage taxonomy. This revealed that, of the 80 phages, most (n=47) were from the family Siphoviridae, with the remainder from the families Myoviridae (n=17), Podoviridae (n=12), Herelleviridae (n=3), or Microviridae (n=1) (Table S2 in Supplemental data 1). Thus, the Siphoviridae and Myoviridae might be relatively well represented, but the others are certainly not.
To understand the types of resistance systems found at each step of the lytic cycle (Figure 1), we give an overview of some of the infection-blocking mechanisms that have been experimentally verified. The initial step of a successful infection, phage adsorption, begins with the phage binding to a specific bacterial surface receptor. Bacteria can block this step by altering, masking, or physically blocking the receptor [13,87]. For example, phage 52 that infects Staphylococcus aureus normally binds to OmpA, but an outer membrane lipoprotein (TraT) can overlap OmpA and block the phage receptor [88]. In another example, T5, known to infect Escherichia coli, is outcompeted for its receptor FhuA by the antimicrobial peptide MccJ25, which binds FhuA competitively [89]. Alternatively, bacteria can vary the expression of surface proteins, a process termed phase variation, and thereby reduce the expression of phage receptors [90-92]. Examples include Haemophilus influenzae altering the structure of a lipooligosaccharide in response to phage HP1c1 and Bordetella bronchiseptica differentially expressing the outer membrane protein Prn to prevent adsorption of phage BPP-1 [91]. Once the phage has adsorbed to the bacterium, the next step in the lytic cycle is injection of phage DNA into the cell. At this point, bacteria can stop the phage by blocking entry of the phage DNA or degrading it. Superinfection exclusion (Sie) and superimmunity (Sim) systems are proteins found to block DNA entry and have been found in Lactococcus, Streptomyces, Escherichia, and Salmonella [93-99]. Once the DNA has entered the cell, it can be degraded by restriction-modification systems [74,86,100-103]. DISARM (defense island system associated with restriction-modification) is a recently discovered multi-gene restriction-modification system with broad anti-phage activity that can degrade phage DNA after entry [100].
Clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems are also used by many bacteria as an immune system that can target phage genomes and plasmids [104-112]. Examples include Streptococcus thermophilus defending against phage 858 and Pseudomonas aeruginosa defending against phage DMS3 [104,108,111].
If phage genomic material has successfully entered the cell and avoided degradation, the phage must take over the host to transcribe and translate its genome and build phage progeny. Bacterial hosts have several methods to block genome replication, as well as transcription and translation of intracellular phage genes. The bacteriophage exclusion (BREX) system blocks phage DNA replication without degrading the DNA [68]. Disruption of bacterial host proteins that play a role in replication of phage DNA can also inhibit phage infection. In E. coli, dnaJ is required for phage DNA replication, so when the gene is disrupted phage infection is inhibited [76]. Similarly, host proteins involved in regulating phage transcription and translation have been found to inhibit infection when disrupted [74,76,113]. Abortive infection (Abi) systems are a collective term for systems that cause premature cell death upon phage infection, in turn aborting the phage replication cycle [87,114]. A series of Abi systems, from AbiA to AbiZ, have been reported in Lactococcus to inhibit phage infection; these Abi systems are diverse and can be triggered at any point after phage entry [55,63,66,68,115-134]. Finally, once the phage has taken over the host and progeny have been produced, the phage must release the resultant progeny through host cell lysis. Virion assembly, too, can be prevented by Abi systems, which block either DNA packaging or capsid assembly via cell death. All genes related to virion assembly blockage were identified in Lactococcus lactis against lactococcal phages [119,123,124,132,135-138].
Similarly, our analysis found only abortive infection systems involved in host cell lysis [56,103,139-143]. Examples include the AbiZ system found in L. lactis, which causes premature cell lysis upon phage ul36 infection [56]; the hok/sok system in E. coli, which leads to post-segregation killing after the cells are infected by phage T4 [139]; and the rexA/B genes found in E. coli, which reduce cell metabolism after infection by lambda phage [142,144].

Analyses of patterns of phage resistance
To begin to assess patterns across this meta-analysis, we first asked which of the above 5 steps in the phage lytic cycle were best studied, as inferred from the abundance of known infection-blocking genes at each step (Figure 1). This revealed that adsorption had the most genes, with 53 genes across 13 bacterial genera (Table S3 in Supplemental data 1). We found that 48 of the 53 genes preventing adsorption were identified from mutations or knockouts (Table S3 in Supplemental data 1) [54,64,65,67,75,76,86,145-156]. These are mainly receptors or regulators of receptors that, when mutated or disrupted, inhibit the first step of phage infection. DNA entry had 20 genes across 9 bacterial genera and included 5 bacterial resistance systems. DNA replication, transcription and translation covered 21 genes and 6 bacterial genera. Host cell lysis and phage assembly had the fewest genes, with 7 and 4 genes that covered 3 and 1 bacterial genera, respectively. Anomalously, abortive infection systems made up a large proportion of the dataset for DNA replication, transcription, and translation (16/21), as well as all of the phage assembly (4 genes) and host cell lysis (7 genes) genes considered in this meta-analysis. We interpret this to be due to focused studies of phage infections in Lactococcus where such mechanisms were targeted [80,87,157,158]. This is a first strong indicator of the need for diverse research efforts in this area.
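As a minimal sketch of the tally described above, the per-step gene counts quoted in this paragraph can be ranked and the share of abortive infection (Abi) genes computed where the text reports it. The dictionary and function names are illustrative assumptions; the numbers are copied from the paragraph, and steps for which no Abi count is stated are left unreported rather than assumed to be zero.

```python
# Gene counts per lytic-cycle step, copied from the paragraph above.
GENES_PER_STEP = {
    "adsorption": 53,
    "DNA entry": 20,
    "DNA replication, transcription and translation": 21,
    "host cell lysis": 7,
    "phage assembly": 4,
}

# Abi gene counts, given in the text for three of the five steps only.
ABI_GENES_PER_STEP = {
    "DNA replication, transcription and translation": 16,
    "host cell lysis": 7,
    "phage assembly": 4,
}


def rank_steps(counts):
    """Return step names ordered from most to fewest infection-blocking genes."""
    return sorted(counts, key=counts.get, reverse=True)


def abi_fraction(step):
    """Fraction of a step's genes that are Abi systems, or None if not reported."""
    if step not in ABI_GENES_PER_STEP:
        return None
    return ABI_GENES_PER_STEP[step] / GENES_PER_STEP[step]


if __name__ == "__main__":
    for step in rank_steps(GENES_PER_STEP):
        frac = abi_fraction(step)
        share = "not reported" if frac is None else f"{frac:.0%}"
        print(f"{step}: {GENES_PER_STEP[step]} genes, Abi share {share}")
```

Laying the counts out this way makes the skew explicit: the later steps of the lytic cycle are represented almost entirely by Abi systems, which is consistent with the Lactococcus-focused sampling noted above.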
Vast swaths of viral and microbial dark matter have been explored to reveal 5389 genera of bacteria [71], and phage diversity is also high, with 1133 genera from cultured reference genomes [73] and global ocean surveys identifying 867 viral clusters (defined as approximately genus-level groups) in GOV 1.0, a dataset expanded ~12-fold in GOV 2.0 [159,160]. With the caveat that the data currently available to our meta-analysis are not robust enough to cover this vast natural diversity now known for bacteria or viruses, we still sought to see if any patterns emerged, as follows. To evaluate patterns of phage resistance, we first examined the numbers of identified infection-blocking genes with known functions found in each host genus. We wanted to examine the abundance of genes across genera to determine whether some bacteria have more genes involved in phage infection and resistance than others. Escherichia had the most, with 67 genes known to inhibit infection, followed by Lactococcus with 26, significantly more than the other 14 genera (Figure 2a). We again interpret this to reflect researcher sampling bias, i.e., resistance systems have long been intensively studied in these model systems [161,162].
Escherichia has several model phages that have been studied extensively, such as phages T7, T4, and lambda [163-165]. We can see this in Figure 2b, where phage T7 has 19 genes shown to inhibit phage infection, followed by lambda with 18 genes, T4 with 14 genes, and phage 186 with 11 genes. Similarly, the Lactococcus phages 712, c2, sk1, BIL170, and jj50 have a high abundance of genes due to the discovery of plasmids with Abi systems from research into defense systems against lactococcal phages [166]. Thus, not surprisingly, currently available data are biased towards extensively studied model phages. To statistically evaluate whether there might be patterns in phage resistance across phage taxa, we established a genome-based phylogeny and layered on available metadata that was assessed for structure (non-randomness) in the data (details in Figure 3). To compare the 80 phages in our analysis and determine if there was a significant association between phage genera and the steps of infection inhibited, we used the PERMANOVA procedure in the GUniFrac package (statistical analysis citation). In this package, UniFrac distances are calculated and used to test whether the groupings of resistance steps by genera are due to chance [167]. GUniFrac adds an additional parameter α that controls the weight on abundant lineages, so that phages with numerous genes inhibiting one step of the infection cycle did not skew the analysis [168]. To test the hypothesis that the groupings of resistance steps were non-random, we ran PERMANOVA using an α of 0.5 without rarefaction when comparing bacterial and phage genera [168,169].
Figure 3. Phylogenetic tree of the 80 phages used in the analysis. The tree was created by comparing nucleotide sequences using the Genome-BLAST Distance Phylogeny method with standard prokaryotic virus settings [170] in VICTOR [171], and the data were analyzed using iTOL [172]. All bootstrap values below 70% were removed. Phage genera are grouped by black lines determined by vConTACT 2.0 analysis [73]. To determine whether any of these groupings of infection-blocking systems are non-random, GUniFrac and PERMANOVA were used to compare infection-blocking systems to phage genera [168,169]. Statistically significant results (<0.01) are denoted by asterisks. *, P<0.05; **, P<0.001.
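The logic of the permutation test can be illustrated with a minimal sketch. This is not the GUniFrac implementation used in the analysis: it is a generic one-way PERMANOVA (pseudo-F statistic with label permutation) that accepts any precomputed square distance matrix, and the toy distances, group labels, and function names below are assumptions made purely for illustration.

```python
import numpy as np


def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F statistic from a square distance matrix."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    sq = dist ** 2
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        idx = np.where(labels == g)[0]
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    # Count permutations whose pseudo-F is at least as large as the observed one.
    hits = sum(pseudo_f(dist, rng.permutation(labels)) >= f_obs for _ in range(n_perm))
    return f_obs, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy example: ten hypothetical phages placed on a line, forming two tight groups.
    positions = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4])
    toy_dist = np.abs(positions[:, None] - positions[None, :])
    toy_labels = ["genus_A"] * 5 + ["genus_B"] * 5
    f_stat, p_value = permanova(toy_dist, toy_labels)
    print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

In the analysis described here, the distance matrix would instead hold generalized UniFrac distances (α = 0.5) and the labels would be phage or bacterial genera. A large p-value, as reported for most groupings, means that the observed grouping is not distinguishable from a random assignment of labels.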
PERMANOVA showed no statistically significant grouping of any step of phage resistance by bacterial genus (Table S1 in Supplemental data 2). We also ran PERMANOVA using the groupings from vConTACT 2.0 [73]. A total of 51 'viral subclusters' from vConTACT 2.0 emerged from the 80 phages. When steps of phage resistance were compared using PERMANOVA, only one phage group was found to be statistically significant. This group contained only phage T4. We examined the types of genes found in each cluster and which phage infection steps were blocked for each cluster. These analyses showed that many of the clusters consisted of only one phage, reflecting the diversity of phages studied. Only 15 clusters contained more than one phage (Figure 3). Clusters 224_0 and 145/147/279 showed the greatest diversity in resistance types, with both groups blocking 4 of the 5 steps in the lytic phage cycle. Cluster 224_0 consisted of only one phage, lambda, and had genes that were found to be used/required for 3 of the steps in phage infection: adsorption, DNA entry and stability, DNA replication, phage assembly, and host cell lysis. This is likely due to lambda being a model phage that in turn has been extensively studied, with some studies using high-throughput experiments to find host genes involved in phage infection [67,76]. Cluster 145/147/279 was the group with the second most resistance types. This group consisted of 2 phages, ul36 and Tuc2009, with genes that could inhibit phage infection at DNA entry, DNA replication, phage assembly, and host cell lysis.

The need for high throughput, systematic phage resistance studies
Our meta-analysis of the readily available phage resistance literature (that met our criteria) demonstrates that very little generalizable knowledge exists for phage resistance, and yet this is critical for viral ecology and application (e.g., phage therapy). To better elucidate potential phage infection-blocking patterns and instruct phage therapy development, future research on infection-blocking mechanisms should focus on high-throughput techniques that allow researchers to screen for resistance-conferring host genes involved in phage infection. With sufficient data, comparisons of infection-blocking mechanisms in the context of either host or phage classification, as conducted in this study, would provide answers to multiple profound questions, such as: What mechanisms do bacteria tend to adopt when defending against certain groups of phages? Which mechanisms are most widespread among different phage-host pairs? What are the potential interactions between two or more infection-blocking mechanisms when defending against a certain phage? If systematic, high-throughput research is conducted across known phage-host diversity, we would be better able to predict patterns among bacterial species in defending against certain phage genera.
Fortunately, such scalable genetic approaches are emerging. Traditionally, finding resistance-conferring genes was largely manual, requiring bacterial colony isolation, sequencing, single-gene knockouts, or genome-wide transposon screens [38,54,74,86,149]. Although these techniques have allowed researchers to find new genes required for phage infection, they are expensive and labor intensive, and studies are largely limited to few (n<3) phages. Recently, however, high-throughput techniques have been developed that may make screening for infection-blocking host genes easier. Specifically, we highlight RB-Tnseq, which uses DNA-barcoded transposons to generate libraries of single-gene insertion mutants that are each coupled to a unique barcode so as to be scalably assayed by next-generation sequencing [173]. This approach will work well for studying non-essential genes. To evaluate essential genes, researchers are currently limited to more targeted approaches, with the most sophisticated likely being CRISPRi, which uses guide RNA to decrease the expression of genes after induction [67,76]. To evaluate the effect of gene overexpression, Dub-seq combines the DNA-barcode sequencing of RB-Tnseq with gene overexpression [67,174]. All three approaches (RB-Tnseq, CRISPRi, and Dub-seq) were recently applied to study 14 E. coli phages [67]. Though these include some of the most well-studied phages in science, the new, scalable approaches helped 'see' both known mechanisms that block phage infection and completely new ones. RB-Tnseq identified 52 unique infection-blocking genes; CRISPRi identified 542 infection-blocking genes, 44 promoter regions, and 44 transcription factor binding sites that, when disrupted or downregulated, decreased susceptibility to phage; and Dub-seq identified 129 multicopy suppressors of phage infection [67,76].
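The principle behind a barcoded transposon screen can be sketched as follows: barcodes of mutants that survive phage challenge become enriched in sequencing counts, so genes whose disruption blocks infection stand out by a positive log2 ratio. This is a simplified illustration under stated assumptions, not the published RB-Tnseq analysis pipeline [173], which uses more elaborate normalization and statistics. The function name, barcode names, gene "geneX", and all counts are hypothetical; fhuA is used only because it is the T5 receptor mentioned earlier.

```python
import math
from collections import defaultdict


def gene_fitness(barcode_to_gene, counts_before, counts_after, pseudocount=1.0):
    """Per-gene log2 ratio of relative barcode abundance after vs. before phage challenge.

    Positive values mean mutants in that gene were enriched after phage exposure,
    i.e. disrupting the gene is consistent with reduced susceptibility to the phage.
    """
    before = defaultdict(float)
    after = defaultdict(float)
    for barcode, gene in barcode_to_gene.items():
        before[gene] += counts_before.get(barcode, 0)
        after[gene] += counts_after.get(barcode, 0)
    total_before = sum(before.values())
    total_after = sum(after.values())
    fitness = {}
    for gene in before:
        # The pseudocount avoids division by zero and log of zero for sparse counts.
        frac_before = (before[gene] + pseudocount) / (total_before + pseudocount)
        frac_after = (after[gene] + pseudocount) / (total_after + pseudocount)
        fitness[gene] = math.log2(frac_after / frac_before)
    return fitness


if __name__ == "__main__":
    # Hypothetical barcode map and read counts, for illustration only.
    barcode_map = {"bc1": "fhuA", "bc2": "fhuA", "bc3": "geneX", "bc4": "geneX"}
    before_phage = {"bc1": 100, "bc2": 100, "bc3": 100, "bc4": 100}
    after_phage = {"bc1": 900, "bc2": 950, "bc3": 5, "bc4": 10}
    for gene, score in gene_fitness(barcode_map, before_phage, after_phage).items():
        print(f"{gene}: log2 enrichment = {score:+.2f}")
```

Because every mutant carries its own barcode, a single sequencing run scores all non-essential genes at once, which is what makes the approach scale to many phages.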

Conclusions
Phage biology has provided much of the foundation for molecular biology and genetics, but as a result it has long been known to be a field that is 'a mile deep and an inch wide'. While we are now well into the third age of phage [175] thanks to a phage (meta)genomic revolution [176,177] that is changing how we view viral sequence space in the broader natural world, this meta-analysis reveals that the breadth of model systems to which we can ascribe mechanistic knowledge is woefully thin. While we cannot hope to study in detail all 10^31 viruses thought to inhabit the Earth [178], the data available from even the well-studied model systems are insufficient to draw the kinds of ecology-scale mechanistic conclusions needed to power models to better design phage therapy cocktails. Though we are far from understanding phage resistance across the virosphere, such scalable methods coupled to increasingly culture-independent virus-host linkage methods [179] should help us rapidly come up to speed in understanding the environment. Complementarily, similarly focused application to pathogen-related phages will provide the knowledge base likely to be critical for phage therapy cocktail design, and will drive success in the more than century-old idea [180] of using phages to combat nuisance bacteria.

Supplementary Materials:
The following are available online at www.mdpi.com/xxx/s1, Supplementary data 1: Phages and genes discussed in the manuscript. Table S1: Phages excluded from the analysis and reasons for exclusion, Table S2: Phages used in the analysis and the phage family, Table S3: Genes found to block each phage used in the analysis by type of resistance mechanism; Supplementary data 2: Statistical analysis for Figure 3.