In Silico Epitope-based Peptide Vaccine Design against Invasive Non Typhoidal Salmonella (iNTS) Through Immunoinformatic Approaches

Salmonella, especially invasive non-typhoidal Salmonella (iNTS), is responsible for various invasive diseases and is associated with a higher mortality rate owing to its higher antibiotic resistance profile compared with other bacteria. The present study therefore aimed to develop an epitope-based peptide vaccine against iNTS species as an alternative protective measure. The study applied comprehensive immunoinformatic approaches, followed by molecular docking and molecular dynamics simulation, to predict efficient vaccine candidate T cell and B cell epitopes based on the outer membrane proteins. The study identified the two best epitopes, YGIFAITAL and KVLYGIFAI, from the total iNTS outer membrane proteins; both showed high immunogenicity, non-allergenicity and non-toxicity, as well as high conservancy and population coverage values. Both epitopes showed high binding affinity and stability towards HLA-C*03:03. The MM-PBSA binding free energy showed that the YGIFAITAL epitope binds more tightly to both MHC-I and MHC-II molecules. The total contact, H-bond analysis and RMSF results also support the efficiency of these epitopes as vaccine candidates. The predicted B cell epitopes AAPVQVGEAAGS, TGGGDGSNT and TGGGDGSNTGTTTT showed high antigenicity. Overall, the study concluded that these epitopes can be considered potential vaccine candidates for a successful vaccine against iNTS species. However, these results need to be validated by wet lab research before a successful vaccine can be made with the predicted epitopes.

Salmonella causes several foodborne illnesses in humans. Children under 5 years of age are most commonly affected by Salmonella (Aoki et al., 2017;Cellucci, Seabrook, Chagla, Bannister, & Salvadori, 2010;Scallan et al., 2011). NTS are a leading cause of non-typhoid diarrhea, gastroenteritis, bacteremia and enteric fever. They are estimated to cause approximately 93 million cases of enteric infection globally each year, of which 155,000 patients die due to diarrhea (Majowicz et al., 2010). NTS can be transmitted through animal-based food items such as dairy products, eggs and poultry, and also through person-to-person contact or contact with household pets (Braden, 2006;Haeusler & Curtis, 2013;Katz, Ben-Chetrit, Sherer, Cohen, & Muhsen, 2019). Patients with sickle cell disease or HIV and those who reside in rural areas are most susceptible to devastating infection (Uche, MacLennan, & Saul, 2017).
No commercial vaccine against NTS infection is currently available on the market, so the development of an effective vaccine is necessary before severe pandemic outbreaks occur. Owing to the overuse and misuse of antibiotics, bacterial species are becoming resistant to antibiotics. Vaccines can limit the spread of antimicrobial resistance (AMR) (Lipsitch & Siber, 2016). It is projected that about 10 million people will die each year owing to AMR after 2050, compared with 700,000 deaths per year at present (Tagliabue & Rappuoli, 2018). In addition, the cost of antimicrobial drugs is rising. Antibiotics are prescribed after infection has occurred, whereas a vaccine can be given before infection, so a vaccine can provide protection earlier than antibiotics. Peptide vaccine production is a cost-effective way to produce a successful vaccine within a short time compared with other methods.
In our study, we retrieved the outer membrane protein sequences of common NTS species (S. Typhimurium, S. Enteritidis, S. Choleraesuis, S. Salamae) to isolate the most antigenic protein and design an epitope-based (T cell and B cell epitope) peptide vaccine using in silico approaches. Outer membrane proteins (OMPs) are important for host-pathogen interaction during infection. OMPs are responsible for maintaining molecular integrity and permeability across the cell membrane in gram-negative bacteria, and 2-3% of their genes encode OMPs (Buchanan, 1999;Koebnik, Locher, & Van Gelder, 2000). This study provides information about possible vaccine candidate epitopes, and wet lab researchers can use these predictions as a starting point for experimental validation.

Protein sequence retrieval
The outer membrane protein sequences of NTS (S. Typhimurium, S. Enteritidis, S. Choleraesuis, S. Salamae) were retrieved from the Universal Protein Resource database (UniProt) at http://www.uniprot.org (Consortium, 2014), and the FASTA-formatted sequences were saved for further analysis.

Highest Antigenic Protein Identification
Selecting the most antigenic protein is essential for designing an effective peptide vaccine. The most antigenic protein was identified by submitting the FASTA-formatted outer membrane protein sequences to the online server VaxiJen v2.0 (Doytchinova & Flower, 2007), and this protein was taken forward for further investigation.

T-Cell Epitope Identification
T cell epitopes are the antigenic parts of proteins that are recognized by T cells. The online T cell epitope prediction tool NetCTL 1.2 was used to identify T cell epitopes (Larsen et al., 2007); it can generate CTL epitopes for any submitted protein sequence. In this study, all 12 MHC supertypes were considered, with default settings and a threshold of 0.75 for epitope identification. The combined score provided by NetCTL was used to select the best epitopes.
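The candidate peptides scored in this step are 9-residue windows of the protein sequence. The following is a minimal sketch of that enumeration only, not of NetCTL's scoring. The function name `nonamers` and the 14-residue fragment are illustrative assumptions: the fragment is assembled by overlapping three nonamers reported in this study (MKKVLYGIF, KVLYGIFAI, YGIFAITAL), and their being contiguous in the protein is assumed rather than stated.

```python
def nonamers(sequence: str, length: int = 9) -> list[tuple[int, str]]:
    """Return (1-based start position, peptide) for every window of `length` residues."""
    sequence = sequence.strip().upper()
    return [
        (start + 1, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]


# Hypothetical fragment built from overlapping epitopes; NOT the full antigen sequence.
FRAGMENT = "MKKVLYGIFAITAL"

for position, peptide in nonamers(FRAGMENT):
    print(position, peptide)
```

Each window would then be submitted to the predictor; sequences shorter than nine residues simply yield no candidates.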
MHC-I binding alleles were identified with the IEDB (Immune Epitope Database) tool, which uses the Stabilized Matrix Method (SMM) to predict alleles (Bjoern Peters & Sette, 2005). The epitope length was set to 9 and the alleles were selected based on the IC50 value (Fleri et al., 2017). Binding affinity depends on the IC50 value: IC50 < 50 nM, IC50 < 500 nM and IC50 < 5000 nM indicate high, intermediate and low binding affinity, respectively (Adhikari, Tayebi, & Rahman, 2018). The IEDB tool was also used to predict MHC-I processing of the epitopes with their interacting alleles through the SMM method, taking 1 as the maximum precursor extension and 0.2 as the alpha factor (Björn Peters, Bulik, Tampe, Van Endert, & Holzhütter, 2003;Tenzer et al., 2005). In this study, MHC-II binding alleles of the isolated epitopes were also predicted by the SMM-align method.
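The IC50 cut-offs quoted above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_ic50` is hypothetical, and the "non-binder" label for values of 5000 nM and above is our own addition, since the text only defines the three binding classes.

```python
def classify_ic50(ic50_nm: float) -> str:
    """Map a predicted IC50 (in nM) to the binding-affinity class used in the text."""
    if ic50_nm < 0:
        raise ValueError("IC50 cannot be negative")
    if ic50_nm < 50:
        return "high"
    if ic50_nm < 500:
        return "intermediate"
    if ic50_nm < 5000:
        return "low"
    # Assumption: the text does not name this class.
    return "non-binder"


for value in (12.5, 180.0, 2400.0, 9000.0):
    print(value, classify_ic50(value))
```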

Epitope Conservancy and immunogenicity identification
Conservancy and immunogenicity analyses are very important for designing an effective vaccine.
Conservancy indicates the extent to which an epitope occurs within a set of protein sequences. In our study, we used IEDB (Kim et al., 2012) integrated tools for conservancy and immunogenicity analysis of the selected epitopes (Bui, Sidney, Li, Fusseder, & Sette, 2007;Calis et al., 2013). Immunogenicity is the capability of a substance to induce an immune response. An epitope with a higher immunogenicity value is preferable to one with a lower value, so the most immunogenic peptides were screened for further evaluation.
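As an illustration of what a conservancy value expresses, the sketch below counts the percentage of sequences that contain an epitope as an exact match (100% identity). It is a simplification and not the IEDB implementation, which also supports lower identity thresholds. The function name `conservancy` and the serovar fragments are hypothetical placeholders, not UniProt sequences.

```python
def conservancy(epitope: str, sequences: list[str]) -> float:
    """Percentage of sequences containing the epitope as an exact substring."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(1 for sequence in sequences if epitope in sequence)
    return 100.0 * hits / len(sequences)


# Hypothetical fragments standing in for the four serovar sequences.
serovar_fragments = {
    "S. Typhimurium": "MKKVLYGIFAITAL",
    "S. Enteritidis": "MKKVLYGIFAITAL",
    "S. Choleraesuis": "MKKVLYGIFAITAL",
    "S. Salamae": "MKKVLYGIFAITAL",
}

print(conservancy("YGIFAITAL", list(serovar_fragments.values())))
```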

Prediction of population coverage
Population coverage prediction is also an important step in vaccine design because it provides information about peptide binding frequencies based on Human Leukocyte Antigen (HLA) genotypic frequencies, MHC molecule binding and T cell restriction data (Bui et al., 2006). In our study, the IEDB integrated tool (http://tools.iedb.org/population/) (Bui et al., 2006) was used to predict the population coverage of our selected epitopes with their corresponding alleles.
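To give a feel for how allele frequencies translate into coverage, the sketch below treats a single diploid HLA locus under Hardy-Weinberg proportions: an individual is covered if at least one of the two alleles binds the epitope. This is a deliberate simplification and not the IEDB algorithm, which combines several loci and genotypic frequencies. The function name `locus_coverage` and the example frequencies are hypothetical.

```python
def locus_coverage(allele_frequencies: list[float]) -> float:
    """Fraction of individuals carrying at least one listed allele at one diploid locus."""
    if any(frequency < 0 for frequency in allele_frequencies):
        raise ValueError("allele frequencies must be non-negative")
    total = sum(allele_frequencies)
    if total > 1.0 + 1e-9:
        raise ValueError("allele frequencies at one locus cannot sum to more than 1")
    total = min(total, 1.0)
    # Probability that neither of the two inherited alleles is a binding allele.
    uncovered = (1.0 - total) ** 2
    return 1.0 - uncovered


# Hypothetical frequencies of epitope-binding alleles in one population.
example_frequencies = [0.10, 0.20]
print(f"{locus_coverage(example_frequencies):.2%}")
```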

Allergenicity and toxicity prediction
One of the foremost drawbacks of vaccines is that in some cases they cause allergic reactions in patients. AllerTOP v 2.0 (http://www.ddg-pharmfac.net/AllerTOP/) was therefore used to predict the allergenicity of the selected epitopes (Dimitrov, Bangov, Flower, & Doytchinova, 2014). AllerTOP v 2.0 uses the k-nearest neighbors (kNN) method to classify peptides as allergens or non-allergens. A comparison of different allergenicity prediction tools showed that AllerTOP v 2.0 is the best-performing allergenicity prediction server, with 88.7% accuracy, which is greater than that of AllergenFP v 1.0 (Dimitrov et al., 2014). To eliminate toxic epitopes, we used the ToxinPred server to predict the toxicity of our selected epitopes (Gupta et al., 2013).

Three-dimensional structure design of best epitopes and HLA proteins
After analyzing immunogenicity, conservancy, allergenicity and toxicity, the best-performing epitopes were selected and their three-dimensional structures were designed using PEPstrMOD (http://osddlinux.osdd.net/raghava/pepstrmod) (Kaur, Garg, & Raghava, 2007;Singh et al., 2015). PEPstrMOD is a modified version of PEPstr, which predicts the tertiary structure of a peptide using β-turn information and regular secondary structure (Kaur et al., 2007). We used the natural peptide (beginner) option to predict the structure, which works on the PEPstr algorithm using AMBER11. We did not find any annotated structure in the Protein Data Bank (PDB), so homology modeling was used to predict the 3-D structure of our epitope molecules using Phyre2 (Kelley & Sternberg, 2009). To predict the HLA-C*03:03 protein structure we used the protein structure prediction wizard in Maestro (Release, 2018), and for homology modelling we used Prime, both developed by Schrödinger (Jacobson, Friesner, Xiang, & Honig, 2002;Jacobson et al., 2004;Kopp & Schwede, 2004). Prime uses a combination of primary and secondary structure data to calculate alignments. Structures are built using atom positions from the aligned parts of the template(s), taking into account solvent, ligand, force field and other variables through a series of algorithms. Portions of the query that are not aligned with the template, such as loops, are built using an ab initio approach (Kopp & Schwede, 2004). After that, the predicted structure was corrected and minimized using ModRefiner (Xu & Zhang, 2011). To validate the predicted structure, PROCHECK (Laskowski, Rullmann, MacArthur, Kaptein, & Thornton, 1996), PROVE (Pontius, Richelle, & Wodak, 1996), and ERRAT (Colovos & Yeates, 1993) were used.

Docking analysis
For docking analysis, we used Fast Interaction Refinement in molecular Docking (FireDock), an online molecular docking platform that works on the shape complementarity of soft molecular surfaces. FireDock optimizes side-chain conformations and rigid-body orientations, thus providing a large degree of refinement. The user-friendly design and 3-D visualization of results further improve FireDock's usefulness (Andrusier, Nussinov, & Wolfson, 2007;Mashiach, Schneidman-Duhovny, Andrusier, Nussinov, & Wolfson, 2008). We treated the HLA molecule as the receptor and our selected epitope as the ligand in a blind manner, meaning that the epitope binds to the best possible site rather than to a predetermined site. FireDock reports results by global rank after refinement (Pradhan & Sharma, 2014).
Later, the coordinate files were converted to pdbqt format from pdb format.

Molecular dynamics simulation study and binding free energy calculations
To assess the stability of the predicted epitopes with their respective receptors, we used molecular dynamics simulation in an explicit solvent system (water) with the YASARA (Land & Humble, 2018) dynamics software and the AMBER force field (Dickson et al., 2014). The best receptor-ligand pose provided by the molecular docking computations was selected for further analysis.
For this purpose, a simulation cell was generated in a clean and optimized system and our receptor-ligand complex was placed inside it. We used an explicit TIP3P water model (density 0.997 g/mL, 25 °C, 1 atm) as the solvation system, which comprised 46406 atoms; energy was minimized with the steepest descent technique followed by simulated annealing, which allowed the system to equilibrate under the simulation conditions (25 °C, pH 7.4, and 0.9% NaCl) (Cojocaru & Clima, 2019;Krieger, Nielsen, Spronk, & Vriend, 2006). For better clarity, we omitted the water molecules from rendering. According to the calculated pKa, hydrogen atoms were added to the ionizable groups of the protein structure in accordance with the simulation pH (i.e., if the pKa is higher than the pH, one hydrogen atom is added). The pKa was calculated for each residue using the Ewald method (Krieger et al., 2006). Subsequently, the production MD simulation was run at the YASARA force field level for about 100 ns with a step size of 2.5 fs (Krieger & Vriend, 2015). The molecular motions were recorded as snapshots of simulation time frames, which make up the trajectory of the system. Snapshots of the MD trajectory were saved every 250 ps. This trajectory was then used for the analysis of RMSD, RMSF, H-bonds, total contacts and MM-PBSA binding free energy, where $G = \langle G_{intra} \rangle + \langle G_{inter} \rangle + \langle G_{pol} \rangle + \langle G_{np} \rangle - T\Delta S$ and the angle brackets denote averages over the trajectory snapshots.
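The free energy expression at the end of this paragraph can be evaluated from per-snapshot energy terms as sketched below. This only illustrates the averaging in the formula; it is not YASARA's MM-PBSA macro. The function name `mmpbsa_free_energy`, the dictionary keys and the numerical values are hypothetical, and the entropy term is passed in as a precomputed number.

```python
from statistics import mean

TERMS = ("intra", "inter", "pol", "np")


def mmpbsa_free_energy(snapshots: list[dict[str, float]], t_delta_s: float = 0.0) -> float:
    """G = <G_intra> + <G_inter> + <G_pol> + <G_np> - T*dS, averaged over snapshots."""
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    averaged_terms = sum(mean(snapshot[term] for snapshot in snapshots) for term in TERMS)
    return averaged_terms - t_delta_s


# Hypothetical per-snapshot energies in arbitrary but consistent units.
example_snapshots = [
    {"intra": 10.0, "inter": -20.0, "pol": 5.0, "np": -1.0},
    {"intra": 12.0, "inter": -22.0, "pol": 7.0, "np": -3.0},
]

print(mmpbsa_free_energy(example_snapshots, t_delta_s=2.0))
```

With snapshots saved every 250 ps over about 100 ns, the averages would be taken over roughly 400 frames.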
ABCpred predicts epitopes using a recurrent neural network with 65.93% accuracy (Saha & Raghava, 2006). The threshold value was set to 0.51 and the epitope length was set to 16mer, with the overlapping filter turned on during this prediction.

Identification of highest antigenic protein
All of the outer membrane proteins of the NTS species S. Typhimurium were evaluated with the VaxiJen v2.0 server. The protein sequence with UniProtKB ID A0A0F6BA63 showed the highest antigenicity score, 1.2685, at a threshold value of 0.4. This protein was also found in the other NTS species (S. Enteritidis, S. Choleraesuis, S. Salamae) with the same amino acid sequence but different entry IDs in the UniProtKB database. The sequence contains 80 amino acids and was used for the subsequent analyses in our study.

Identification of T cell epitopes
With the preselected settings, the web-based T cell epitope prediction tool NetCTL generated epitopes from our query protein. First, we selected the seven best epitopes (Supplementary Table 1) based on their highest combined score across all MHC supertypes.
The isolated T cell epitopes were used to predict MHC-I binding alleles by the SMM method. Epitopes with higher binding affinity (IC50 < 200 nM) were used for further study (Table 1).
Proteins are broken down into peptides by the cellular proteasome complex and presented by antigen-presenting cells (APCs) through MHC-I molecules to cytotoxic T cells. MHC-I processing tools were used to predict the overall processing score (TAP score, proteasome score, MHC-I score, processing score) of all potential epitopes generated from the protein sequence that could bind to T cells. The processing potential of a peptide depends largely on its overall score: peptides with higher values have higher processing ability.

Epitope conservancy and immunogenicity analysis
A successful vaccine depends not only on HLA allele interaction but also on the population coverage and immunogenicity of the epitopes. We therefore predicted the conservancy and immunogenicity of all predicted epitopes (Table 1). All the selected epitopes showed 100% conservancy, and the epitopes YGIFAITAL and KVLYGIFAI showed the highest immunogenicity, 0.38 and 0.273 respectively.

Population coverage analysis of selected 2 epitopes
The distribution of HLA alleles varies across the world, across ethnic communities and across geographical locations. The population coverage of the two selected epitopes YGIFAITAL and KVLYGIFAI with their respective MHC-I alleles was predicted (Figure 1). Republic showed no significant population coverage.

Allergenicity and Toxicity analysis
Allergenicity and toxicity prediction is a crucial step in vaccine design. Evidence suggests that vaccines can induce allergic reactions through the production of immunoglobulin E (IgE) and type 2 T helper (Th2) cells (McKeever, Lewis, Smith, & Hubbard, 2004). For this reason, we predicted allergenicity and toxicity with the online servers AllerTOP v 2.0 and ToxinPred, respectively. AllerTOP predicted that the epitopes SSATSVSTV, MKKVLYGIF and SVSTVSSAV are allergenic to humans, whereas the epitopes KVLYGIFAI, APVQVGEAA, YGIFAITAL and VSSAVGVAL are non-allergenic, and ToxinPred predicted that all peptides are non-toxic for human hosts (Table 2).
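The screening step described here amounts to keeping only epitopes flagged as neither allergenic nor toxic. The sketch below encodes the predictions reported in this paragraph and applies that filter; the class `Epitope` and the function `safe_epitopes` are illustrative names and are not part of AllerTOP or ToxinPred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    sequence: str
    allergenic: bool  # AllerTOP v 2.0 prediction as reported in the text
    toxic: bool       # ToxinPred prediction as reported in the text


CANDIDATES = [
    Epitope("SSATSVSTV", allergenic=True, toxic=False),
    Epitope("MKKVLYGIF", allergenic=True, toxic=False),
    Epitope("SVSTVSSAV", allergenic=True, toxic=False),
    Epitope("KVLYGIFAI", allergenic=False, toxic=False),
    Epitope("APVQVGEAA", allergenic=False, toxic=False),
    Epitope("YGIFAITAL", allergenic=False, toxic=False),
    Epitope("VSSAVGVAL", allergenic=False, toxic=False),
]


def safe_epitopes(candidates: list[Epitope]) -> list[str]:
    """Keep sequences predicted to be both non-allergenic and non-toxic."""
    return [e.sequence for e in candidates if not e.allergenic and not e.toxic]


print(safe_epitopes(CANDIDATES))
```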

Validation of predicted Protein structure:
As mentioned in the "Materials and Methods" section, the 3-D structure was predicted using homology modeling (Figure 2a). The structure was then validated using several online protein structure validation tools. As depicted in Figure 2b, the PROCHECK-generated Ramachandran plot showed that our predicted model has 91.7% of its amino acid residues in the most favoured regions, 7.9% in additional allowed regions, 0.4% in generally allowed regions and 0.0% in disallowed regions. A high-quality protein model has >90% of its amino acid residues in the core regions (Dash et al., 2016). The predicted model was then analyzed by PROVE and ERRAT. The PROVE Z-score mean was 0.187 and the Z-score RMS was 1.249 (Figure 2c).
The predicted protein structure was also validated by ERRAT, where a score above 80% is considered to indicate a good model.
On the basis of BepiPred 2.0, 2 epitopes of different lengths were generated from our selected protein sequence (Supplementary Table 3). The antigenicity of these peptides was predicted with the online server VaxiJen, where 60 TGGGDGSNTGTTTTTT 76 showed the highest score, 3.1035. Both epitopes were 100% conserved. The allergenicity and toxicity of these epitopes were also predicted with the online servers AllerTOP v2.0 and ToxinPred, respectively. The first epitope, 16 ATSVSAAPVQVGEAAGSAATSVSAGSSSATSVSTV 50, was predicted to be allergenic for humans, so it cannot be a good epitope for vaccine design. Toxicity prediction showed that both epitopes were nontoxic for humans (Supplementary Table 4).
The BCPREDS server generates potential B cell epitopes. The epitope length was fixed at 16mer with 75% classifier specificity for the prediction of these epitopes. A total of three epitopes were predicted by this tool (Supplementary Table 4). Conservancy results showed that all the epitopes have a 100% conservancy rate. Antigenicity, allergenicity and toxicity were also predicted for these epitopes with the tools described earlier. The epitope 38 SAGSSSATSVSTVSSA 53 was predicted to be allergenic for humans, so it could not be a good epitope for vaccine design. Considering all the epitopes, 59 AATGGGDGSNTGTTTT 74 could be a good peptide for vaccine design, as it is non-allergenic and non-toxic and has the highest antigenicity score, 2.9631 (Supplementary Table 4).
The five most conserved B cell epitopes were found by searching for the regions shared among the BepiPred 2.0, BCPREDS and ABCpred predictions (Table 2). These five peptides are 100% conserved. The epitopes 28 EAAGSAATS 36 and 39 AGSSSATSVSTV 50 were predicted to be allergenic by the AllerTOP server, so they could not be good vaccine candidates (Table 2). On the other hand, the epitopes 21 AAPVQVGEAAGS 32, 61 TGGGDGSNT 69 and 61 TGGGDGSNTGTTTT 74 were predicted to be non-allergenic and nontoxic for humans (Table 2). The epitope 61 TGGGDGSNT 69 showed the highest antigenic score predicted by the VaxiJen server, so it could be used as the best B cell epitope for designing an effective vaccine.
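One simple way to find the region shared by two predictors is a longest-common-substring search over their predicted peptides. The sketch below applies it to the BepiPred 2.0 and BCPREDS peptides quoted in the text; it is our illustration of the comparison and not necessarily the procedure the authors followed. ABCpred output is not listed in the text, so it is left out, and the function name is hypothetical.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous stretch of residues present in both peptides."""
    best_length, best_end = 0, 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous residue of both peptides.
                current[j] = previous[j - 1] + 1
                if current[j] > best_length:
                    best_length, best_end = current[j], i
        previous = current
    return a[best_end - best_length:best_end]


bepipred_peptide = "TGGGDGSNTGTTTTTT"  # reported BepiPred 2.0 epitope
bcpreds_peptide = "AATGGGDGSNTGTTTT"   # reported BCPREDS epitope

print(longest_common_substring(bepipred_peptide, bcpreds_peptide))
```

For these two peptides the shared stretch is TGGGDGSNTGTTTT, which agrees with the consensus epitope reported at positions 61 to 74.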

MD simulations
MD simulation was tested for the stability of

Discussion
Identification of a suitable and effective vaccine candidate is the first step in designing an effective vaccine. A vaccine is a pharmacological product that can provide the best cost-benefit ratio in fighting diseases. Efficient vaccine development and manufacturing, however, are expensive and can take years to complete. Recent advances in bioinformatics have made it easier for researchers to identify probable vaccine candidates within a short time using bioinformatics tools and approaches (María, Arturo, Alicia, Paulina, & Gerardo, 2017). In this study, the total outer membrane proteins of iNTS species (S. Typhimurium, S. Enteritidis, S. Choleraesuis, S. Salamae) were retrieved and screened. Interestingly, we found a common protein with the highest VaxiJen score among these iNTS species. The selected antigenic protein was then analyzed with various immunoinformatic tools.
T cells play a major role in protective immunity against various pathogens (Esser et al., 2003). Initially, seven potential nonameric T cell epitopes were generated. The online server NetCTL 1.2 was used to predict the T cell epitopes. At a threshold value of 0.4, the NetCTL 1.2 server predicts the highest number of epitopes within a distinct specificity and sensitivity, considering all MHC-I supertypes (Larsen et al., 2007). The selected epitopes with their IC50 values are presented in table 1. Finally, we selected the two most promising epitopes based on several parameters, namely MHC-I binding efficiency, conservancy, immunogenicity, allergenicity and toxicity.
According to these parameters, YGIFAITAL and KVLYGIFAI were identified as the best epitopes.
We used molecular docking simulations and MM-PBSA methods to validate our selected epitopes with both HLA-C*03:03 and HLA-DRB1*04:01 proteins. HLA-C*03:03 and HLA-DRB1*04:01 showed higher affinity, and we therefore selected these MHC-I and MHC-II proteins. The epitope YGIFAITAL attained the maximum binding affinity with both MHC proteins. The MM-PBSA free energy calculation also supports this higher affinity. Before proceeding, we prepared the three-dimensional structure of the HLA-C*03:03 protein using the Schrödinger protein structure building software Prime (Jacobson et al., 2002;Jacobson et al., 2004). In this process, the crystal structure of HLA-B*07:02 in complex with an NY-ESO-1 peptide (PDB ID 6AT5) showed the maximum similarity.
We calculated the RMSD from the MD simulation to observe the stability of the epitope-HLA complexes. A higher RMSD value represents higher flexibility of HLA-epitope binding (Dash et al., 2017;Fu, Zhao, & Chen, 2018). In complex with HLA-C*03:03, YGIFAITAL was much more stable than KVLYGIFAI throughout the simulations. However, YGIFAITAL showed less stability with HLA-DRB1*04:01, whereas KVLYGIFAI showed better stability with this receptor allele. We also calculated the RMSF for further evaluation. Higher RMSF values indicate mobility of the epitope, and lower RMSF values indicate stability of the epitope in complex with HLA (Dash et al., 2017). The lower RMSF of YGIFAITAL in complex with HLA-C*03:03 indicates the greater stability of this epitope, and the higher values of KVLYGIFAI reflect the flexibility of that epitope with the same HLA.
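For readers unfamiliar with the two measures, the sketch below computes RMSD of one frame against a reference and per-atom RMSF over a trajectory. It assumes the frames have already been superposed on the reference, which a real analysis would do first, and it is not the YASARA analysis macro. The function names and the toy coordinates are hypothetical.

```python
import math

Coord = tuple[float, float, float]


def rmsd(frame: list[Coord], reference: list[Coord]) -> float:
    """Root-mean-square deviation between two equally sized, pre-aligned frames."""
    if not frame or len(frame) != len(reference):
        raise ValueError("frames must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q))
    return math.sqrt(squared / len(frame))


def rmsf(trajectory: list[list[Coord]]) -> list[float]:
    """Per-atom root-mean-square fluctuation around each atom's mean position."""
    n_frames = len(trajectory)
    values = []
    for atom in range(len(trajectory[0])):
        mean_position = [sum(f[atom][k] for f in trajectory) / n_frames for k in range(3)]
        fluctuation = sum(
            sum((f[atom][k] - mean_position[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        values.append(math.sqrt(fluctuation))
    return values


# Toy trajectory: two frames, two atoms; only the second atom moves.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]

print(rmsd(toy_trajectory[1], toy_trajectory[0]))
print(rmsf(toy_trajectory))
```

In the toy example the stationary atom has zero RMSF while the moving atom does not, which mirrors the interpretation above that lower RMSF corresponds to a more stable residue.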
The number of hydrogen bonds was calculated because this is another significant factor influencing protein stabilization. A higher number of hydrogen bonds represents higher stability of the ligand with the protein (Fu et al., 2018). In this study, the epitope YGIFAITAL showed the maximum number of hydrogen bonds with both HLA proteins, which indicates the better binding ability of the YGIFAITAL epitope. In the total contact analysis, the epitope YGIFAITAL showed a much higher number of contacts with both HLA proteins, whereas KVLYGIFAI showed fewer contacts with the selected HLA proteins.
Analysis of population coverage is also an important factor in developing effective vaccines because HLA alleles differ between ethnic groups and between geographical regions. Coverage of a wide range of populations must therefore be considered during vaccine design. Our selected epitopes showed a variable range of population coverage across world populations (Figure). The highest population coverage was seen in Mexican Amerindian populations. This result suggests that the proposed vaccine can be applied to the populations shown in the figure.
Identification of B cell epitopes is also a key criterion in effective vaccine design. B cells produce humoral immunity, which is strong and highly effective (Adhikari et al., 2018). In this study, the most conserved B cell epitopes were selected for use as vaccine candidates (table). We found that the most common conserved sequence across all of the prediction methods spans positions 61-74.
Our in silico outcomes, however, were based on careful sequence analysis and analysis of different genetic databases. This type of research has previously been validated experimentally (Khan et al., 2014), and we therefore suggest that the proposed epitopes could elicit an effective immune response as a peptide vaccine in vitro.

Conclusion
Non-typhoidal Salmonella species are responsible for several invasive diseases, and the associated mortality rates are high. In addition, these bacteria are becoming resistant worldwide.
Owing to their substantial antibiotic resistance profile and mortality rates, iNTS species are of great concern to scientists. Considering the benefits of peptide vaccines, in this study we used immunoinformatic approaches to design an effective vaccine against iNTS species. We predicted both T cell and B cell epitopes to initiate immunity against iNTS bacteria. We also evaluated the efficiency, antigenicity, allergenicity and toxicity of the selected epitopes to validate them. Molecular docking and MD simulation calculations were also performed to validate the ligand-receptor binding efficiency and stability. We believe that our vaccine candidates will be effective against iNTS species, but further research and in vitro and in vivo validation are essential to produce an effective vaccine.
Author Contribution:

Conflicts of interest:
All authors declare that there are no conflicts of interest.