COVID-19 Vaccine Candidates by Identification of B and T Cell Multi-Epitopes Against SARS-COV-2

Coronavirus disease 2019 (COVID-19) is a newly identified disease. WHO officially named the disease COVID-19, while the virus responsible for it is called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The incubation period of the disease is up to 14 days. Clinical symptoms commonly reported around the world include fever, cough, fatigue, diarrhoea and vomiting, although some infected people remain asymptomatic. Infection is spread mainly through respiratory droplets. In early March 2020, WHO declared COVID-19 a pandemic, for which there is currently no specific treatment. Here we assess the potential of the SARS-CoV-2 proteome as a source of vaccine candidates by analysing B-cell and T-cell antigenicity with an immunoinformatics approach, as an early stage of vaccine development. We used consensus sequences of the SARS-CoV-2 proteome retrieved from the NCBI database. VaxiJen 2.0 was used to assess the antigenicity of SARS-CoV-2 proteins. IEDB was then used to predict B-cell epitopes, and T-cell immunogenic epitopes in SARS-CoV-2 proteins were predicted with MHC class I and class II tools, accessed through the ProPred-1 server and the MHC II Binding Prediction tool in the IEDB database, respectively. The best B-cell and T-cell epitopes were predicted to have high antigenicity, and the information is disseminated through a web-based database resource (https://covid19.omicstutorials.com/epitopes/). This study will be useful for finding new epitope-based vaccine candidates for SARS-CoV-2. However, further study is needed for the next stages of vaccine development.


Introduction
The causative agent of COVID-19 is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). SARS-CoV-2 is an enveloped β-coronavirus and belongs to a large group of positive-sense single-stranded RNA viruses that cause disease in humans as well as animals. The SARS-CoV-2 proteome consists of the spike (S), envelope (E), nucleocapsid (N), membrane (M), orf3a, orf6, orf7a, orf8 and orf10 proteins (Srinivasan et al. n.d.).
Despite its low mortality rate, the virus is highly virulent and infectious. Signs of COVID-19 appear after about 5.2 days of incubation. The most typical symptoms of COVID-19 are fever, coughing and exhaustion; other indications include sputum production, nausea, haemoptysis, diarrhoea, shortness of breath and lymphopenia. SARS-CoV-2 is thought to invade the lung alveolar epithelium, entering cells through endocytosis mediated by the angiotensin-converting enzyme II receptor. Patients hospitalized with COVID-19 showed increased leukocyte counts, abnormal breathing patterns and higher plasma levels of proinflammatory cytokines (Kumar 2020a).
Effective vaccination is important to contain the SARS-CoV-2 pandemic. Vaccine trials are ongoing, but vaccine development can take several months to years. Much of the ongoing research concentrates on the SARS-CoV-2 spike protein (Abraham Peele et al. 2020; Bhattacharya et al. 2020; Kumar 2020b, 2020c). The existence in humans of pre-existing memory T cells capable of recognizing SARS-CoV-2 is poorly understood (Chen and John Wherry 2020; Grifoni et al. 2020). Such knowledge is urgently needed: it will aid the design of vaccines and facilitate the evaluation of the immunogenicity of vaccine candidates. Much of the epitope research so far has focused on the spike protein and has covered MHC-I and MHC-II alleles insufficiently (Naz et al. 2020).
Many studies have successfully identified drug and vaccine candidates for various bacteria and viruses using a reverse vaccinology approach (Kumar 2011, 2015; Kumar and Ramanujam 2020; Omeershffudin and Kumar 2019). In this study, peptide vaccine candidates were identified by reverse vaccinology, which aims to classify possible vaccine candidates by considering the entire proteome of the virus. Binding was predicted for MHC-I (47 alleles) and MHC-II (a reference set of 27 alleles), and epitopes with high antigenicity were retained. All epitopes are provided in a publicly available epitope database (https://covid-19.omicstutorials.com/epitopes/) that covers binders to the HLA class I and HLA class II alleles across the whole SARS-CoV-2 proteome.

B-cell epitope prediction
The collected consensus sequences of the SARS-CoV-2 surface glycoprotein (S), nucleocapsid phosphoprotein (N), membrane glycoprotein (M), ORF1a and ORF1ab, ORF3a, ORF6, ORF7a, ORF8 and ORF10 were used for B-cell epitope prediction. Prediction was performed with BepiPred Linear Epitope Prediction 2.0 (Anon n.d.), available at the Immune Epitope Database and Analysis Resource (IEDB), with a threshold of 0.55. Only predicted sequences longer than seven amino acids were considered.
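The selection rule described above (score threshold of 0.55, peptides longer than seven residues) can be illustrated with a minimal Python sketch. It assumes that per-residue scores have already been exported from the prediction tool. The function name, the toy sequence and the toy scores are hypothetical and are not taken from the study, and treating a score equal to the threshold as positive is also an assumption.

```python
from typing import List, Tuple


def linear_epitopes(sequence: str, scores: List[float],
                    threshold: float = 0.55,
                    min_length: int = 8) -> List[Tuple[int, str]]:
    """Return (1-based start, peptide) for each run of consecutive residues
    scoring at or above `threshold` that is at least `min_length` long.

    min_length=8 encodes "more than seven amino acids".
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")

    epitopes = []
    start = None  # index where the current above-threshold run began
    for i, score in enumerate(scores):
        if score >= threshold:  # assumption: boundary value counts as positive
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                epitopes.append((start + 1, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_length:
        epitopes.append((start + 1, sequence[start:]))
    return epitopes


# Hypothetical toy data, for illustration only.
toy_sequence = "ACDEFGHIKLMNPQRSTVWY"
toy_scores = [0.40, 0.60, 0.62, 0.58, 0.57, 0.70, 0.66, 0.59, 0.61, 0.30,
              0.56, 0.58, 0.60, 0.50, 0.45, 0.40, 0.42, 0.41, 0.39, 0.38]

print(linear_epitopes(toy_sequence, toy_scores))
```

In this toy input, the eight-residue run is kept and the later three-residue run is discarded because it is too short.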

Prediction of T cell binding epitopes
The ProPred-I server (Singh and Raghava 2002) was used with its default threshold to predict HLA class I epitopes that bind to 47 HLA-I alleles. For HLA class II epitope prediction, the MHC-II binding prediction tool available in IEDB (Andreatta et al. 2015) was used. The IEDB-recommended prediction method was selected, with the full HLA reference set of 27 alleles, the default peptide length of 15 and the adjusted rank.
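Binding predictors of this kind generally score every overlapping peptide of a fixed length along the protein. As a rough illustration of the 15-residue window mentioned above, the following Python sketch enumerates those candidate peptides. The function name and the toy sequence are hypothetical; the sketch only shows how the peptides are generated and does not reproduce any scoring performed by the prediction servers.

```python
from typing import Iterator, Tuple


def overlapping_peptides(sequence: str,
                         length: int = 15) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, peptide) for every window of `length` residues.

    A sequence shorter than `length` yields nothing.
    """
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


# Hypothetical 20-residue toy sequence, for illustration only.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
for position, peptide in overlapping_peptides(toy_protein):
    print(position, peptide)
```

A protein of n residues gives n - 14 candidate 15-mers, each of which would then be evaluated against every allele in the reference set.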

Antigenicity analysis
The predicted epitopes were analysed with the VaxiJen server, version 2.0 (Doytchinova and Flower 2007), using the virus model, to assess their antigenicity. Epitopes with a VaxiJen score of 0.5 and above were considered to have high antigenicity.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 4 August 2020 doi:10.20944/preprints202008.0092.v1
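The antigenicity cut-off of 0.5 and above amounts to a simple filter over the scored epitopes. The Python sketch below assumes that the scores have been collected into a dictionary keyed by an epitope identifier. The identifiers, the scores and the function name are hypothetical placeholders, not results from the study.

```python
from typing import Dict, List, Tuple


def filter_antigenic(scores: Dict[str, float],
                     threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Keep epitopes scoring at or above `threshold`, highest score first."""
    kept = [(name, score) for name, score in scores.items()
            if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Hypothetical scores, for illustration only.
toy_antigenicity = {
    "epitope_1": 0.72,
    "epitope_2": 0.31,
    "epitope_3": 0.50,
    "epitope_4": 1.05,
}

for name, score in filter_antigenic(toy_antigenicity):
    print(name, score)
```

Sorting by descending score also makes it easy to pick only the highest-scoring epitopes for a downstream resource.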

Data retrieval
A total of spike glycoprotein (100 sequences

B-cell epitope prediction
Based on the prediction of linear B-cell epitopes from the IEDB server, followed by antigenic propensity prediction with the VaxiJen server, we identified 12 epitopes for the spike protein, 5 epitopes for the nucleoprotein, 1 epitope for the membrane protein, 1 epitope for the envelope protein, 81 epitopes for orf1ab, 2 epitopes for orf3a, 1 epitope for orf6 and 1 epitope for orf7a.

Discussion
In our study, we used all structural, non-structural and accessory proteins of SARS-CoV-2 for B-cell and T-cell epitope prediction, using various computational tools through an immunoinformatics approach (Patronov and Doytchinova 2013; Raoufi et al. 2020; Tomar and De 2010). We also used all 47 alleles for MHC-I and 27 alleles for MHC-II binding prediction. This identification will be crucial for vaccine design against COVID-19. In B-cell epitope prediction (Bettencourt et al. 2020; Prachar et al. 2020; Rock, Reits, and Neefjes 2016), 12 epitopes for the spike protein, 5 epitopes for the nucleoprotein, 1 epitope for the membrane protein, 1 epitope for the envelope protein, 81 epitopes for orf1ab, 2 epitopes for orf3a, 1 epitope for orf6 and 1 epitope for orf7a were predicted with an antigenicity score of 0.5 and above. We discarded the epitopes predicted to be non-antigenic.
Similarly, for T-cell epitope prediction of MHC-I (Smith et al. 2020), we identified 92 epitopes for the spike protein, 6 epitopes for the nucleoprotein, 36 epitopes for the membrane protein, 6 epitopes for the envelope protein, 68 epitopes for orf1ab, 5 epitopes for orf3a, 16 epitopes for orf6, 15 epitopes for orf7a, 44 epitopes for orf8 and 17 epitopes for orf10 with an antigenicity score of 0.5 and above, and discarded those predicted to be non-antigenic. We selected only the best epitopes, those with the highest antigenicity scores, for construction of the database resource. The evaluated epitopes can be used against SARS-CoV-2 as a multi-epitope vaccine candidate for COVID-19.

Conclusion
The epitope prediction evaluated both B-cell and T-cell epitopes across the entire SARS-CoV-2 proteome to identify epitopes with high antigenicity. The MHC-I epitopes predicted to bind 47 alleles and the MHC-II epitopes predicted to bind 27 alleles will help in understanding the mechanism of pathogenesis. However, the designed epitopes need further wet-lab validation.