Some findings on genes over SARS-CoV2 genomes

Coronaviruses are a large family of RNA viruses that cause respiratory infections ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS) and COVID-19. This article highlights some key findings from a thorough scan of the genes of 475 SARS-CoV2 genomes, including the co-presence of ORF7a and ORF8 in 256 SARS-CoV2 genomes and the absence of the gene ORF7b in 219 SARS-CoV2 genomes collected from various countries, including India. The gene ORF7b is found in the SARS-CoV2 genomes containing the L-type strain, which is reported to have much higher virulence than the S-type strain.


Introduction
The outbreak of SARS-CoV2, a novel coronavirus, has now become a pandemic [1,2]. The genome of SARS-CoV2 contains approximately 30 kb of nucleotides, and each genome contains around 11 genes of various types, such as S, E, M and N [3]. The SARS-CoV2 genome includes the structural protein gene S, whose product specifically binds to the receptor of the host cell; this is the key protein by which the virus invades susceptible cells. The genes M and E are involved in the formation of the virus envelope, while the gene N is involved in the assembly of the virus [4]. The origin of SARS-CoV2 and its intermediate host are still controversial [5]. Phylogenetic analysis implied that the coronavirus was most similar to Bat coronavirus isolate RaTG13 (GenBank No.: MN996532), with 96.2% nucleotide homology across the whole genome [5]. It is also reported that SARS-CoV2 was closely related (with 88% identity) to two bat-derived severe acute respiratory syndrome (SARS)-like coronaviruses, bat-SL-CoVZC45 and bat-SL-CoVZXC21, collected in 2018 in Zhoushan, eastern China, but was more distant from SARS-CoV (about 79%) and MERS-CoV (about 50%) [6]. It is reported that the genome sequences MT050493 and MT012098 from India are highly similar to the genome of the Wuhan seafood market pneumonia virus (accession number: NC_045512) [7]. Other recent findings and the present state of the art, including reviews, can be found in various articles [8,9,10,11,12,13,14,15,16,17,18].
Coronavirus replication is mediated by a set of highly conserved viral proteins. Among the eight putative accessory proteins encoded by SARS-CoV, only two, ORF3a and ORF7a, have been identified in virus-infected cells [19]. The ORF7b gene is expressed in virus-infected cell lysates and from a cDNA encoding the gene 7 coding region, indicating that the sgRNA7 is bicistronic. The ORF7b protein is not only an accessory protein but also a structural component of the SARS-CoV virion [19].
A deep scan of these genomes and their associated genes is necessary for various reasons, including the study of pathogenesis [3]. In this article, an attempt has been made to identify the gene variations among the 475 SARS-CoV2 genomes. This article reports that a pair of genes, viz. ORF7a and ORF8, is present across almost all the SARS-CoV2 genomes, except the MN988668, MN988669 and MT121215 genomes from China. It is also found that ORF7b is absent from 219 SARS-CoV2 genomes, such as MT050493.
It is noteworthy that variation in coronaviruses depends on the percentage of mutations and, as per the analysis, the L-type strain is dominant where higher mortality is reported. The presence of the gene ORF7b seems to be associated with the genomes containing the L-type strain.
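A gene scan of this kind can be sketched as follows, under stated assumptions. The sketch reads the `/gene="..."` qualifiers from GenBank flat-file records and lists the genomes whose annotation lacks a given gene. It is not the pipeline used in this article: the function names and the toy record are hypothetical, and the parsing assumes each record carries an ACCESSION line and ends with `//`.

```python
import re

# Hypothetical toy records in GenBank flat-file layout (not real data).
TOY_GENBANK = '''\
LOCUS       TOY00001    100 bp    RNA
ACCESSION   TOY00001
FEATURES             Location/Qualifiers
     gene            1..30
                     /gene="ORF7a"
     gene            40..60
                     /gene="ORF8"
//
LOCUS       TOY00002    100 bp    RNA
ACCESSION   TOY00002
FEATURES             Location/Qualifiers
     gene            1..30
                     /gene="ORF7a"
     gene            28..45
                     /gene="ORF7b"
     gene            50..80
                     /gene="ORF8"
//
'''

GENE_QUALIFIER = re.compile(r'/gene="([^"]+)"')
ACCESSION_LINE = re.compile(r'^ACCESSION\s+(\S+)', re.MULTILINE)


def genes_by_accession(genbank_text):
    """Map each accession to the set of gene names annotated in its record."""
    result = {}
    for record in genbank_text.split("\n//"):
        match = ACCESSION_LINE.search(record)
        if match is None:
            continue  # skip the trailing chunk or malformed records
        result[match.group(1)] = set(GENE_QUALIFIER.findall(record))
    return result


def genomes_missing(gene, gene_sets):
    """Return the sorted accessions whose annotation lacks the given gene."""
    return sorted(acc for acc, genes in gene_sets.items() if gene not in genes)


gene_sets = genes_by_accession(TOY_GENBANK)
print(genomes_missing("ORF7b", gene_sets))  # ['TOY00001']
```

A scan like this reports only what the annotation lists, so a gene that is present in the sequence but left unannotated would be counted as missing.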

• Of the 475 SARS-CoV2 genomes, 219 do not contain the ORF7b gene. The list of these genomes, with their respective geographic locations, is given in Table 3. The absence of ORF7b over the 219 genomes across different countries is given in Table 2. It is to be noted that 55.25% of the SARS-CoV2 genomes from the USA contain the ORF7b gene, which may be an indication of the strong pandemic situation in the USA.
• The gene E, of length 228 nucleotides, is present in all 475 SARS-CoV2 genomes. [...] provide the rate of SARS-CoV2's infectivity and mortality in a country.
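The per-country figures quoted above can be tallied with a few lines of code. The sketch below is illustrative only: the function name, the record layout and the toy records are hypothetical, and it assumes the annotated gene set of each genome has already been extracted. The last line checks the stated counts: 219 of 475 genomes lack ORF7b, which leaves 256 that contain it.

```python
from collections import defaultdict


def presence_percent_by_country(records, gene):
    """Return {country: percentage of its genomes annotated with `gene`}."""
    total = defaultdict(int)
    present = defaultdict(int)
    for _accession, country, genes in records:
        total[country] += 1
        if gene in genes:
            present[country] += 1
    return {c: round(100.0 * present[c] / total[c], 2) for c in total}


# Hypothetical toy records: (accession, country, annotated genes).
toy_records = [
    ("TOY00001", "A", {"E", "ORF7a", "ORF8"}),
    ("TOY00002", "A", {"E", "ORF7a", "ORF7b", "ORF8"}),
    ("TOY00003", "B", {"E", "ORF7a", "ORF7b", "ORF8"}),
    ("TOY00004", "A", {"E", "ORF7a", "ORF7b", "ORF8"}),
]

percentages = presence_percent_by_country(toy_records, "ORF7b")
print(percentages)  # {'A': 66.67, 'B': 100.0}

# Counts stated in the text: 475 genomes in total, 219 without ORF7b.
print(475 - 219)  # 256
```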