Introduction
Microbial evolution has played a major role in the onset of public health crises that have caused a significant number of disease cases and deaths worldwide, as numerous polymorphic infectious agents have evolved by adapting their genes to environmental changes, represented in this case by the adaptation of human and animal immunity to such infectious events. Currently, it seems that several polymorphic microbes tend to gain an evolutionary advantage over the innate immune system more than over the adaptive immune system. The latter is known as the “effector” immune department, as it directly counteracts the pathophysiological effects of the microbe, whilst the former is known as the “inducer” or “bridging” immune department, helping the adaptive immune system undergo a timely and proportionate activation process. Given the current imbalance between advanced microbial evolution and under-competing human first-line and second-line immunity, it may now be more important to develop software and algorithms capable of projecting the emergence of polymorphic microbial variants with novel, advanced capabilities of hijacking human innate immunity, which plays a critical role in creating a competent bridge to the core adaptive immune elements and pathways. Viral genomes express specific proteins that prevent their detection by Pattern-Recognition Receptors (PRRs), such as Toll-Like Receptors (TLRs) 3, 7 and 8, as well as Melanoma Differentiation-Associated protein 5 (MDA5) and Retinoic acid-Inducible Gene I (RIG-I), leading to a lack of detection of both Pathogen-Associated Molecular Patterns (PAMPs), which constitute part of the viral genome, and Damage-Associated Molecular Patterns (DAMPs), which are molecules released by host cells damaged or stressed during the infection.
As a result, the overall activation of genes encoding the interferon system may be delayed for a significant amount of time, leading to an unrestricted increase and distribution of the viral load during the initial stages of infection and favouring the onset of severe immunopathogenesis, particularly in people with one or more underlying health conditions that already impact the quality of their immune activation. Often, such viral genomes synthesise proteins that directly antagonise the activation of the interferon system and of its downstream cytokine pathways, and this initial interferon deficiency leads to an overproduction of interferon glycoproteins once cells begin detecting the viral load, which often occurs only by the time the viral load is present in multiple tissues, organs and even organ systems. A highly relevant example may be observed in the case of the recent SARS-CoV-2-induced COVID-19 pandemic, in which the virus was seen to produce non-structural proteins (NSPs), such as NSP1, NSP10, NSP14 and NSP16. Whilst NSP10, NSP14 and NSP16 were observed to interact with each other and indirectly inhibit the interferon system by preventing the activation of various important PRRs, NSP1 was observed to directly and independently inhibit the host interferon-based signalling process. NSP1 may be a major viral protein, not only in the case of the COVID-19 pandemic, but also in various Influenza A disease variants, as it was discovered that NSP1 binds to the NSP1-binding protein (NSP1-BP) and decreases the quality of NSP1-BP’s molecular activity in a manner that benefits the viral non-structural protein in question. Furthermore, with each round of viral mutation, it was observed that the virus gained fresh capabilities of inhibiting the interferon response, despite previously built immune memory, which often resulted in repeated contraction of clinical disease.
Such events of repeated contraction of infectious clinical disease have been broadly observed in many other infectious diseases caused by polymorphic viruses, such as the Influenza A Virus (IAV). There may be a proportional relationship between the duration of the pre-symptomatic stage of an infectious disease, such as SARS-CoV-2-induced COVID-19, and an increased incidence of severe pathogenesis and pathophysiology, given that direct and indirect microbial self-camouflaging appears to play a major role in transiently preventing the bridging between innate and adaptive immune activation, whose effects can sometimes be catastrophic for the quality of the eventually induced overall immune response, particularly for patients with underlying immune and perhaps also non-immune health conditions. It is also possible that the incidence of post-COVID-19 syndrome, which is a chronic inflammatory disease that occurs as a result of autoimmune activation, is likewise proportional to the length of the pre-symptomatic phase of the viral disease. Alongside the build-up of AI-derived statistical research regarding the probabilities that microbial SNPs may directly induce host pathophysiology and impact the integrity and quality of the adaptive immune response, it may also be crucially important for bioinformaticians to develop software and statistical projection algorithms for predicting future mutations that enable the virus to freshly inhibit host innate immune activation, as this can often lead to a repeated onset of clinical infectious disease. Such programs and algorithms may also be developed to predict zoonotic events that may be statistically probable to lead to epidemic or even pandemic outbreaks.
Just as the progressive development of technology and AI has helped catalyse the process of molecular diagnostic procedures, such as Polymerase Chain Reaction, by turning it from traditional (PCR) to real-time (RT-qPCR), so it may be possible, in a similar fashion, for bioinformaticians and biostatisticians to catalyse the process of projecting statistical probabilities concerning the onset of future epidemics or even pandemics caused by specific microbial agents. Even such a process of Real-Time qPCR has been undergoing considerable optimisations to make it even faster and more accurate than during the initial stages of the RT-qPCR molecular testing implementation (Gadkar, V.y, and Filion, M., 2014).
Discussion
Paradoxically, Single-Nucleotide Polymorphisms (SNPs) constitute both critical ways of biological adaptation seen throughout evolutionary history, as well as a concerning type of genetic mutations that have been occurring in diverse life forms, with consequences that often severely impact the lifespan of both the life form itself and the lifespan of other surrounding organisms and microorganisms. SNPs are known to involve three types of genetic mutations with regards to the structure of the gene: insertions, deletions and substitutions, as well as three types of mutations with regards to gene functionality: silent, missense and nonsense, with the latter two often bringing a severe molecular impact upon the production of the involved proteins. Nonetheless, SNPs are also known to bring effective results for specific life forms, helping them better adapt to novel environmental changes and remain in the circle of natural selection. In the areas of virology and epidemiology, SNPs are widely known to involve the creation and adaptation of novel variants of microorganisms, becoming or remaining at least as capable of causing severe clinical illness in their target organisms as before. For example, it has recently been discovered that the emergence of novel variants of microbes through recurring SNP events bring an impact of a continuing pathogenesis of oncogenic diseases so major that the phenomenon has been regarded as a hallmark of cancer induction (Lythgoe et al., 2022). The topic of frequent occurrence of SNP events in microbial agents could not have been more relevant for the recent SARS-CoV-2-induced COVID-19 pandemic, as it has been recorded that SNP events in various important parts of the viral genome occurred during each round of variant emergence and natural selection, which often involved a weakening of the average induced pathophysiology, but an increase of the molecular abilities of the virus to evade the host interferon system (Suh S. et al., 2022). 
A major sign of such an aspect is that important early variants of the SARS-CoV-2, highly polymorphic viral agent, were discovered through an SNP genotyping molecular testing procedure (Harper H. et al., 2021). A similar trend may be observed with regards to the IAV agent, which is known to cause the flu respiratory disease, as researchers discovered viral variants using SNP genotyping as well (Duan S. et al., 2011). Furthermore, it was discovered that the IFN-gamma-encoding human gene underwent SNP during adaptation to the H5N1 strain of the IAV virus, which helped the immune system become resistant to such viral strain (Bin Ji et al., 2015). It may likewise be that SNP changes are also at the core of human adaptation to microbial evolution against host immunity. Likewise, virologists, epidemiologists, clinical bioinformaticians and biostatisticians could share their clinical field for the purpose of helping scientific research project the occurrence of certain moments in which an SNP event causative of significant microbiological adaptation occurs in the future. Given the fact that numerous polymorphic viruses have been prioritising a targeting of the first- and second-line immune weak points in their adaptation process against the central, third-line host immunity, it could be important to analyse the history of SNP events in viral genes encoding proteins that bring such molecular effects, to ultimately project the occurrence of future SNP events of potential concern for the evolutionary state of the host innate immune system. The innate and adaptive departments of the host immune system both play their unique role in shaping the overall defence system of the host organisms, and the overall clinical objective of improving the human evolutionary trajectory against pathogenic agents with tricky ways of evading the immune system is to help create a better and wider bridge between the two immune departments, potentially describing a “United Immune System” concept. 
Such an aspect may involve a direct address of the weak evolutionary points that are likely existent in important first-line and second-line, innate immune systems, such as the interferon system and locally even the complement system, whilst continuing efforts to develop therapeutic and immunising agents that continue to aid the third-line, adaptive immune system build more effective signals, as well as a wider, more durative and fortified memory. Firsthand, it had been theorised that the innate immune system brings speedy and broad-spectrum effects of microbial clearance and that it lacks specificity and memory, unlike the adaptive immune system, which is deemed as the immune department having specificity and displaying long-term memory against certain pathogenic agents. Nonetheless, it was only recently discovered that the innate immunity actually displays considerable extents of both “specificity” and “memory”, and that the adaptive immunity also displays “unspecificity” after all. Likewise, there may now be a possibility that vaccine-based research efforts will bring a wider inclusion of major innate immune elements, such as Type I and Type III Interferon glycoproteins, as they would not only represent vaccine adjuvants, but immunising agents themselves. It may be through such a wider inclusion of innate immunity into vaccinology that the concept of “United Immune System” will be successfully implemented over time (Carp T., 2024).
Underlying health conditions, both immune and non-immune in nature, have together contributed to the phenomenon of delayed and exaggerated Type I and Type III Interferon-based autocrine and paracrine signalling, leading to a disrupted balance between pro-inflammatory and anti-inflammatory cytokines as a result of a more chaotic signalling rate to the genes that synthesise such cytokines. Type I and Type III Interferons may be deemed pre-cytokine immune glycoproteins, whilst Type II Interferons would more likely be deemed part of the cytokine family, given that the production of Type II Interferons depends upon the extent of activation of Interferon-Stimulated Genes (ISGs), which themselves synthesise various cytokines. ISG activation results from the autocrine and paracrine signalling of Type I and Type III IFN glycoproteins to the IFNAR1/IFNAR2 receptor complex and the IFNLR1/IL10R2 receptor complex respectively, followed by a signal transduction process that ends with the phosphorylation of the transcription factors for the ISG family. Hence, it may be important to differentiate between Type II Interferons, which are produced by natural lymphocytes as a result of their ISG cytokine protein-based recruitment, and Type I and III Interferons, which are produced and signalled by several types of immune cells, such as plasmacytoid dendritic cells (pDCs). As a result, delays in the timing of IFN I and III synthesis, exocytosis and signalling may bring the most problematic effects of delay and exaggeration of the overall immune activation, as the quality of IFN II-based expression seems to depend upon the quality of IFN I and III-based production and successful signalling.
Whilst it is important to mention that a normalised synthesis of Type II Interferons is more directly responsible for a normalised recruitment of adaptive lymphocytes that include major types of B- and T- Lymphocytes, it is as important to mention that Type I and Type III IFNs being at the highest point of the foundation of immune activation ultimately plays the most important role in the adequate synthesis of IFN II, meaning that a normalised recruitment of adaptive lymphocytes is ultimately mostly dependent upon the adequate activation of Interferon-encoding genes (INGs) that express classes of Interferon glycoprotein as such. In short, it is statistically probable that Type I and Type III Interferons are included in the group of immune proteins that are most foundational for the adequate development of the bridge between the innate and the adaptive immune departments as soon as the initial infective moment occurs, and it is possible that the best approach paving the way to a successful buildup of a foundational immunological tree implicates the detection of the immune elements that play the most and the least foundational roles respectively. The classification of immune elements as pre-cytokine, cytokine and post-cytokine could further support such scientific and intellectual efforts, as they can only increase the resolution of an immunological tree as such. It is thereby critical for such AI-derived algorithms to include the potential scientific fact above in order for the best research efforts for the development of pharmaceutical, therapeutic and vaccine-based approaches to ultimately be performed. 
It seems that an evolutionary push for the adaptation of human immunity to the SNP changes of polymorphic viruses in their genes suppressive of human Type I and Type III IFN-encoding genes (INGs) involves two parts: a more direct, immune system-based approach that frontally increases the sensitisation of such INGs, foundationally by using recombinant Type I and Type III IFNs, and a less direct, pathogen-based approach that helps the process of the sensitisation of the ING activity by means of the utilisation of wholly inactivated or lysed microbial genomes that only contain slightly-activated genes involved in the suppression of such IFN production. A harmonic cooperation of the two approaches may bring the highest rate of efficacy amongst all candidates.
Furthermore, a substance named protollin was tested through a first-phase, double-blind clinical trial, for the delay in the onset of the Alzheimer’s Disease neurodegenerative disease, which is known to be caused by aggregates of beta-amyloid misfolded protein in the encephalon. Namely, the substance was administered in a small dosage nasally, and the results were promising, as helper CD4+ and cytotoxic CD8+ T-Lymphocytes were recruited to the encephalon, where they activated microglial cells and prevented the lysis and deposition of astrocytes, helping in the process of destroying both soluble and insoluble beta-amyloid aggregates before the clinical onset of the neurodegenerative disease, which often occurs in relatively elderly patients (Frenkel D. et al., 2005). Furthermore, collected clinical evidence even suggested low-dose protollin plays a role as an immunising agent against Alzheimer’s Disease, as it brought prophylactic immune effects against beta-amyloid aggregate-induced neuronal pathogenesis (Frenkel D. et al., 2008). Likewise, protollin may be used for a delay in the onset of other protein-related degenerative diseases, including Retinitis Pigmentosa, which is caused by the pathogenic action of Rhodopsin aggregates against the rod photoreceptors upon the retinal cells, essentially damaging and destroying an increasing count of cells with rods and eventually causing a complete loss of vision, oftentimes by the time the patients reach an age of 40-45. Such an application may be feasible due to the proximity of the optical system to the central nervous system, though simultaneously, medical and clinical caution is specifically required in such a case due to the fact that the retina is sensitive to any slight extent of immune activation. Moreover, a low dose of protollin may even be used alongside a low dose of Type I and Type III Interferon glycoproteins to treat both natural and adaptive lymphocytes to optimise their immune function and integrity. 
In the case of HIV-induced AIDS, treating adaptive lymphocytes with protollin as well may further sharpen their defence mechanisms against the retrovirus and improve their immune efficacy, potentially bringing an utmost “evolutionary punch of self-defence” against microbial agents like HIV. Likewise, protollin, Type I IFNs and Type III IFNs, if used in low concentrations, may both individually and together constitute whole immunising agents against various diseases that involve various extents of immune activation (Carp T., 2024). Moreover, several plant-based proteins play various roles in sensitising the activation of Type I and Type III Interferon-encoding genes (INGs) and of Interferon-Stimulated Genes (ISGs), and principles of Translational Medicine and Pharmacology may likewise be used to develop small drug-like molecules that sensitise such elements of the interferon system as far as research allows, whilst preventing adverse events as far as possible. The plants that synthesise such interferon-stimulating proteins include: Silybum marianum, Astragalus membranaceus, Schisandra chinensis, Agaricus blazei, Ganoderma lucidum, Morinda citrifolia, Aloe vera and Foeniculum vulgare (Dacia Plant, 2021). Overall, clinical researchers may make an effort to combine various Type I and Type III Interferon-stimulating proteins to create an overall, proportional “evolutionary punch of self-defence” that will support the human immune system in once again dominating the microbial mechanisms of evading first- and second-line immunity.
Initially, the UniProt database was used to collect the amino acid sequences of the viral proteins concerned, before the Clustal Omega program was used to perform multiple sequence alignment involving the proteins listed above. These programs were used because the research involved whole protein sequences, without requiring a BLAST- and FASTA-based search to detect the protein source of specific amino acid sequences (Pearson et al., 2014). Derivatives of MEGABLAST and G-BLASTN were also used to increase the speed and accuracy of such AI-mediated procedures (Zhao, K. and Chu, X., 2014). Alongside BLAST and FASTA, which are basic-level tools for nucleotide and protein sequence comparison, similarities may also be researched and predicted using the Structured Query Language (SQL), which constitutes a back-end computerised approach of building operations to determine whether there is statistical significance that specific protein sequences are homologous. Such an approach may be required when analysing and predicting the occurrence of specific SNPs in pathogenic microbes of public health concern, because the highly complex spectrum of factors that lead to such SNP occurrences calls for further research catalysis (Pearson et al., 2017). A large degree of homology has been observed in all protein variants listed above, as the percentage of identity exceeded 30% and the e-value was smaller than 0.001, indicating the substantial nature of the hit. When the e-value is smaller than 0.001, the p-value under the null hypothesis that there is no association between the proteins is smaller than 0.05, since for small e-values the p-value is approximately equal to the e-value, meaning that the null hypothesis ought to be rejected. In multiple sequence alignment, it is important to search for three key amino acids: Serine (Ser/S), Histidine (His/H) and Aspartic Acid (Asp/D).
Serine is known to have a polar, uncharged R group containing a hydroxyl group, Histidine is known to have a polar R group containing an imidazole ring that can carry a positive charge, and Aspartic Acid contains a polar, negatively-charged R group. These three key amino acids are found in sequences of 3-10 amino acids that constitute the identical regions of the amino acid sequences (conserved motifs), and they make up the major active site residues that facilitate the hydrolytic chemical reaction of the peptide bond. In other words, genetic changes play a foundational role in the determination of the biochemical configuration of translated and folded proteins.
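The homology criteria and the search for conserved Serine, Histidine and Aspartic Acid residues described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the aligned sequences are hypothetical toy strings rather than real viral proteins, the function names are assumptions introduced for illustration, and the identity calculation simply ignores gapped columns. The conversion from e-value to p-value uses the standard relation P = 1 - exp(-E).

```python
import math

# Hypothetical, pre-aligned toy sequences (gaps shown as "-").
# These are illustrative strings, not real viral protein sequences.
ALIGNED = {
    "variant_a": "MKT-GDSGGPLHA",
    "variant_b": "MKTAGDSGGPLHA",
    "variant_c": "MRT-GDSGGALHA",
}

# The three key residues discussed in the text: Ser, His, Asp.
KEY_RESIDUES = {"S", "H", "D"}


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def evalue_to_pvalue(e_value: float) -> float:
    """Convert an e-value to a p-value with P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


def conserved_key_columns(aligned: dict) -> list:
    """Return (column index, residue) for columns fully conserved as S, H or D."""
    columns = []
    for index, column in enumerate(zip(*aligned.values())):
        if len(set(column)) == 1 and column[0] in KEY_RESIDUES:
            columns.append((index, column[0]))
    return columns


if __name__ == "__main__":
    names = list(ALIGNED)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            identity = percent_identity(ALIGNED[first], ALIGNED[second])
            print(first, second, round(identity, 1), identity > 30.0)
    print(evalue_to_pvalue(0.001))
    print(conserved_key_columns(ALIGNED))
```

With real data, the aligned sequences would come from the Clustal Omega output, and the column indices returned by the last function would point to candidate conserved motif positions for closer inspection.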
With regard to statistical projections, computer software like SPSS could be used to collect all the necessary calculated statistical data in order to build finalised patterns of mathematical prediction with an above-threshold level of reliability. Using software like Microsoft Excel and SPSS, statistical tests that include Chi-Squared tests, as well as paired and unpaired T-tests, are performed, with scientists often able to assess relationships of correlation and potential causality without significant difficulty or expenditure of time. Through the statistical analysis of past evolutionary viral processes and their single nucleotide polymorphisms (SNPs), scientists could build predictive algorithms for future viral mutations and estimate the statistical probability that certain mutations will cause above-threshold levels of gain-of-function in viral proteins that are not directly causative of immunopathology, but of transient cellular immune silencing. There are various environmental, microbiological and proteomic factors that influence the morbidity and case-fatality ratios of epidemic and pandemic diseases, and the core underlying influence lies within the boundaries of evolutionary biology.
One example in which computerised systems may be updated to involve artificial intelligence (AI) models would be an update of the integrated engine of the Laboratory Information Management System (LIMS), which stores genetic data from hospital patients and comprises a data pool for processed Real-Time Quantitative Polymerase Chain Reaction (RT-qPCR) test results, through the inclusion of a franklin.ai model, which was created by the Sonic Healthcare UK organisation and designed to automate the entry and processing of genetic data from patients suspected of cancer. Likewise, franklin.ai increases the storage of the collected data by increasing the frequency of data processing per day (Smith S. C. et al., 2024). Such an AI model, or similar ones, may also be included in other software systems that contain data collected from genetic sequencing and the assembly of numerous phylogenetic trees, in order for the computerised process of biological and statistical research to be further catalysed. Such AI models could also be applied to pools of statistical data to help the speed of statistical projections continue increasing. Namely, AI models like franklin.ai could be installed into the integrated engine of biological, genomic and statistical systems, which normally contains multiple kinds of data formats and where exchange regularly takes place. The overall objective would be to catalyse such data exchange and processing in order to increase the storage of such systems, though the model is probably more suitable for catalysing genetic data entry. Nonetheless, such an AI-related update may represent a significant example of how manual processes of data entry can be automated, helping computerised research processes become faster and more accurate. Through such processes, it may be more feasible for bioinformaticians and biostatisticians to determine the statistical probabilities that mutations in specific regions of microbial genomes will occur and that novel infectious disease outbreaks will occur as a result.
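As an illustration of the kind of Chi-Squared test mentioned above, the following minimal Python sketch tests whether the presence of a given SNP is associated with a clinical outcome in a 2x2 table. It is only a sketch under stated assumptions: the counts are invented for illustration, the function name is hypothetical, no continuity correction is applied, and a significant result indicates association rather than causality. The p-value uses the fact that, for one degree of freedom, the upper-tail probability of the chi-squared distribution equals erfc(sqrt(chi2 / 2)).

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-squared test (1 degree of freedom, no continuity correction).

    Table layout (hypothetical example):
                    severe   mild
      SNP present     a        b
      SNP absent      c        d
    Returns (chi-squared statistic, p-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Invented counts, used only to show how the function is called.
    statistic, p = chi_squared_2x2(30, 10, 20, 40)
    print(statistic, p, p < 0.05)
```

In practice, the same test is available in SPSS and in common statistical libraries; the standard-library version is shown here only so that the calculation is visible.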
The scope of the present study is to investigate the extent to which the transient capabilities of microbes to hijack main innate immune pathways influence the subsequent rates of morbidity and lethality caused by the induced infectious disease. To effectively perform such a study, we will investigate the relationship between single-nucleotide polymorphisms in genes encoding various viral proteins and their consequent adaptation to novel immune memory. Specifically, we will observe the relationship between such viral mutations and changes in the biochemical profile of the secondary and tertiary structures of such proteins, given the existence of various biochemical versions of amino acid side chains, which play a critical role in determining the protein’s stability, functionality and overall state of activity. It may currently be deemed statistically probable that the more functional and active such viral proteins are, the more capable the overall viral genome is of evading critical elements of the host immune system. Afterward, it would be necessary to build phylogenetic trees of such viral proteins to observe their evolutionary history over a large, inter-era extent of time. It would be possible to do so using the available tools and databases for pairwise and multiple sequence alignment. Once the output results are produced, the researchers involved may then observe the amino acid sequence of each aligned protein, as well as the degree of homology between each protein variant. Then, the secondary structure of each aligned protein can be analysed in detail using various available software.
There are multiple features within such proteins that require research attention, including the existence of Serine, Histidine or Aspartate in the conserved motifs, the existence or absence of polarity, the existence or absence of an electric charge, and the extent of repetition of amino acids whose R groups have the same electric charge or extent of polarity. Likewise, the biochemical profile of the aligned proteins is directly influenced by their genetic makeup, which in turn is directly influenced by SNPs. Throughout such analysis, researchers may develop statistical programmes to estimate the probabilities that future mutations in specific microbial agents will lead to unique zoonotic processes or even cause the selection of a previous variant into one causative of major epidemic or pandemic illnesses. The overall process can be repeated for each species of the viral proteins implicated, and for each individual virus, separate statistical projections may be made regarding any occurrence of future mutations leading to the development of new public health-related disease outbreaks. One important example of a factor leading to such consequences would be the frequency of SNPs in genetically active and influential areas of each viral genome. Furthermore, there may be a wide spectrum of environmental factors that need to be taken into consideration when projecting the occurrence of specific SNP events in microbial agents. Whilst it is important to involve a broad methodology that is based on Artificial Intelligence, it is equally important to continue basing scientific and statistical research on human intellect and wisdom, given that it is crucial to continue thinking outside of the box, and implicitly, outside of the algorithms that are based on an already-enclosed box of past analyses, in order for better scientific discovery to continue arising.
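The frequency of SNPs in specific areas of a viral genome, mentioned above as one input to such projections, could be tallied along the following lines. This is a minimal Python sketch under stated assumptions: the sequences are short, invented, pre-aligned nucleotide strings, the region names and coordinates are hypothetical, and only substitutions are counted, with gapped positions skipped.

```python
def substitution_positions(reference: str, variant: str) -> list:
    """Return 0-based positions where the variant differs from the reference by a substitution."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base and ref_base != "-" and var_base != "-"
    ]


def snp_frequency_by_region(reference: str, variants: list, regions: dict) -> dict:
    """Average number of substitutions per site and per variant in each region.

    `regions` maps a region name to a half-open (start, end) interval, 0-based.
    """
    frequencies = {}
    for name, (start, end) in regions.items():
        count = 0
        for variant in variants:
            positions = substitution_positions(reference, variant)
            count += sum(1 for p in positions if start <= p < end)
        frequencies[name] = count / ((end - start) * len(variants))
    return frequencies


if __name__ == "__main__":
    # Invented toy data, used only to show how the functions are called.
    reference = "ATGGCATTCGAACTGA"
    variants = ["ATGACATTCGAACTGA", "ATGACGTTCGAACTGA"]
    regions = {"region_x": (0, 8), "region_y": (8, 16)}
    print(snp_frequency_by_region(reference, variants, regions))
```

Per-region frequencies of this kind, collected across successive variants, are one possible feature that a projection model could take as input, alongside the environmental factors noted above.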
Overall, it is important to apprehend the high level of complexity that such biostatistical research may be displaying, meaning that intellectual efforts may require to be extensive in nature and that research teams may need to be broad in both numbers and diversity of scientific background.