ARTICLE | doi:10.20944/preprints202208.0435.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: genomic DNAs; stochastics; tensor-unitary transformation; quantum informatics; fractal; projection operators; gestalt phenomena; stochastic determinism
Online: 25 August 2022 (11:52:12 CEST)
The article is devoted to algebraic modeling of the universal rules of stochastic organization of genomic DNA of higher and lower organisms, previously published by the author. The proposed algebraic apparatus uses formalisms of quantum mechanics and quantum informatics and is based on so-called tensor-unitary transformations of vectors, which generate families of interrelated stochastic-deterministic vectors of increased dimensions. The features of the vectors' interconnections in these families model the stochastic-deterministic properties of the named phenomenological rules. New approaches are presented to modeling developing multi-parameter biosystems whose number of parameters increases in the course of step-by-step development. In the light of the presented materials, the issues of fractal-like organization in genetically inherited biosystems are considered. The development of the theory of stochastic determinism as an antipode of deterministic chaos is discussed.
ARTICLE | doi:10.20944/preprints202311.0677.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Genomic selection; Cotton; Fiber quality
Online: 10 November 2023 (06:52:53 CET)
Cotton is the most important natural fiber cash crop; it has high economic value and provides an important material foundation for China's construction. With improvements in textile technology and living standards, higher requirements are being placed on raw cotton quality. Traditional cotton breeding methods require typing and selection. With the development of biotechnology and genomics research, genomic selection has been widely used in cotton breeding. Genomic selection is a new breeding method in which selection is carried out by constructing a prediction model from high-density molecular markers covering the whole genome. In this study, the application of various genomic selection models to cotton fiber quality was explored, providing more reliable information for future genetic improvement and thus helping to improve the efficiency of practical cotton breeding.
REVIEW | doi:10.20944/preprints202110.0305.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: Genomic selection; WGBLUP; Medicago sativa
Online: 21 October 2021 (12:35:16 CEST)
Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve by conventional breeding strategies. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa, previous results indicated that low to moderate prediction accuracy values (<70%) were obtained for complex traits such as yield and abiotic stress resistance. There is a need to increase prediction accuracy in order to employ GS in breeding programs. In this paper we review different statistical models and their applications in polyploid crops, including alfalfa. Specifically, we used empirical data on alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and marker-trait association scores from genome-wide association studies (GWAS) based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. This is the first report in alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
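The weighted GBLUP idea in the abstract above can be illustrated with a minimal sketch of a marker-weighted genomic relationship matrix. This is not the authors' pipeline: the function name `weighted_grm`, the VanRaden-style normalization, and the toy dosage matrix are assumptions made for illustration, and the default diploid coding (0/1/2) is a simplification, since alfalfa is autotetraploid and a real analysis would use tetraploid dosages (0 to 4).

```python
def weighted_grm(dosages, weights, ploidy=2):
    """Marker-weighted genomic relationship matrix (illustrative sketch).

    dosages: list of individuals, each a list of allele dosages (0..ploidy).
    weights: one non-negative weight per marker, e.g. a GWAS score or a
             machine-learning importance value (hypothetical inputs).
    Normalization follows a VanRaden-style form, assumed here:
        G = Z W Z' / (ploidy * sum_j w_j * p_j * (1 - p_j))
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency of each marker across individuals.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n) for j in range(m)]
    # Center each dosage by its expected value under that frequency.
    centered = [[row[j] - ploidy * freqs[j] for j in range(m)] for row in dosages]
    scale = ploidy * sum(w * p * (1 - p) for w, p in zip(weights, freqs))
    if scale == 0:
        raise ValueError("all weighted markers are monomorphic")
    return [
        [
            sum(weights[j] * centered[a][j] * centered[b][j] for j in range(m)) / scale
            for b in range(n)
        ]
        for a in range(n)
    ]


# Toy data: three individuals, three markers, diploid coding.
toy_dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
G_equal = weighted_grm(toy_dosages, [1.0, 1.0, 1.0])
G_weighted = weighted_grm(toy_dosages, [3.0, 1.0, 0.5])
```

With equal weights the matrix reduces to the unweighted form; up-weighting a marker increases its influence on the estimated relationships, which is the mechanism that weighted GBLUP relies on.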
REVIEW | doi:10.20944/preprints202109.0063.v1
Subject: Biology And Life Sciences, Cell And Developmental Biology Keywords: p53; genomic stability; cell death
Online: 3 September 2021 (13:13:48 CEST)
P53 is known as the most critical tumor suppressor and is often referred to as the guardian of our genome. More than 40 years after its discovery, we are still struggling to understand all molecular details on how this transcription factor prevents oncogenesis or how to leverage current knowledge about its function to improve cancer treatment. Multiple cues, including DNA-damage or mitotic errors, can lead to the stabilization and nuclear translocation of p53, initiating the expression of multiple target genes. These transcriptional programs may well be cell type and stimulus-specific, as is their outcome that ultimately imposes a barrier to cellular transformation. Cell cycle arrest and cell death are two well-studied consequences of p53 activation, but, while being considered as critical, they do not fully explain the consequences of p53 loss-of-function phenotypes in cancer. Here, we discuss how mitotic errors alert the p53 network and give an overview on multiple ways how p53 can trigger cell death. We argue that a comparative analysis of different types of p53 responses, elicited by different triggers in a time-resolved manner in well-defined model systems is critical to understand cell type specific cell fate induced by p53 upon its activation, in order to resolve the remaining mystery of its tumor suppressive function.
REVIEW | doi:10.20944/preprints202311.0143.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: Plants; Molecular Biology; Genomic; Transcriptomic; Epigenetic
Online: 2 November 2023 (10:11:14 CET)
This review describes, for the first time, the methods used to introduce CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas-mediated genome editing into fruit species, as well as the impacts of applying this technology to activate and knock out target genes in different fruit tree species, including effects on tree development, yield, fruit quality, and tolerance to biotic and abiotic stresses. The application of this gene editing technology could allow the development of new generations of fruit crops with improved traits by targeting different genetic segments, or could even facilitate the introduction of traits into elite cultivars without changing other traits. However, at present, the scarcity of efficient regeneration and transformation protocols in some species, the fact that many of those procedures are genotype-dependent, and the convenience of segregating the transgenic parts of the CRISPR system represent the main handicaps limiting the potential of genetic editing techniques for fruit trees. Finally, the latest news on legislation and regulations concerning the use of plants modified through CRISPR/Cas systems is also discussed.
ARTICLE | doi:10.20944/preprints202310.1289.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Selection Signatures; Genomic Analysis; Muscular Development Genes
Online: 19 October 2023 (16:42:00 CEST)
The goal of our study was to identify signatures of selection in Turopolje pigs and commercial pig breeds. We conducted a comprehensive analysis of five datasets, including one local pig breed (Turopolje) and four commercial pig breeds (Large White, Landrace, Pietrain, and Duroc), using strict quality control measures. Our final dataset consisted of 485 individuals and 54,075 single nucleotide polymorphisms (SNPs). To detect selection signatures within these pig breeds, we utilized the XP-EHH and XP-nSL methodologies, which allowed us to identify candidate genes that have been subject to positive selection. Our analysis consistently highlighted the PTBP2 and DPYD genes as commonly targeted by selection across all studied breeds. Both of these genes are associated with muscular development in pigs and other species. Furthermore, in the Large White breed a number of genes were detected by the two methods, such as ATP1A1, CASQ2, CD2, IGSF3, MAB21L3, NHLH2, SLC22A15, and VANGL1. In the Duroc breed a different set of genes was detected, such as ARSB, BHMT, BHMT2, DMGDH, and JMY. The functions of these genes were related to body weight, production efficiency, meat quality, average daily gain, and other similar traits. Overall, our results have identified a number of genomic regions that are under selective pressure between local and commercial pig breeds. This information can help to improve our understanding of the mechanisms underlying pig breeding, and ultimately contribute to the development of more efficient and sustainable pig production practices. Our study highlights the power of using multiple genomic methodologies to detect genetic signatures of selection, and provides important insights into the genetic diversity of pig breeds.
ARTICLE | doi:10.20944/preprints202309.1253.v1
Subject: Biology And Life Sciences, Neuroscience And Neurology Keywords: toxin; venom; jellyfish; Cnidaria; genomic information; electrophysiology
Online: 19 September 2023 (08:18:20 CEST)
We have identified a new human voltage-gated potassium channel (hKv1.3) blocker, NnK-1, in the jellyfish Nemopilema nomurai, based on its genomic information. The gene sequence encoding NnK-1 contains 5,408 base pairs, with five introns and six exons. The coding sequence of the NnK-1 precursor is 894 nucleotides long and encodes 297 amino acids, containing five presumptive ShK-like peptides. An electrophysiological assay demonstrated that the chemically synthesized fifth peptide, NnK-1, is an effective hKv1.3 blocker. A multiple sequence alignment with cnidarian ShK-like peptides that have Kv1.3-blocking activity revealed that four residues (3Asp, 25Lys, 33Lys, and 34Thr) of NnK-1, together with six cysteine residues, are conserved. Therefore, we hypothesize that these four residues are crucial for the binding of the toxins to voltage-gated potassium channels.
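The relationship between the two lengths reported in the abstract above (894 nucleotides, 297 amino acids) can be checked with a short calculation. This is a sketch under one stated assumption: the reported coding-sequence length includes the terminating stop codon. The function name `residues_from_cds` is hypothetical.

```python
def residues_from_cds(cds_length_nt, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of the given length.

    Assumes the length is a whole number of codons; if includes_stop is True,
    the final codon is a stop codon and encodes no residue.
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_length_nt // 3
    return codons - 1 if includes_stop else codons


# 894 nt = 298 codons; one stop codon leaves 297 residues.
nnk1_precursor_residues = residues_from_cds(894)
```

Under that assumption the two figures are consistent: 894 / 3 = 298 codons, of which 297 encode amino acids.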
COMMUNICATION | doi:10.20944/preprints202308.0338.v1
Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Histoplasmosis; Histoplasma capsulatum; genome; clades; genomic; epidemiology
Online: 4 August 2023 (07:15:15 CEST)
Histoplasmosis is one of the most underdiagnosed and underreported endemic mycoses in the United States. Histoplasma capsulatum is the causative agent of this disease. To date, molecular epidemiologic studies detailing the phylogeographic structure of H. capsulatum in the United States have been limited. We conducted genomic sequencing using isolates from histoplasmosis cases reported in the United States. We identified North American Clade 2 (NAm2) as the most prevalent clade in the country. Despite high intra-clade diversity, isolates from Minnesota and Michigan cases predominately clustered by state. Future work incorporating environmental sampling and veterinary surveillance may further elucidate the molecular epidemiology of H. capsulatum in the United States and how genomic sequencing can be applied for surveillance and outbreak investigation of histoplasmosis.
REVIEW | doi:10.20944/preprints202307.0999.v2
Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: prostate cancer; metastasis; tissue-based genomic biomarker
Online: 18 July 2023 (07:17:24 CEST)
Background: The incidence of prostate cancer (PC) has risen annually. Although most diagnoses are of non-metastatic PC, mortality is explained by metastatic disease (mPC). There is undoubtedly an intermediate scenario in which patients have no mPC but have initiated a metastatic cascade through an epithelial-mesenchymal transition. There is a need for more and better tools to predict which patients will progress to non-localized clinical disease or already have micrometastatic disease and will therefore progress clinically after primary treatment. Biomarkers for predicting mPC are still under development; there are few studies and not much evidence of their usefulness. Summary: This review focuses on tissue-based genomic biomarkers (TBGB) for predicting metastatic disease. We developed four main research questions that we attempt to answer according to the current evidence. Why is it important to predict metastatic disease? Which tests are available to predict metastatic disease? What impact should there be on clinical guidelines and clinical practice in predicting metastatic disease? What are current prostate cancer treatments? Key Messages: Knowing which predictive tools are useful could help determine which patients may need multimodal or adjuvant treatment even with localized disease and, consequently, which patients do not need more than a single modality of treatment. Predicting metastasis is fundamental, given that once metastasis is diagnosed, quality of life (QoL) and survival drop dramatically.
REVIEW | doi:10.20944/preprints202306.1644.v1
Subject: Biology And Life Sciences, Aquatic Science Keywords: Aphanius fasciatus; Transitional waters; lagoon; biomarker; genomic
Online: 22 June 2023 (16:48:45 CEST)
Transitional waters are fragile ecosystems of high ecological value, providing breeding and resting sites for rare and threatened species. They deserve particular protection because they face numerous threats of anthropogenic origin. The present review aims to analyze the recent literature on Aphanius fasciatus, now considered one of the most strictly estuarine-dependent fish species and as such affected by the degradation of lagoon habitats, and to discuss its suitability as a sentinel species for the quality of transitional water environments. The analysis and discussion highlight the potential applicability of the molecular, cellular, and physiological responses of this species as diagnostic tools for detecting the subtle effects induced by environmental pollution on the biota in transitional water environments. Moreover, the suitability of the responses of this species is suggested within the wider framework of the One Health perspective, which considers human health, animal health, and the state of the environment to be highly interconnected. To date, omics technologies show great potential for acquiring novel knowledge on the responses of organisms to environmental changes and to alterations of environmental health status. Therefore, considering the potential of this species as a sentinel species, considerable effort is needed in the near future to improve the quantity and quality of the omics tools available for A. fasciatus.
REVIEW | doi:10.20944/preprints202306.0948.v1
Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: Haemophilus influenzae; Serotype; Genomic; Invasive disease; Bacteria
Online: 13 June 2023 (14:52:36 CEST)
Haemophilus influenzae is one of the causative agents of invasive bacterial disease, affecting both children and adults. It is a pleomorphic gram-negative coccobacillus and a common commensal of the upper respiratory tract. It is a human-only pathogen that can cause severe invasive diseases. These bacterial infections can range from mild, such as ear infections, to severe, such as bloodstream infections. The infections typically affect children younger than 5 years and adults older than 65 years. They also affect people who are immunocompromised, such as those with certain medical conditions. The highest incidence rates of invasive Haemophilus influenzae disease have recently been reported in various regions, including North America, Canada, and parts of Europe. In order to monitor the evolving nature of invasive Haemophilus influenzae disease, critically reviewed data are required to capture the true status of the invasiveness of Haemophilus influenzae diseases. Developing new vaccines against Haemophilus influenzae is a potential solution to protect vulnerable populations against invasive disease due to this bacterial species. This review article thoroughly examines recent biomedical developments, innovations, findings, and publications, as well as current areas of scientific interest and gaps, including pathogenicity, diagnosis, multidrug resistance, molecular characterization and genetic evolution, epidemiology, and immunological characteristics of Haemophilus influenzae. It also covers specific current issues affecting the research and development of vaccines against Haemophilus influenzae non-serotype b diseases and provides insight into how these problems may be overcome.
ARTICLE | doi:10.20944/preprints202306.0746.v1
Subject: Biology And Life Sciences, Virology Keywords: HIV-1 subtype D; Phylodynamics; Genomic surveillance
Online: 12 June 2023 (03:33:40 CEST)
(1) Background: HIV subtype D is generally associated with a faster decline in CD4+ T cell counts, a higher viral load, and a faster progression to AIDS. However, it is still poorly characterized in Brazil. In this study, we used genomic and epidemiological data to investigate the transmission dynamics of HIV subtype D in the state of Bahia, Northeast Brazil. (2) Methods: To achieve this goal, we obtained four novel HIV-1 subtype D partial pol genome sequences using the Sanger method. To understand the emergence of this novel subtype in the state of Bahia, we used phylodynamic analysis on a dataset comprising 3,704 pol genome sequences downloaded from the Los Alamos database. (3) Results: Our analysis revealed three branching patterns, indicating multiple introductions of HIV-1 subtype D in Brazil from the late 1980s to the late 2000s and a single introduction event in the state of Bahia. Our data further suggest that these introductions most likely originated from European, Eastern African, Western African, and Southern African countries. (4) Conclusion: Understanding the distribution of HIV-1 viral strains and their temporal dynamics is crucial for monitoring the real-time evolution of circulating subtypes and recombinant forms, as well as for designing novel diagnostic and vaccination strategies. We advocate for a shift to active surveillance, to ensure adequate preparedness for future epidemics mediated by emerging viral strains.
BRIEF REPORT | doi:10.20944/preprints202303.0532.v1
Subject: Medicine And Pharmacology, Pharmacology And Toxicology Keywords: Carbapenem, Antimicrobial Resistance, Klebsiella pneumoniae, genomic context
Online: 30 March 2023 (13:02:47 CEST)
Carbapenems are considered for treating Klebsiella pneumoniae and other Enterobacteriaceae infections, especially if they are not susceptible to other generally prescribed antibiotics, i.e., if they show resistance. In such cases, antibiotic activity decreases, and most patients succumb to the infection. A better understanding of the disease pattern and resistance mechanisms could be gained by closely examining the genes that confer resistance to antibiotics. Therefore, studying the genes that confer resistance to carbapenems, and to any other antibiotics for that matter, is indispensable for developing improved treatment options. This study included analyses of co-resistance patterns between resistance genes (between drug classes and within the carbapenem resistance genes), genomic context analysis of highly expressed carbapenem resistance genes, a phylogenetic study of OXA-producing genes, plasmid incompatibility identification, and sequence type identification using MLST. The presence of ESBLs, MBLs, and SBLs across the downloaded genomes was studied. SHV-producing genes were found to co-occur with most of the resistance genes belonging to different drug classes. The plasmid incompatibility type IncFIB was found to be common among the highly expressed genes, and most of these genes were flanked by different families of insertion sequence (IS) elements. The MLST study suggested that sequence types ST-11, ST-14, and ST-147 were common in the downloaded set of genomes.
REVIEW | doi:10.20944/preprints202301.0197.v1
Subject: Social Sciences, Behavior Sciences Keywords: Biomarkers; Genetic; Suicidal behavior; Suicide; Mexican; Genomic.
Online: 11 January 2023 (10:34:57 CET)
Suicide is defined as the action of harming oneself with the intention of dying. It is estimated that one suicide occurs worldwide every 40 seconds, making it a major health problem. Family studies have suggested that suicide has a genetic component, and studies around the world have searched for genetic variants associated with suicidal behavior; these variants could be useful as potential biomarkers to identify people at risk of suicide. In Mexico, some studies of variants in genes related to neurotransmission and other important pathways have been carried out, and a possible association has been suggested for variants located in the following genes: SLC6A4, SAT-1, TPH-2, ANKK1, GSHR, SCARA50, RGS10, STK33, COMT, and FKBP5. This systematic review presents the genetic studies conducted in the Mexican population. It contributes by compiling the existing information on genetic variants and genes associated with suicidal behavior; in the future, these variants could be used as potential biomarkers to identify people at risk of suicide.
ARTICLE | doi:10.20944/preprints202209.0283.v1
Subject: Medicine And Pharmacology, Veterinary Medicine Keywords: Wolinella; Virulence genes; Helicobacter pylori; genomic homology
Online: 20 September 2022 (02:06:43 CEST)
Wolinella spp. and Helicobacter spp. have been repeatedly reported in the oral cavity of dogs and are associated with periodontal disease. Wolinella strains predominate in the oral cavity of dogs. The only known species of this genus, Wolinella succinogenes, was considered non-pathogenic until sequence analysis of its genome revealed homologous genes resembling virulence factors in Helicobacter pylori. This has led researchers to question the non-pathogenic status of W. succinogenes. The cagA and babA genes are examples of crucial virulence factors in H. pylori pathogenesis; thus, the present study evaluated the prevalence of these genera and assessed the Wolinella strain genome for the presence of these virulence factors. Multiple specific PCR tests were performed on oral secretion samples collected from 62 dogs by sterile cytobrush to evaluate the genera, species, and presence of virulence genes. The species-specific 16S rRNA genes from the Helicobacter and Wolinella genera were detected in 58.06% and 83.87% of the oral samples, respectively. H. pylori was not detected in the specimens. No cagA or babA genes were detected in the Wolinella spp. or non-pylori Helicobacter genomes. Our results confirmed that Wolinella spp. is the predominant population compared to Helicobacter in the oral cavity of dogs. Apparently, the incidence of Helicobacter infections is generally associated with non-pylori Helicobacter organisms. Despite the hypothesis of genomic homology between W. succinogenes and H. pylori, cagA and babA virulence genes were not identified in any of the oral samples from the dogs.
REVIEW | doi:10.20944/preprints202204.0228.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: vegetables; high throughput phenotyping; genomic assisted breeding
Online: 26 April 2022 (06:00:45 CEST)
Conventional phenotype-based breeding approaches for vegetable crops such as solanaceous, bulb, and root crops have made a significant contribution by developing many varieties. Despite this, conventional phenotyping approaches are not sufficient because of the long time taken to develop a variety, low genetic gain, environmental factors, and other externalities that affect phenotype-based selection. To address the challenges of conventional phenotyping, the recent method of high-throughput phenotyping (HTP) is considered a promising tool. The development of high-throughput phenotyping technology began in the preceding decade with advancements in sensors, computer vision, automation, and machine learning. HTP platforms are being utilized to undertake non-destructive assessments of the complete plant system in a range of crops. HTP provides precise measurements and supports the collection of high-quality, accurate data, which is necessary for standardizing phenotyping for genetic dissection and genomic-assisted breeding approaches such as genome-wide association studies (GWAS), linkage mapping, marker-assisted selection (MAS), and genomic selection (GS). The remainder of this chapter discusses how high-throughput phenotyping technologies can be used in genomic-assisted breeding for vegetable crops.
REVIEW | doi:10.20944/preprints202110.0407.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Cancer; Telomerase; hTERT; Telomere; Therapeutics; Genomic Integrity
Online: 27 October 2021 (12:39:22 CEST)
Telomerase is an enzyme responsible for the maintenance and stability of telomeres. It also maintains genomic integrity and chromosomal stability. The progressive shortening of telomeres may cause chromosomal instability and alterations in telomerase. It may cause telomere attrition, which can lead to oncogenesis in humans. Cancer is a disease induced by genetic alterations. Genetic mutation within hTERT is a common scenario, generally found in more than 90 percent of cancers. In cancer, telomere length and telomerase activity are very important for the proliferation of cancer cells and the survival of tumors. Cancer cells use several regulatory pathways to increase telomerase activity. Several approaches have been developed to inhibit telomerase activity in cancer cells, but they have shown many adverse effects. Research on AAV-mediated telomerase gene therapy has demonstrated promising outcomes in animal trials. Thus, it has the potential to make a significant contribution to telomerase-based cancer therapeutics. In this review article we analyze studies related to telomerase gene therapeutics for curing cancer. We also summarize telomerase function and its mechanism of action in causing cancer. Moreover, other current developments in the clinical advances of telomerase inhibition in cancer are described.
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: human metapneumovirus; whole genome sequencing; genomic epidemiology
Online: 3 February 2021 (10:08:44 CET)
Human metapneumovirus (HMPV) is an important cause of upper and lower respiratory tract disease in individuals of all ages. It is estimated that most individuals will be infected by HMPV by the age of 5 years. Despite this burden of disease, there remain gaps in our knowledge of the virus's global genetic diversity due to a lack of HMPV sequencing, particularly at whole genome scale. The purpose of this study was to create a simple and robust approach for HMPV whole genome sequencing to be used for genomic epidemiological studies. To design our assay, all available HMPV full-length genome sequences were downloaded from the NCBI GenBank database and used to design four primer sets to amplify long, overlapping amplicons spanning the viral genome and, importantly, specific to all known HMPV subtypes. These amplicons were then pooled and sequenced on an Illumina iSeq; however, the approach is suitable for other common NGS platforms. We demonstrate the utility of this method using a representative subset of clinical samples and examine these sequences using a phylogenetic approach. Here we present an amplicon-based method for the whole genome sequencing of HMPV from clinical extracts that can be used to better inform genomic studies of HMPV epidemiology and evolution.
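The layout of long, overlapping amplicons described in the abstract above can be sketched as a simple tiling calculation. This is not the authors' primer design: the function name `tile_amplicons`, the 13,300 nt genome length, and the 300 nt overlap are placeholder assumptions, and real primer placement depends on conserved regions across subtypes rather than even spacing.

```python
def tile_amplicons(genome_length, n_amplicons, overlap):
    """Evenly tile a genome with overlapping amplicons.

    Returns 0-based, half-open (start, end) coordinates. Consecutive
    amplicons share `overlap` bases, and the last amplicon is clamped
    to the end of the genome.
    """
    # Total bases to cover once each shared region is counted twice.
    total = genome_length + (n_amplicons - 1) * overlap
    length = -(-total // n_amplicons)  # ceiling division
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap is too large for this amplicon count")
    amplicons = []
    for i in range(n_amplicons):
        start = i * step
        end = min(start + length, genome_length)
        amplicons.append((start, end))
    return amplicons


# Placeholder values: four amplicons over a 13,300 nt genome, 300 nt overlap.
example_layout = tile_amplicons(13300, 4, 300)
```

Even spacing is the simplest possible design choice; it only shows how four amplicons of about 3.5 kb each could span a genome of this size with shared ends for assembly.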
ARTICLE | doi:10.20944/preprints202005.0343.v1
Subject: Biology And Life Sciences, Virology Keywords: GALAXY; Assembly; Annotation; Genomic Variants Discovery; Workflow
Online: 21 May 2020 (09:57:42 CEST)
Citizen science has emerged as a way to perform analytics on the SARS-CoV-2 genome. Public GALAXY servers provide an automated platform for genomics analysis. This study includes the design of GALAXY workflows for RNA-seq assembly and annotation as well as genomic variant discovery, and performs analysis across four samples from SARS-CoV-2-infected humans obtained from the local population of Wuhan, China. It provides information about transcriptomics and genomic variants across the SARS-CoV-2 genome. The study can be extended to perform evolutionary and comparative studies across coronavirus species. Augmented and integrated study with cheminformatics and immunoinformatics will be a way forward for drug discovery and vaccine development.
ARTICLE | doi:10.20944/preprints202311.1064.v1
Subject: Public Health And Healthcare, Public Health And Health Services Keywords: newborn screening; bioethics; genomic sequencing; qualitative; public views
Online: 16 November 2023 (10:29:57 CET)
Recent dramatic reductions in the timeframe in which genomic sequencing can deliver results mean that its application in time-sensitive screening programs such as newborn screening (NBS) is becoming a reality. As genomic NBS (gNBS) programs are developed around the world, there is an increasing need to address the ethical and social issues that such initiatives raise. This study therefore aimed to explore the Australian public’s perspectives and values regarding key gNBS characteristics and preferences for service delivery. We recruited English-speaking members of the Australian public over 18 years of age via social media; 75 people aged 23-72 participated in one of 15 focus groups. Participants were generally supportive of introducing genomic sequencing into newborn screening, with several stating that adoption of such revolutionary and beneficial technology was a moral obligation. Participants consistently highlighted receiving an early diagnosis as the leading benefit, which was frequently linked to the potential for early treatment and intervention, or access to other forms of assistance, such as peer support. Informing parents about the test during pregnancy was considered important. This study provides insights into the Australian public’s views and preferences to inform the delivery of a gNBS program in the Australian context.
ARTICLE | doi:10.20944/preprints202309.1376.v1
Subject: Biology And Life Sciences, Virology Keywords: DENV; Genomic surveillance; Epidemiology; molecular clock; SNPs; arboviruses
Online: 20 September 2023 (10:42:22 CEST)
The dengue virus (DENV) is an arbovirus belonging to the Flaviviridae family; the species comprises four antigenically distinct serotypes (DENV-1, DENV-2, DENV-3, and DENV-4), which are further subdivided into genotypes. The virus is transmitted to humans through the bites of mosquitoes, primarily of the genus Aedes. Dengue is endemic in various parts of the world, including tropical and subtropical regions of Asia, Latin America, Africa, and Oceania. In Brazil, the state of Tocantins, located in the north-central part of the country, has experienced a significant number of arboviral disease cases, particularly dengue. This study aimed to monitor DENV circulation within the state by conducting full genome sequencing of viral genomes recovered from 61 patients between June 2021 and July 2022. During this period, both DENV-1 and DENV-2 serotypes were identified. Our findings confirm the circulation of DENV serotypes 1 and 2 in Tocantins, affecting males and females equally, with younger age groups (4 to 43 years old) being the most susceptible. Phylogenetic analysis revealed that the circulating viruses belong to DENV-1 genotype V American and DENV-2 genotype III Southeast Asian/American. Bayesian analysis showed that the DENV-1 genotype V genomes sequenced here are closely related to genomes previously sequenced in the state of São Paulo. The DENV-2 genotype III genomes clustered in a distinct, well-supported subclade, along with previously reported isolates from the states of Goiás and São Paulo. In both cases, our results suggest that multiple introductions of these genotypes occurred in the state of Tocantins. This observation highlights the significant impact of major population centers in Brazil on virus dispersion, including to other Latin American countries and the USA. In the SNP analysis, DENV-1 displayed 122 distinct missense mutations, while DENV-2 had 44, with significant mutations predominantly occurring in the envelope and NS5 proteins.
The analyses performed here reveal the concomitant circulation of distinct DENV-1 and -2 genotypes in some Brazilian states, underscoring the dynamic evolution of the DENV and the ongoing significance of surveillance efforts in supporting public health policies.
REVIEW | doi:10.20944/preprints202306.0108.v1
Subject: Medicine And Pharmacology, Psychiatry And Mental Health Keywords: suicide; neuroinflammation; cytokine; biomarker; genomic marker; precision psychiatry
Online: 1 June 2023 (15:45:30 CEST)
The fight against suicide is highly challenging as it may be one of the most complex and at the same time most threatening among all psychiatric phenomena. In spite of its huge impact, and despite advances in neurobiology research, understanding and predicting suicide remains a major challenge for both researchers and clinicians. To be able to identify those patients who are likely to engage in suicidal behaviors and identify suicide risk in a reliable and timely manner, we need more specific, novel biological and genetic markers/indicators to develop better screening and diagnostic methods, and in the next step to utilize these molecules as intervention targets. One such potential novel approach is offered by our increasing understanding of the involvement of neuroinflammation based on multiple observations of increased proinflammatory states underlying various psychiatric disorders including suicidal behavior. The present paper overviews our existing understanding of the association between suicide and inflammation including peripheral and central biomarkers, genetic and genomic markers, and our current knowledge of intervention in suicide risk using treatments influencing inflammation, also overviewing the next steps to be taken and obstacles to be overcome before we can utilize cytokines in the treatment of suicidal behavior.
COMMUNICATION | doi:10.20944/preprints202304.0343.v1
Subject: Biology And Life Sciences, Virology Keywords: Zika; arboviruses; vector-borne infections; genomic surveillance; phylogenetics
Online: 14 April 2023 (03:51:13 CEST)
The Americas, particularly Brazil, were greatly impacted by the widespread outbreak of Zika virus (ZIKV) in 2015 and 2016. Efforts were made to implement genomic surveillance of ZIKV as part of the public health responses. The accuracy of spatiotemporal reconstructions of the epidemic spread relies on the unbiased sampling of the transmission process. In the early stages of the outbreak, we recruited patients exhibiting clinical symptoms of arbovirus-like infection from Salvador and Campo Formoso, Bahia, in Northeast Brazil. Between May 2015 and June 2016, we identified 21 cases of acute ZIKV infection and subsequently recovered 14 near full-length sequences using the amplicon tiling multiplex approach with nanopore sequencing. We performed a time-calibrated discrete phylogeographic analysis to trace the spread and migration history of ZIKV. Our phylogenetic analysis supports a consistent relationship between ZIKV migration from Northeast to Southeast Brazil and its subsequent dissemination beyond Brazil. Additionally, our analysis provides insights into the migration of ZIKV from Brazil to Haiti and the role Brazil played in the spread of ZIKV to other countries, such as Singapore, the USA and the Dominican Republic. The data generated by this study enhance our understanding of ZIKV dynamics and support the existing knowledge, which can aid in future surveillance efforts against the virus.
ARTICLE | doi:10.20944/preprints202206.0335.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: metadata; contextual data; harmonization; genomic surveillance; data management
Online: 24 June 2022 (08:46:04 CEST)
ARTICLE | doi:10.20944/preprints202111.0326.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: SNP; calpain-calpastatin system genes; genomic association; tenderization; ageing
Online: 18 November 2021 (13:48:09 CET)
The most important factor that determines beef tenderness is its proteolytic activity, and the balance between calpain-1 protease activity and calpastatin inhibition is especially important, while contributions could arise from calpain-2 and possibly calpain-3. These processes are, however, affected by the meat aging process itself. To determine whether genotypes in the calpain-calpastatin system can enhance tenderness throughout a 20-day aging period, South African purebred beef bulls (n=166) were genotyped using the Illumina BovineHD SNP BeadChip, through gene-based association analysis targeting the cast, capn3, capn2 and capn1 genes. The Warner-Bratzler shear force (WBSF) and myofibril fragment length (MFL) of Longissimus thoracis et lumborum (LTL) steaks were evaluated between d 3 and d 20 of aging, with protease enzyme activity in the first 20 h postmortem. Although several of the 134 SNP were associated with tenderness, only seven SNP in the cast, capn2 and capn1 genes sustained genetic associations, additive to aging-associated increases in tenderness, for at least three of the four aging periods. While most genomic associations were relatively stable over time, some genotypes within SNP responded differently to aging, resulting in altered genomic effects over time. The level of aging at which genomic associations are performed is an important factor that determines whether SNP affect tenderness phenotypes.
REVIEW | doi:10.20944/preprints202010.0460.v2
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: plant breeding; genomic selection; Bayes; BLUP; machine learning
Online: 18 November 2020 (11:21:50 CET)
Estimation of breeding values through Best Linear Unbiased Prediction (BLUP) using pedigree-based kinship and Marker-Assisted Selection (MAS) are the two fundamental breeding methods used before and after the introduction of genetic markers, respectively. The emergence of high-density genome-wide markers has led to the development of two parallel series of approaches inspired by BLUP and MAS, which are collectively referred to as Genomic Selection (GS). The first series of GS methods alters pedigree-based BLUP by replacing pedigree-based kinship with marker-based kinship in a variety of ways, including weighting markers by their effects in genome-wide association study (GWAS), joining both pedigree and marker-based kinship together in a single-step BLUP, and substituting individuals with groups in a compressed BLUP. The second series of GS methods estimates the effects for all genetic markers simultaneously. For the second series methods, the marker effects are summed together regardless of their individual significance. Instead of fitting individuals as random effects like in the BLUP series, the second series fits markers as random effects. Differing assumptions regarding the underlying distribution of these marker effects have resulted in the development of many Bayesian-based GS methods. This review highlights critical concept developments for both of these series and explores ongoing GS developments in machine learning, multiple trait selection, and adaptation for hybrid breeding. Furthermore, considering the increasing use and variety of GS methods in plant breeding programs, this review addresses important concerns for future GS development and application, such as the use of GWAS-assisted GS, the long-term effectiveness of GS methods, and the valid assessment of prediction accuracy.
ARTICLE | doi:10.20944/preprints201811.0623.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: genomic selection; genomic prediction; genotyping by sequencing; pasmo resistance; pasmo severity; quantitative trait loci; single nucleotide polymorphism; Septoria linicola; flax
Online: 30 November 2018 (08:39:11 CET)
Pasmo (Septoria linicola) is a fungal disease causing major losses in seed yield and quality, and stem fibre quality in flax. Pasmo resistance (PR) is quantitative and has low heritability. To improve PR breeding efficiency, the accuracy of genomic prediction (GP) was evaluated using a diverse worldwide core collection of 370 accessions. Four marker sets, including three defined by 500, 134, and 67 previously identified quantitative trait loci (QTL) and one of 52,347 PR-correlated genome-wide single nucleotide polymorphisms, were used to build ridge regression best linear unbiased prediction (RR-BLUP) models using pasmo severity (PS) data collected from field experiments performed during five consecutive years. With five-fold random cross-validation, GP accuracy as high as 0.92 was obtained from the models using the 500 QTL when the average PS was used as the training dataset. GP accuracy increased with training population size, reaching values >0.9 with training population size greater than 185. Linear regression of the observed PS with the number of positive-effect QTL in accessions provided an alternative GP approach with an accuracy of 0.86. The results demonstrate that GP models based on marker information from all identified QTL and the 5-year PS average are highly effective for PR prediction.
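As a rough illustration of the alternative approach described in the abstract above (regressing observed PS on the number of positive-effect QTL and scoring accuracy by five-fold cross-validation), a minimal sketch might look like the following. It is not the authors' pipeline: the data are synthetic, the direction and size of the QTL effect are assumed, accuracy is taken here to be the Pearson correlation between observed and predicted values, and the names `fit_line`, `pearson`, and `cv_accuracy` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def cv_accuracy(xs, ys, k=5, seed=0):
    """k-fold random cross-validation: fit on k-1 folds, predict the held-out
    fold, then correlate pooled predictions with the observed values."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    observed, predicted = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            observed.append(ys[i])
            predicted.append(a + b * xs[i])
    return pearson(observed, predicted)


# Synthetic example (assumption): more positive-effect QTL -> lower severity.
rng = random.Random(1)
qtl_counts = [rng.randint(0, 60) for _ in range(200)]
severity = [9.0 - 0.1 * q + rng.gauss(0, 0.5) for q in qtl_counts]
accuracy = cv_accuracy(qtl_counts, severity)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Pooling the held-out predictions before computing the correlation is one common convention; averaging per-fold correlations is another, and the two can differ slightly on small folds.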
ARTICLE | doi:10.20944/preprints202309.1163.v1
Subject: Biology And Life Sciences, Virology Keywords: viral metagenomics; black-necked crane; genomic structure; phylogenetic analysis
Online: 18 September 2023 (11:52:56 CEST)
The black-necked crane is the only crane species that lives on the plateau. At present, there is little research on viral diseases of the black-necked crane. In this study, a virus metagenomics approach was employed to investigate the viral composition of black-necked cranes in Saga County, Shigatse City, Tibet, China. The identified virus families carried by black-necked cranes mainly include Genomoviridae, Parvoviridae, and Picornaviridae. Among them, one picornavirus genome is characterized as a novel species in the genus Grusopivirus of the family Picornaviridae, four new parvovirus genomes were obtained and classified into four different novel species within the genus Chaphamaparvovirus of the subfamily Hamaparvovirinae, and four novel genomovirus genomes were also acquired and identified as members of three different species including Gemykroznavirus haeme1, Gemycircularvirus ptero6, and Gemycircularvirus ptero10. All of these viruses were detected for the first time in fecal samples of black-necked cranes. This study provides valuable information for understanding the viral community composition in the digestive tract of black-necked cranes in Tibet and for monitoring, preventing, and treating viral diseases of black-necked cranes.
ARTICLE | doi:10.20944/preprints202308.1137.v1
Subject: Public Health And Healthcare, Public, Environmental And Occupational Health Keywords: Rotavirus A; Reassortment; Interspecies transmission; Genomic characterization; Porcine; Zambia
Online: 16 August 2023 (04:19:03 CEST)
Rotavirus is a major cause of diarrhea globally in animals and young children under 5 years. Here, molecular detection and genetic characterization of porcine rotavirus in smallholder and commercial pig farms in the Lusaka Province of Zambia were conducted. Screening of 148 stool samples by RT-PCR targeting the VP6 gene revealed a prevalence of 22.9% (34/148). Further testing of VP6-positive samples with VP7-specific primers produced 12 positives, which were then Sanger-sequenced. BLASTn of the VP7 positives showed sequence similarity to porcine and human rotavirus strains with identities ranging from 87.5% to 97.1%. By next-generation sequencing, the full-length genetic constellations of the representative strains RVA/pig-wt/ZMB/LSK0137 and RVA/pig-wt/ZMB/LSK0147 were determined. Genotyping of these strains revealed a known Wa-like genetic backbone and their genetic constellations were G4-P-I5-R1-C1-M1-A8-N1-T1-E1-H1 and G9-P-I5-R1-C1-M1-A8-N1-T1-E1-H1, respectively. Phylogenetic analysis revealed that these two viruses might have their ancestral origin from pigs, though some of their gene segments were related to human strains. The study shows evidence of reassortment and possible interspecies transmission between pigs and humans in Zambia. Therefore, the “One Health” surveillance approach for rotavirus A in animals and humans is recommended to inform the design of effective control measures.
ARTICLE | doi:10.20944/preprints202308.0448.v1
Subject: Medicine And Pharmacology, Epidemiology And Infectious Diseases Keywords: SARS-CoV-2; genomic surveillance; NGS; deletion; variants; clusters
Online: 4 August 2023 (13:34:25 CEST)
Next generation sequencing (NGS) from SARS-CoV-2-positive swabs collected during the last months of 2022 revealed a large deletion between ORF7b and ORF8 (426 nt) in six patients infected with the BA.5.1 Omicron variant. This extensive genome loss removed a large part of these two genes, maintaining in frame the first 22 amino acids of ORF7b and the last 3 amino acids of ORF8. Interestingly, the deleted region was flanked by 2 small repeats, likely involved in the formation of a hairpin structure. Similar rearrangements, comparable in size and location to the deletion, were also identified in 15 sequences in the NCBI database. In this group, 7 out of 15 cases from the USA and Switzerland presented both the BA.5.1 variant and the same 426-nucleotide deletion. It is noteworthy that 3 out of 6 cases were detected in patients with immunodeficiency, and it is conceivable that this clinical condition could promote the replication and selection of these mutations.
COMMUNICATION | doi:10.20944/preprints202307.0167.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Cucumis sativus var hardwickii; VEP; Genomic variation; SNP; SNV
Online: 4 July 2023 (08:55:12 CEST)
Genome-wide sequencing data play an important role in evaluating the genomic level differences between superior and poor-quality crop plants and improving our understanding of molecular association with desired traits. We analyzed 92,921,066 raw reads obtained from genome-wide resequencing of Cucumis sativus var. hardwickii through in-silico approaches and mapped them to the reference genome of Cucumis sativus to identify genome-wide single nucleotide polymorphisms (SNPs) and single nucleotide variations (SNVs). Here, we report 1,974,213 candidate SNPs, including 133,468 insertions, 143,237 deletions, and 75 indels genome-wide. A total of 2,228,224 identified variants were classified into four classes: 0.01% sequence alteration, 5.94% insertion, 6.37% deletion, and 87.66% SNV. These variations can be a major source of phenotypic diversity and sequence variation within the species. During the present study, these variants were also utilized to resolve orthologous relationships among the genomes of C. melo and C. sativus var. hardwickii. Overall, the discovery of SNPs and genomic variants may help predict the plant response to certain environmental factors and can be utilized to improve crop plants' economically important traits.
REVIEW | doi:10.20944/preprints202307.0144.v1
Subject: Medicine And Pharmacology, Other Keywords: Cancer; database; genomic; proteomic; lipidomic; glycomic; clinical trials
Online: 4 July 2023 (08:37:30 CEST)
Our search of existing cancer databases aimed to assess the current landscape and identify key needs. We analyzed 71 databases, focusing on genomics, proteomics, lipidomics, and glycomics. We found a lack of cancer-related lipidomic and glycomic databases, indicating a need for further development in these areas. Proteomic databases dedicated to cancer research were also limited. To assess overall progress, we included human non-cancer databases in proteomics, lipidomics, and glycomics for comparison. This provided insights into advancements in these fields over the past eight years. We also analyzed other types of cancer databases, such as clinical trial databases and web servers. Evaluating user-friendliness, we used the FAIRness principle to assess findability, accessibility, interoperability, and reusability. This ensured databases were easily accessible and usable. Our search summary highlights significant growth in cancer databases while identifying gaps and needs. These insights are valuable for researchers, clinicians, and database developers, guiding efforts to enhance accessibility, integration, and usability. Addressing these needs will support advancements in cancer research and benefit the wider cancer community.
ARTICLE | doi:10.20944/preprints202305.1076.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: genomic selection; genotyping; heat stress; robotic dairy; selection index
Online: 16 May 2023 (04:05:22 CEST)
Dairy cattle predicted by genomic breeding values to be heat tolerant are known to have less milk production decline and lower core body temperature increases in response to elevated temperatures. In a study conducted at the University of Melbourne’s Dookie Robotic Dairy Farm during summer, we identified the 20 most heat-susceptible and the 20 most heat-tolerant cows in a herd of 150 Holstein Friesian lactating cows based on their phenotypic responses (changes in respiration rate, surface body temperature, panting score, and milk production). Hair samples were collected from the tips of the cows' tails following standard genotyping protocols. The results indicated variation in genomic estimated breeding values (GEBVs) for feed saved and heat tolerance (HT) (P≤0.05) across age, indicating a potential for their selection. As expected, the thermotolerant group had higher GEBVs for HT and feed saved but lower GEBVs for milk production. In general, younger cows had superior GEBVs for Balanced Performance Index (BPI), Type Weighted Index (TWI) and Australian Selection Index (ASI), whilst older cows were superior in fertility, feed saved (FS) and HT. The study demonstrated highly significant (P≤0.001) negative correlations (-0.28 to -0.74) between HT and GEBVs for current Australian dairy cattle selection indices (BPI, TWI, ASI, HWI) and significant (P≤0.05) positive correlations between HT and GEBVs for traits like FS (0.45) and fertility (0.25). Genomic selection for HT will help improve cow efficiency and sustainability of dairy production under hot summer conditions. However, a more extensive study involving more lactating cows across multiple farms is recommended to confirm the associations between the phenotypic predictors of HT and GEBVs.
ARTICLE | doi:10.20944/preprints202305.0043.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Genome-Wide Association Study; Genomic Selection; Marker-Assisted Selection
Online: 2 May 2023 (02:33:16 CEST)
Advances in biotechnology and genomics research have promoted access to DNA markers and their use in breeding programs. Genome-wide association study (GWAS), genomic selection (GS) and marker-assisted selection (MAS) are some of the applications of DNA markers in plant breeding. Researchers have suggested combining these individual applications for better selection accuracies. This study examines the potential advantages of incorporating GWAS results into MAS and GS as well as the validity of the different methods for combining these approaches. From this study, it was concluded that the number of QTNs has a greater effect on prediction accuracy than heritability estimates do. Also, the increase in prediction accuracy from the invalid method of incorporating GWAS results into GS and MAS models is similar to that recorded with the valid approach. However, a greater difference may be observed in other scenarios, where the invalid method can lead to spurious results when used to make breeding decisions.
TECHNICAL NOTE | doi:10.20944/preprints202207.0109.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Heterochromatin; mealy bugs; parent-of-origin effects; genomic imprinting
Online: 7 July 2022 (04:22:14 CEST)
Study of imprinted heterochromatinization of the paternal chromosome set in male mealy bugs is made difficult because it takes place at the blastula stage within the ovary. We describe here a method that allows for the bulk preparation of staged early embryos that develop normally outside the mother. We define an accessible experimental window encompassing 48 to 72 hours post-mating in which regulation of heterochromatinization of the paternal chromosome set can now be investigated outside the confines of the ovary.
ARTICLE | doi:10.20944/preprints202205.0319.v1
Subject: Biology And Life Sciences, Virology Keywords: phage display; epitope mapping; COVID-19; genomic library; NGS
Online: 24 May 2022 (04:20:15 CEST)
The development of antibody therapies against SARS-CoV-2 remains a challenging task during the ongoing COVID-19 pandemic. All approved therapeutic antibodies are directed against the receptor binding domain (RBD) of Spike and have lost neutralization efficacy against continuously emerging SARS-CoV-2 variants, which especially mutate in the RBD region. Previously, phage display has been used to identify epitopes of antibody responses against several diseases. Such epitopes have been applied to design vaccines or neutralizing antibodies. Here, we constructed an ORFeome phage display library for the SARS-CoV-2 genome. Open reading frames (ORFs) representing the SARS-CoV-2 genome were displayed on the surface of phage particles in order to identify enriched immunogenic epitopes from COVID-19 patients. Library quality was assessed by both NGS and epitope mapping of a monoclonal antibody with a known binding site. The most prominent epitope captured represented parts of Spike's fusion peptide (FP). It is associated with the cell entry mechanism of SARS-CoV-2 into the host cell, and the serine protease TMPRSS2 cleaves Spike within this sequence. Blocking of this mechanism could be a potential target for non-RBD binding therapeutic anti-SARS-CoV-2 antibodies. As mutations within the FP amino acid sequence have been rather rare among SARS-CoV-2 variants so far, this may be an advantage in the fight against future virus variants.
REVIEW | doi:10.20944/preprints202110.0011.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Machine Learning; Precision Medicine; Genomic Medicine; Therapeutic; Artificial Intelligence
Online: 1 October 2021 (11:41:27 CEST)
The advancement of precision medicine in medical care has left behind the conventional symptom-driven treatment process by allowing early risk prediction of disease through improved diagnostics and customization of more effective treatments. It is necessary to scrutinize overall patient data alongside broad factors to observe and differentiate between ill and relatively healthy people to take the most appropriate path toward precision medicine, resulting in an improved vision of biological indicators that can signal health changes. Precision and genomic medicine combined with artificial intelligence have the potential to improve patient healthcare. Patients with less common therapeutic responses or unique healthcare demands are using genomic medicine technologies. AI provides insights through advanced computation and inference, enabling the system to reason and learn while enhancing physician decision-making. Many cell characteristics, including gene up-regulation, proteins binding to nucleic acids, and splicing, can be measured at high throughput and used as training objectives for predictive models. Researchers can create a new era of effective genomic medicine with the improved availability of a broad range of data sets and modern computer techniques such as machine learning. This review article elucidates the contributions of ML algorithms in precision and genomic medicine.
ARTICLE | doi:10.20944/preprints202103.0039.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Zika Virus; Phylogenomics; Viral Genomic Variability; Conserved RNA structures
Online: 1 March 2021 (18:17:53 CET)
Zika virus (ZIKV), for which no vaccine or effective treatment has yet been approved, has spread globally since the past century. The infection caused by ZIKV in humans has changed progressively from mild to subclinical in the last years, causing epidemics with greater infectivity, tropism towards new tissues, and other related symptoms as a product of various emergent ZIKV-host cell interactions. However, it is still unknown why or how the RNA genome structure impacts those interactions in strains of different evolutionary origin. Moreover, genomic comparison of ZIKV strains by sequence-based phylogenetic analysis is well established, but differences revealed by RNA structure comparisons are less well known. Thus, in order to better understand the RNA genome variability of lineages with various geographic distributions, 412 complete genomes were used in a phylogenomic scan to study the conservation of structured RNAs. We found specific genomic regions that stand out for their patterns of conserved RNA structures at the level of inter-geographical comparisons. We propose these structures as candidates for further experimental validation to establish their potential role in vital functions of the viral cycle of ZIKV and their possible associations with the singularities of different outbreaks that occurred in specific geographic regions.
REVIEW | doi:10.20944/preprints202012.0372.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Vicia faba; genomic resources; mapping population; gene discovery; breeding
Online: 15 December 2020 (10:38:24 CET)
Faba bean (Vicia faba L.), a member of the Fabaceae family, is one of the important food legumes cultivated in cool temperate regions. It holds great importance for human consumption and livestock feed because of its high protein content, dietary fibre, and nutritional value. Major faba bean breeding challenges include its mixed breeding system, unknown wild progenitor, and genome size of ~13 Gb, which is the largest among diploid field crops. The key breeding objectives in faba bean include improved resistance to biotic and abiotic stress and enhanced seed quality traits. Major progress on reduction of vicine-convicine and seed coat tannins, the main anti-nutritional factors limiting faba bean seed usage, has recently been achieved through gene discovery. Genomic resources are less advanced than those of other grain legume species, but significant improvements are underway due to a recent significant increase in research activities. A number of bi-parental populations have been constructed and mapped for targeted traits in the last decade. Faba bean now benefits from saturated synteny-based genetic maps, along with next-generation sequencing and high-throughput genotyping technologies that are paving the way for marker-assisted selection. Developing a reference genome, and ultimately a pan-genome, will provide a foundational resource for molecular breeding. In this review, we cover the recent development and deployment of genomic tools for faba bean breeding.
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: tetraodon palembangensis; chromosome-level genome; genomic annotation; gene family
Online: 31 August 2020 (04:28:47 CEST)
The humpback puffer, Tetraodon palembangensis, also known as Pao palembangensis, is a species of poisonous freshwater pufferfish mainly distributed in Southeast Asia (Thailand, Laos, Malaysia and Indonesia). Despite interesting biological features, such as its very inactive nature, tetrodotoxin production and body expansion mechanisms, molecular research on the humpback puffer is still rare because of the lack of a high-quality reference genome. Here, we report the first chromosome-level genome assembly of an adult humpback puffer, with a genome size of 362 Mb, a contig N50 of ~1.78 Mb and a scaffold N50 of ~15.8 Mb. Based on the genome, ~61.5 Mb (18.11%) of repeat sequences were also identified, and a total of 19,925 genes were annotated, 99.20% of which could be assigned a predicted function using protein-coding function databases. Finally, a phylogenetic tree was constructed with single-copy gene families from ten teleost fishes. The humpback puffer genome will be a valuable genomic resource to illustrate possible mechanisms of tetrodotoxin synthesis and tolerance, providing clues for future detailed studies of biological toxins.
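For readers unfamiliar with the contig and scaffold N50 statistics quoted in the abstract above, the following minimal sketch shows how an N50 is commonly computed from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the humpback puffer assembly or from any particular assembler's output.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths: the length L such that
    sequences of length >= L together cover at least half of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical): total = 300, so half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half the total -> 70
```

The same function applies to scaffolds; only the input lengths change, which is why an assembly reports separate contig and scaffold N50 values.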
REVIEW | doi:10.20944/preprints202004.0394.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: bioinformatics; population structure; population stratification bias; genomic medicine; biobanks
Online: 22 April 2020 (07:39:22 CEST)
The past years saw the rise of genomic biobanks and mega-scale meta-analysis of genomic data that promise to reveal the genetic underpinnings of health and disease. However, the over-representation of Europeans in genomic studies not only limits the global understanding of disease risk and intervention efficacy, but also inhibits viable research into the genomic differences between carriers and patients. Whilst the community has agreed that more diverse samples are required, it is not enough to blindly increase diversity; the diversity must be quantified, compared, and annotated to lead to insight. Genetic annotations from separate biobanks need to be comparable and computable, and must operate without access to raw data due to privacy concerns. They must be comparable both for regular research and to allow international comparison in response to pandemics. Here, we evaluate the appropriateness of commonly used genomic tools for depicting population structure in a standardized and comparable manner. The end goal is to reduce the effects of confounding and learn from genuine variation in genetic effects on phenotypes across populations, which will improve the value of biobanks, locally and internationally, increase the accuracy of association analyses, and inform developmental efforts.
REVIEW | doi:10.20944/preprints201906.0231.v1
Subject: Physical Sciences, Radiation And Radiography Keywords: bystander effect; genomic instability; lethal mutations; radiotherapy; diagnostic radiology
Online: 24 June 2019 (08:34:26 CEST)
Non-targeted effects (NTE) such as bystander effects or genomic instability have been known for many years, but their significance for radiotherapy or medical diagnostic radiology is far from clear. Central to the issue are reported differences in response of normal and tumour tissues to signals from directly irradiated cells. This review will discuss possible mechanisms and implications of these different responses and will then discuss possible new therapeutic avenues suggested by the analysis. Finally, the importance of NTE for diagnostic radiology and nuclear medicine, which stems from the dominance of NTE in the low dose region of the dose response curve, will be presented. Areas such as second cancer induction and microenvironment plasticity will be discussed.
ARTICLE | doi:10.20944/preprints202311.0395.v1
Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: Oncotype; Recurrence Score; breast cancer; genomic risk; chemotherapy; genomic assay; Exact Sciences; Oncotype RS; Oncotype DX; clinical risk; changes in chemotherapy; TAILORx
Online: 7 November 2023 (10:35:34 CET)
Published in July 2018, TAILORx aimed to establish non-inferiority of endocrine therapy (ET) compared to addition of chemotherapy (CHT-ET) in hormone receptor positive, human epidermal receptor 2 negative (HR+/HER2-) breast cancer (BC) patients with a 21-gene intermediate (11-25) recurrence score (RS). While this hypothesis proved correct, the study did show benefit of addition of chemotherapy (CHT) in a subgroup of women under 50 years of age, and particularly in the RS 16-25 range. The aim of the present study was to look at how TAILORx findings, including changes in RS categories, impacted CHT implementation at one oncologic center in Basel, Switzerland, and to identify the main factors leading to these changes. Methods: We conducted a retrospective study on HR+/HER2- BC patients who underwent 21-gene genomic testing between 2010 and 2021 at our center. Patients with metastatic disease were not included. We identified 326 eligible patients, of which 165 had a BC diagnosis before TAILORx (cohort A) and 161 after TAILORx publication (cohort B). Results: Demographic and tumor characteristics were similar in the two cohorts, although cohort B included significantly more women under the age of 50 when compared to A (34% vs. 24%, p<0.001). Median age and mean RS results were comparable: median age was 59 (IQR 16) years in A and 58 (IQR 19) years in B, and mean RS was 17.72 (SD 9.59) in A and 17.89 (SD 9.53) in B. Patients in cohort A were slightly more overweight when compared to B (55% vs. 40% respectively, p<0.001). Most patients had stage II tumors of NST histologic type. Tumors in cohort B had higher Ki-67 than cohort A (39% vs. 32%, p=0.010). When compared based on manufacturer’s and on TAILORx thresholds, there were no significant differences in RS distribution.
However, changes in score category led to shifts in patient population distribution, leading to a 40% drop in the low RS category (from 60% to 20%), a doubling of the intermediate RS category (from 30% to 60%) and an increase of 5% in the high RS category (from 8-10% to 15%). Most patients had conservative surgery, adjuvant radiotherapy (RT) and ET. Overall CHT recommendation and application did not differ significantly between cohorts B and A. CHT-ET application decreased by 1% in the intermediate RS (11-25) category and increased by 13% in the high RS (>26) category. In cohort B we noticed an increase in CHT-ET application among women <50 years old (by 12.5%), in lobular carcinomas (by 10%), grade 3 tumors (by 2%), node-positive BC (by 3%) and node-negative BC (by 2%). Tumor board recommendation for CHT dropped by 1% in the post-TAILORx era, with a notable reduction of 5% in the intermediate RS (11-25) category. However, the overall CHT administration rate was 19% in cohort A and 22% in B (p=0.763). The tumor board recommended CHT for 90% of the BC patients in cohort A who would have been assigned to CHT according to the new RS guidelines, and for 85% in cohort B, showing a trend toward undertreatment. Logistic regression analysis showed significance of age in both cohort A (OR 1.05, 95% CI 1.01-1.11, p=0.03) and cohort B (OR 0.89, 95% CI 0.84-0.94, p<0.001), and of nodal status in both cohort A (OR 3.32, 95% CI 1.09-10.06, p=0.034) and cohort B (OR 3.31, 95% CI 1.29-8.53, p=0.013), while intermediate and high RS seem to be more relevant in cohort A (OR 0.12, 95% CI 0.03-0.43, p=0.01 and OR 0.04, 95% CI 0.01-0.21, p<0.01). Overall logistic regression analysis showed relevance of age (OR 0.93, 95% CI 0.08-0.97, p=0.001), pN (OR 4.77, 95% CI 2.03-11.22, p<0.001) and RS categories (RS 11-25: OR 0.02, 95% CI 0.01-0.07, p<0.001; RS>26: OR 617.93, 95% CI 57.97-6587.16, p<0.001).
Conclusion: Our findings are similar to those reported across several studies: while tumor board recommendations for CHT decrease in the intermediate RS category, an increase is reported in the high RS category, leading to only minor overall changes in CHT application. Administration of CHT-ET seems to be increasing among younger women with unfavorable histopathological factors, such as lobular carcinoma and G3 histologic grade. Before TAILORx there was a tendency toward undertreatment (-10%), especially among older BC patients, which seems to have been maintained and even deepened (-15%) in the post-TAILORx era, pointing to a personalized decision-making approach among Swiss oncologists.
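The odds ratios (OR) and 95% confidence intervals quoted in the abstract above follow the usual logistic-regression convention, in which OR = exp(β) and the interval is exp(β ± 1.96·SE). The sketch below only illustrates that relationship; the function names are hypothetical, the abstract does not describe the authors' software, and the normal-approximation (Wald) interval is an assumption.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed Wald interval)


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a logistic-regression coefficient and its standard error
    into an odds ratio with lower and upper confidence limits."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_or(or_value, lower, upper, z=Z_95):
    """Recover the coefficient and an approximate standard error from a
    reported odds ratio and its confidence limits."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Example with the nodal-status estimate reported for cohort B
# (OR 3.31, 95% CI 1.29-8.53). The round trip reproduces the OR exactly and
# the width of the interval on the log scale; the limits themselves can
# shift slightly because of rounding in the reported values.
beta_b, se_b = coefficient_from_or(3.31, 1.29, 8.53)
or_b, lower_b, upper_b = odds_ratio_ci(beta_b, se_b)
```

An interval that excludes 1, as in this example, corresponds to a coefficient that differs from zero at the 5% level under the same approximation.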
ARTICLE | doi:10.20944/preprints202212.0168.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: AQP2; cortisol; corticosteroid; non-genomic effects; molecular dynamics; water permeability
Online: 9 December 2022 (02:31:09 CET)
Aquaporins (AQPs) are water channels widely distributed in living organisms and involved in many pathophysiologies as well as in cell volume regulation (CVR). In the present study, based on the structural homology between mineralocorticoid receptors (MR), glucocorticoid receptors (GR), the cholesterol consensus motif (CCM) and the extracellular vestibules of AQPs, we investigated the binding of corticosteroids to the AQP family through in silico molecular dynamics simulations of AQP2 interactions with cortisol. We propose for the first time a putative AQP corticosteroid binding site (ACBS) and discuss its conservation through structural alignment. Corticosteroids can mediate non-genomic effects; nonetheless, the transduction pathways involved are still poorly understood. Moreover, a growing body of evidence points toward the existence of a novel membrane receptor mediating part of these rapid corticosteroid effects. Our results suggest that the naturally produced glucocorticoid cortisol inhibits channel water permeability. Based on these results, we propose a detailed description of a putative underlying molecular mechanism. In this process, we also bring new insights into the regulatory function of AQP extracellular loops and the role of ions in tuning water permeability. Altogether, this work brings new insights into the non-genomic effects of corticosteroids by proposing AQPs as membrane receptors for this family of regulatory molecules. This original result is the starting point for future investigations to define in more depth, and in vivo, the validity of this functional model.
COMMUNICATION | doi:10.20944/preprints202210.0351.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: oral cancer; machine learning; gene prioritization; genomic datasets; data mining
Online: 24 October 2022 (07:10:08 CEST)
Delayed cancer detection is one of the common causes of poor prognosis in many cancers, including cancers of the oral cavity. Despite the improvement and development of new and efficient gene therapy treatments, very little has been done to algorithmically assess the impedance of these carcinomas. In this work, we attempt to annotate viable attributes in oral cancer gene datasets for identification of gingivobuccal cancer (GBC). We further apply supervised and unsupervised machine learning methods to the gene datasets, revealing key candidate attributes for GBC prognosis. Our work highlights the importance of automated identification of key genes responsible for GBC, an approach that could perhaps be readily replicated for detecting other forms of oral cancer.
ARTICLE | doi:10.20944/preprints202205.0131.v1
Subject: Biology And Life Sciences, Virology Keywords: SARS-CoV-2; Variants of Concern; Delta Variant; genomic surveillance
Online: 10 May 2022 (09:44:11 CEST)
In this study, we analyzed sequences of SARS-CoV-2 isolates of the Delta variant in Mexico, which completely replaced other previously circulating variants in the country due to its transmission advantage. Among the Delta sublineages detected, 81.5% were classified as AY.20, AY.26, and AY.100. According to publicly available data, these sublineages only reached a world prevalence of less than 1%, suggesting a possible Mexican origin. The signature mutations of these sublineages are described, and phylogenetic analyses and haplotype networks were used to track their spread across the country. Other frequently detected sublineages include AY.3, AY.62, AY.103, and AY.113. Over time, the principal sublineages showed different geographical distributions, with AY.20 predominant in Central Mexico, AY.26 in the North, and AY.100 in the Northwest and South/Southeast. This work describes the circulation, from May to November 2021, of the primary sublineages of the Delta variant associated with the third wave of the COVID-19 pandemic in Mexico and reinforces the importance of SARS-CoV-2 genomic surveillance for timely identification of emerging variants that may impact public health.
ARTICLE | doi:10.20944/preprints202204.0266.v1
Subject: Biology And Life Sciences, Virology Keywords: SARS-CoV-2; Variant of Concern; Omicron; mutation; genomic surveillance
Online: 28 April 2022 (03:28:50 CEST)
Genomic surveillance represents an important strategy for understanding evolutionary mechanisms, transmission profile, and infectivity of different SARS-CoV-2 variants. We assessed the epidemiological profile of 366 individuals who tested positive for SARS-CoV-2 from 29 municipalities in Rondônia between December 2021 and March 2022. Samples were collected, RNA was extracted and screened using RT-qPCR for the Alpha, Beta, Gamma, Delta and Omicron VOCs, and viral quantification was performed. Sequences were analyzed for phylogeny, mutations and lineages. Of the samples analyzed, 93.71% were positive for the Omicron variant and 6.28% were positive for the Delta variant. The symptoms observed were cough, sore throat, and fever, with a mean duration of 5 days; no hospitalizations or deaths were reported. We noted that among the positive individuals, 51% had been immunized with two doses, 22% received three doses, 13% received one dose, and 13% were not immunized. Only 242 samples were amenable to alignment and phylogenetic characterization; these corresponded to variants BA.1 and BA.1.1. A total of 120 mutations were identified, 36% of which were found in the S gene. In conclusion, there was a high frequency of mutations in the SARS-CoV-2 genome, but no record of clinical severity, demonstrating the positive effect of vaccination.
REVIEW | doi:10.20944/preprints202103.0519.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: Genomic selection; GWAS; Bayesian methods; BLUP; Image analysis; Machine learning
Online: 22 March 2021 (11:21:29 CET)
Plant breeding primarily focuses on improving conventional agronomic traits, e.g. yield, quality, and resistance to biotic and abiotic stress; however, genetic improvement methods are being rapidly enhanced through genomics and phenomics. In the Genomics-Phenomics-Agronomics (GPA) paradigm, diverse research approaches have been conducted to bridge any two of these elements, and recently, all of them together. This review first highlights the progress to link i) genomics to agronomics; ii) genomics to phenomics; and iii) phenomics to agronomics. Secondly, the GPA domain is dissected into different layers, each addressing the three elements simultaneously. These dissected layers include genetic dissection through gene mapping using genome-wide association studies and genomic selection using Best Linear Unbiased Prediction, Bayesian approaches, and machine learning. The objective of the review is to help readers to grasp the core developments among the exponentially growing literature in each of these fields. Through this review, the connections among the three elements of the GPA paradigm are coherently integrated toward the prospect of sustainable development of agronomic traits through both genomics and phenomics.
ARTICLE | doi:10.20944/preprints202012.0421.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Whole genome pooled-seq; Pakistani Teddy goat; Genomic selection signatures
Online: 17 December 2020 (09:13:29 CET)
Whole-genome pooled sequence data from 12 Pakistani Teddy goats are analyzed for positive selection signatures underlying the breed-defining characteristics. Selection imprints left in the Teddy genome are unveiled by genomic differentiation after the successful paired-end alignment of 635,357,043 reads to the ARS1 reference genome assembly. Pooled heterozygosity and Tajima’s D (TD) are applied to validate and refine selection signals, while pairwise FST statistics are computed for Teddy vs. Bezoar (the wild goat ancestor) to assess genomic differentiation. Annotation of regions under positive selection reveals 59 genes underlying production and adaptive traits. First, a pooled-heterozygosity score ≥ 5 detected six windows with the highest scores on Chr. 29, 9, 25, 15 and 14, which harbor the HRASLS5, LACE1 and AXIN1 genes, candidates for embryonic development, lactation and body height. Secondly, a TD value ≤ -2.2 revealed four windows with very strong hits on Chr. 5 and 9, which harbor the STIM1 and ADM genes related to body mass and weight. Lastly, FST analysis generated three strong signals with a threshold ≤ 0.42 on Chr. 12 and 5, which harbor the ITGB1 gene associated with milk production and lactation traits. Other significant selection signatures encompass genes associated with wool production, prolificacy, immunity and coat color. In brief, this study identified genes under selection in this Pakistani goat breed that will be helpful for refining future breeding policies, for converging required productive traits within and across other goat breeds, and for exploring the full genetic potential of this valued livestock species.
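To make the pooled-heterozygosity scan in the abstract above more concrete, the following sketch computes a per-window pooled heterozygosity and its Z-transform. It assumes the commonly used definition Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)², where the sums run over the major- and minor-allele read counts of all SNPs in a window. The abstract gives neither the formula nor the sign convention of its score, so the function names and the Z-transform are illustrative assumptions.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(snp_counts):
    """Pooled heterozygosity for one window.

    snp_counts: list of (reads_allele_1, reads_allele_2) tuples, one per SNP.
    The larger count at each SNP is treated as the major allele.
    """
    n_maj = sum(max(a, b) for a, b in snp_counts)
    n_min = sum(min(a, b) for a, b in snp_counts)
    total = n_maj + n_min
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * n_maj * n_min / total ** 2


def z_transform(values):
    """Standardize per-window values against the genome-wide mean and SD."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all windows have identical values")
    return [(v - mu) / sigma for v in values]


# Hypothetical read counts for three windows (not data from the study).
windows = [
    [(10, 10), (12, 8)],   # heterozygosity close to the maximum of 0.5
    [(30, 10), (10, 30)],  # intermediate heterozygosity
    [(20, 0), (19, 1)],    # nearly fixed, a candidate sweep region
]
hp_values = [pooled_heterozygosity(w) for w in windows]
z_scores = z_transform(hp_values)
```

Windows with unusually low heterozygosity receive strongly negative Z values under this convention; a study reporting positive scores is presumably using the negated value.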
ARTICLE | doi:10.20944/preprints201912.0069.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: statistical mirroring; genomic mirrors; comparative optinalysis; multiple comparison; inferences; homology
Online: 5 December 2019 (11:36:25 CET)
Sequence alignment and comparison through pairwise, multiple, global and local techniques are the main principles that underpin comparative genomics. However, most of the algorithms used are alignment-based, which imposes some limitations on their use and application. To provide an alignment-free alternative, a methodology of comparative optinalysis and statistical mirroring was adopted for multiple genomic sequence comparison. In this article, a method comparison with MUSCLE, MAFFT, Clustal Omega, and T-Coffee was designed to assess the suitability and statistical power of statistical mirroring as an alternative method for multiple genomic sequence comparison, using different sets of logically generated biological sequence datasets with different problems and computational complications. The results of the comparisons validate that statistical mirroring is a suitable alignment-free alternative approach for multiple genomic sequence comparison. The applied method (statistical mirroring) distinguishes itself from MUSCLE, MAFFT, Clustal Omega, and T-Coffee in specificity to position-specific changes, specificity to base-specific changes, cladogram and phylogenetic linearity, alignment independence, computational simplicity, and limit of input capacity.
ARTICLE | doi:10.20944/preprints202309.1280.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: SNP Chip DNA Marker; GAPIT; GWAS; Genomic Selection; Grain traits; Rice
Online: 20 September 2023 (02:17:56 CEST)
This study investigated novel quantitative trait loci (QTLs) associated with the control of grain shape and size as well as grain weight in rice. We employed a joint strategy combining multiple GAPIT (Genome Association and Prediction Integrated Tool) models [Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK), Fixed and random model Circulating Probability Uniform (FarmCPU), Settlement of MLM Under Progressive Exclusive Relationship (SUPER), and General Linear Model (GLM)] with high-density SNP Chip DNA markers (60,461) to conduct a genome-wide association study (GWAS). GWAS was performed using genotypes and grain-related phenotypes of 143 recombinant inbred lines (RILs). The data show that the parental lines (Ilpum and Tung Tin Wan Hein 1, TTWH1; Oryza sativa L. ssp. japonica and indica, respectively) exhibited divergent phenotypes for all analyzed grain traits, which was reflected in their derived population. GWAS results revealed associations between seven SNP Chip markers and QTLs for grain length, co-detected by all GAPIT models on chromosomes (Chr) 1–3, 5, 7, and 11, among which qGL1-1BFSG (AX-95918134, Chr1: 3820526 bp) shows a phenotypic variance explained (PVE) of 65.2%–72.5%. In addition, qGW1-1BFSG (AX-273945773, Chr1: 5623288 bp) for grain width has a PVE of 15.5%–18.9%. Furthermore, BLINK or FarmCPU independently identified three QTLs for grain thickness, with PVE of 74.9% (qGT1Blink, AX-279261704, Chr1: 18023142 bp) and 54.9% (qGT2-1Farm, AX-154787777, Chr2: AX-154787777 bp). For the length-to-width ratio (LWR), qLWR2BFSG (AX-274833045, Chr2: 10000097 bp) has a PVE of nearly 15.2%–32%. Likewise, the major QTL for thousand-grain weight (TGW) was detected on Chr6 (qTGW6BFSG, AX-115737727, 28484619 bp), with a PVE of 32.8%–54%. The qTGW6BFSG QTL coincides with qGW6-1Blink for grain width and explained 32.8%–54% of the phenotypic variance.
Putative candidate genes pooled from the major QTLs for each grain trait have interesting annotated functions that require functional studies to elucidate their roles in the control of grain size, shape, or weight in rice. Genomic selection analysis proposed markers useful for downstream marker-assisted selection based on the genetic merit of the RILs.
REVIEW | doi:10.20944/preprints202308.1641.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: pan-genomes; comparative genomics; plant pathways; genomic databases; gravitropism; Gene Ontology
Online: 23 August 2023 (09:29:18 CEST)
The availability of multiple sequenced genomes from a single species has made it possible to explore intra- and inter-specific genomic comparisons at higher resolution and to build clade-specific pan-genomes of several crops. The pan-genomes of crops constructed from various cultivars/accessions, landraces, and wild ancestral species represent a compendium of genes and structural variations and allow researchers to search for novel genes and alleles that were inadvertently lost in domesticated crops during the historical process of crop domestication or in the process of extensive plant breeding. Fortunately, many valuable genes and alleles associated with desirable traits like disease resistance, abiotic stress tolerance, plant architecture, and nutritional quality exist in landraces, ancestral species, and crop wild relatives. The novel genes from the wild ancestors and landraces can be introduced back into high-yielding varieties of modern crops by implementing classical plant breeding, genomic selection, and transgenic/gene editing approaches. Thus, pan-genomics represents a great leap in plant research and offers new avenues for targeted breeding to mitigate the impact of global climate change. Here we summarize the tools used for pan-genome assembly and annotation and the web portals hosting plant pan-genomes. Furthermore, we highlight a few discoveries made in crops using the pan-genomic approach and discuss its future potential.
ARTICLE | doi:10.20944/preprints202308.1172.v1
Subject: Public Health And Healthcare, Health Policy And Services Keywords: genomic biomarkers; clinical utility; clinical validity; analytic validity; health technology assessment
Online: 17 August 2023 (07:36:43 CEST)
Genome-based testing in oncology is a rapidly expanding area of health care which is the basis of the emerging area of precision medicine. Efficient and considered adoption of novel genomic medicine testing is hampered in Canada by the fragmented nature of health care oversight as well as by lack of clear and transparent processes to support rapid evaluation, assessment and implementation of genomic tests. This article provides an overview of some key barriers and proposes approaches to addressing these challenges as a potential pathway to developing a national approach to genomic medicine in oncology.
ARTICLE | doi:10.20944/preprints202308.0757.v1
Subject: Medicine And Pharmacology, Clinical Medicine Keywords: breast cancer; young patients; genomic; disease-free survival; galactose metabolism; stemness
Online: 9 August 2023 (09:35:55 CEST)
In recent years, there has been a notable rise in the incidence of breast cancer among young patients, who exhibit worse survival outcomes and distinct characteristics compared to intermediate-aged and older patients. It is therefore imperative to identify the features unique to young patients, which could offer insights into potential therapeutic strategies and improve survival outcomes. In our study, we performed an integrative analysis of bulk transcriptional and genomic data from extensive clinical cohorts to identify prognostic factors. Additionally, we analyzed single-cell transcriptional data and conducted in vitro experiments. Our work confirmed that young patients exhibited higher grading, worse disease-free survival (DFS), a higher frequency of mutations in TP53 and BRCA1, a lower frequency of mutations in PIK3CA, and upregulation of eight metabolic pathways. Notably, the galactose metabolism pathway was upregulated in young patients and was associated with worse DFS. Further analysis and experiments indicated that the galactose metabolism pathway may regulate the stemness of cancer cells and ultimately contribute to worse survival outcomes. In summary, our findings identified distinct clinicopathological, transcriptional, and genomic features and revealed a correlation between the galactose metabolism pathway, stemness, and poor disease-free survival of breast cancer in young patients.
REVIEW | doi:10.20944/preprints202307.1469.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Salinity; Cotton; Germination and Yield; Functional Genomic; Seed priming; Genetic Engineering
Online: 21 July 2023 (12:38:20 CEST)
The production of plants and crops is influenced by environmental stress, which is a serious scientific issue. Cotton is an essential crop that produces natural fibers and is also used to make biofuel and edible oils. Salinity is the main factor that influences cotton growth at the beginning of germination. The type of salt and the growth stage affect how sensitive a plant is to salt stress. Understanding the response of cotton to salt, its mechanisms of resistance, and its management can aid in developing ways to enhance cotton performance under saline conditions. Osmotic and ionic imbalances arise from the deposition of soluble salts in the plant's root zone under salinity stress. Soil salinity significantly reduces plant growth due to several factors, such as nutritional ion imbalance, which reduces K+, PO4-, and NO- absorption, excessive salt chloride concentrations, and osmotic stress, which hinders water availability. Research has revealed that, compared to subsequent stages, the germination, emergence, and seedling phases are more vulnerable to salinity stress, which ultimately affects the seed cotton yield by delaying blooming, reducing fruiting positions, shedding fewer fruits, and reducing boll weight. The morphology and growth of cotton roots and shoots, yield, and fiber quality are all strongly impacted by salinity stress. It slows down plant feeding, cellular metabolism, and photosynthesis. The soil, water, and climate all affect how the plant responds to salinity stress. During salt stress, excessive exclusion of sodium or its compartmentalization is the key adaptation process in cotton. The up-regulation of both physicochemical and non-enzymatic antioxidant genes provides a major adaptive potential for creating cotton types that can withstand salt. Seed priming is a successful strategy to increase cotton germination in saline soils.
Moreover, the transgenic strategy might be a viable choice for improving cotton yield in saline environments. Our review focuses on the impacts of salinity on cotton productivity as well as how plants react to salt stress. It also clarifies recent genetic advancements and molecular breeding for cotton's resistance to soil salt. To create salt-tolerant cultivars, a combination of traditional breeding and novel molecular approaches will be useful.
ARTICLE | doi:10.20944/preprints202305.0213.v1
Subject: Biology And Life Sciences, Insect Science Keywords: Noctuidae; transposable elements; genomic diversity; phylogeny; horizontal transfer TE (HTT) events
Online: 4 May 2023 (07:37:23 CEST)
Noctuidae is known to have high species diversity, although the genomic diversity of Noctuidae species has not been studied extensively. Investigation of transposable elements (TEs) in this family can improve our understanding of the genomic diversity of Noctuidae. In this study, we annotated and characterized genome-wide TEs in ten noctuid species belonging to seven genera. With multiple annotation pipelines, we constructed a consensus sequence library containing 1,038–2,826 TE consensus sequences. The genome content of TEs showed high variation in the ten Noctuidae genomes, ranging from 11.3% to 45.0%. The relatedness analysis indicated that the TE content, especially the content of LINEs and DNA transposons, is positively correlated with genome size (r=0.86, p-value=0.001). We identified SINE/B2 as a lineage-specific subfamily in Trichoplusia ni, a species-specific expansion of the LTR/Gypsy subfamily in Spodoptera exigua, and a recent expansion of the SINE/5S subfamily in Busseola fusca. We further revealed that, of the four TE classes, only LINEs showed phylogenetic signal with high confidence. We also examined how the expansion of TEs contributed to the evolution of noctuid genomes. Moreover, we identified a total of 56 horizontal transfer TE (HTT) events among the ten noctuid species and at least three HTT events between the nine Noctuidae species and 11 non-noctuid arthropods. One of the HTT events, caused by a Gypsy transposon, might have caused the recent expansion of the Gypsy subfamily in the S. exigua genome. By determining the TE content, dynamics, and HTT events in the Noctuidae genomes, our study emphasizes that TE activities and HTT events have had substantial impacts on Noctuidae genome evolution.
ARTICLE | doi:10.20944/preprints202304.1235.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Genomic prediction; flavonoid pigmentation; Sorghum bicolor; prediction accuracy; marker-assisted selection
Online: 29 April 2023 (10:11:10 CEST)
Marker-assisted selection (MAS) and genomic selection (GS) have been used to select individuals with desirable traits. MAS uses a few markers associated with a specific trait, identified through a genome-wide association study (GWAS), to select individuals with desirable traits. In contrast, GS uses a large number of markers distributed across the genome to predict genomic breeding values for subsequent selection of individuals. In general, MAS has shown high prediction accuracy, but it is not suitable for traits controlled by multiple genes and has the further constraint of requiring phenotypic data; GS, by contrast, has not matched the prediction accuracy of MAS, but it takes into account the effects of multiple genes controlling a target trait and can be used without phenotypic data. Including GWAS-selected markers in GS may offset the lower prediction accuracy that GS shows in comparison with MAS. Thus, the objective of this study was to compare the prediction accuracy of MAS with that of several genomic prediction models (gBLUP, gBLUP including GWAS-selected markers, and Bayesian models such as Bayes A, Bayes B, Bayes LASSO and Bayesian Ridge Regression) in order to confirm whether incorporating GWAS-selected markers into GS increases its prediction accuracy. As a model for this study, we used data from sorghum, which shows population structure. We used a sample of 6,000 SNPs out of the 265,487 reported by Morris et al. (2013), and also considered parameters that affect the efficiency of selection, such as the size of the training population, the heritability, and the number of QTNs. The GWAS-selected SNPs were identified using the BLINK model.
The results showed that incorporating GWAS-selected markers enhanced the performance of genomic selection, giving prediction accuracy similar to that of MAS. The number of QTNs and the size of the training population affected accuracy, which was higher with a larger training population and a lower number of QTNs, whereas heritability appeared to have no impact in the model where GWAS-selected SNPs were included in gBLUP.
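gBLUP, one of the models compared in the abstract above, rests on a genomic relationship matrix built from marker genotypes. The sketch below shows one common construction (centering 0/1/2 allele counts by twice the allele frequency and scaling by 2·Σp(1−p)). The abstract does not state which relationship matrix the authors used, so this choice, the function name and the toy genotypes are all assumptions for illustration.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix G from marker genotypes.

    genotypes: list of rows (one per individual), each a list of allele
    counts coded 0, 1 or 2 (one entry per SNP).
    Returns G as a nested list of size n x n.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by its expected value 2p.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Toy genotypes for three individuals and three SNPs (hypothetical values).
toy_genotypes = [
    [0, 1, 2],
    [2, 1, 0],
    [1, 1, 1],
]
G = genomic_relationship_matrix(toy_genotypes)
```

In a gBLUP analysis, G would replace the pedigree relationship matrix in the mixed model; GWAS-selected markers could then be added as fixed-effect covariates, which is one way to read the combined model the abstract describes.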
ARTICLE | doi:10.20944/preprints202210.0241.v1
Subject: Medicine And Pharmacology, Other Keywords: evolution; mutation; genomic surveillance; SARS-CoV-2; COVID-19; ViralVar; webtool
Online: 17 October 2022 (12:59:07 CEST)
The unprecedented growth of publicly available SARS-CoV-2 genome sequence data has increased demand for effective and accessible SARS-CoV-2 data analysis and visualization tools. A majority of the currently available tools either require computational expertise to deploy or limit user input to pre-selected subsets of SARS-CoV-2 genomes. To address these limitations, we developed ViralVar, a publicly available, point-and-click webtool that gives users the freedom to investigate and visualize user-selected subsets of SARS-CoV-2 genomes obtained from the GISAID public database. ViralVar has two primary features that enable: 1) visualization of spatiotemporal dynamics of SARS-CoV-2 lineages, and 2) structural/functional analysis of genomic mutations. As proof-of-principle, ViralVar was used to explore the evolution of the SARS-CoV-2 pandemic in the USA in the pediatric, adult, and elderly population (n > 1.7 million genomes). While the spatiotemporal dynamics of variants did not differ between these age groups, several USA-specific sublineages arose relative to the rest of the world. Our development and utilization of ViralVar to provide insights on the evolution of SARS-CoV-2 in the USA demonstrates the importance of developing accessible tools to facilitate and accelerate large-scale surveillance of circulating pathogens. The ViralVar webserver is freely available at http://viralvar.org/.
REVIEW | doi:10.20944/preprints202110.0251.v1
Subject: Biology And Life Sciences, Cell And Developmental Biology Keywords: pluripotent; embryo; stem cells; genomic stability; cell cycle; apoptosis; differentiation; cancer
Online: 18 October 2021 (15:12:11 CEST)
Remarkably, the p53 transcription factor, referred to as “the guardian of the genome”, is not essential for mammalian development. Moreover, efforts to identify p53‑dependent developmental events have produced contradictory conclusions. Given the importance of pluripotent stem cells as models of mammalian development, and their applications in regenerative medicine and disease, resolving these conflicts is essential. Here we attempt to reconcile disparate data into justifiable conclusions predicated on reports that p53‑dependent transcription is first detected in late mouse blastocysts, that p53 activity first becomes potentially lethal during gastrulation, and that apoptosis does not depend on p53. Furthermore, p53 does not regulate expression of genes required for pluripotency in embryonic stem cells (ESCs); it contributes to ESC genomic stability and differentiation. Depending on conditions, p53 accelerates initiation of apoptosis in ESCs in response to DNA damage, but cell cycle arrest as well as the rate and extent of apoptosis in ESCs are p53-independent. In embryonic fibroblasts, p53 induces cell cycle arrest to allow repair of DNA damage, and cell senescence to prevent proliferation of cells with extensive damage.
REVIEW | doi:10.20944/preprints202012.0738.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Next generation sequencing; Genetic disorders; Genomic medicine; Genetic counseling; Rare diseases
Online: 29 December 2020 (16:47:01 CET)
Genetic disorders are preeminent determinants of infant mortality. Inherited pediatric-onset genetic disorders place considerable stress on child growth and development; they include several congenital, complex and rare disorders with indistinguishable clinical symptoms, for which diagnosis remains a challenging task. Traditional diagnostic methods include biochemical tests followed by chromosomal microarray and sequencing of a single gene or a panel of genes. These methods have several limitations, but with the advent of whole-exome sequencing (WES), genetic testing has become cost-effective and transformative. Exome sequencing is known for its effectiveness in elucidating and distinguishing heterogeneous disorders, helping to avoid misdiagnosis and decode the underlying genetic alterations. WES has led to the discovery of genes and genomic variants in a broad spectrum of diseases, including autism, epilepsy, congenital heart diseases, neurodevelopmental diseases, cancer, nephrotic disorders, neural tube defects and fetal structural anomalies. WES is significant in producing numerous genomic biomarkers that can serve as pharmacogenomic targets for drug therapy. In this article, we analyze recent uses of WES technology to revolutionize not only the detection of genetic variation and disease but also the practice of preventative and targeted drug discovery.
REVIEW | doi:10.20944/preprints201805.0180.v1
Subject: Biology And Life Sciences, Cell And Developmental Biology Keywords: BMP; TGF-β; signaling; sex; chromosomes; XIST; genomic imprinting; hormones; fibrosis
Online: 11 May 2018 (09:49:48 CEST)
Crosstalk between the BMP and TGF-β signaling pathways regulates many complex developmental processes from the earliest stages of embryogenesis throughout adult life. In many situations, the two signaling pathways act reciprocally. For example, TGF-β signaling is generally pro-fibrotic whereas BMP signaling is anti-fibrotic and pro-calcific. Sex-specific differences occur in many diseases including cardiovascular pathologies. Differing ratios of fibrosis and calcification in stenotic valves suggest that BMP/TGF-β signaling may vary in men and women. In this review, we focus on the current understanding of the interplay between sex and BMP/TGF-β signaling and pose several unanswered questions.
ARTICLE | doi:10.20944/preprints202311.1511.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Abdominal fat deposition; Genomic heritability; Genetic correlation; Body weight gain; Egg production
Online: 23 November 2023 (10:55:06 CET)
Fat has a high energy density, and excessive fatness has been recognized as a problem for egg production and the welfare of chickens. The identification of a genetic polymorphism controlling fat deposition would be helpful to select against excessive fatness in the laying hen. This study aimed to estimate genomic heritability and identify the genetic architecture of abdominal fat deposition in a population of chickens from a Dongxiang blue-shelled local breed crossbred with the White Leghorn. A genome-wide association study was conducted on abdominal fat percentage, egg production and body weights using a sample of 1534 hens genotyped with a 600K Chicken Genotyping Array. The analysis yielded a heritability estimate of 0.19 ± 0.04 for abdominal fat percentage, 0.56 ± 0.04 for body weight at 72 weeks, 0.11 ± 0.03 for egg production and 0.24 ± 0.04 for body weight gain. The genetic correlation of abdominal fat percentage with egg production between 60 and 72 weeks of age was 0.35 ± 0.18. This implies a potential trade-off between these two traits related to the allocation of resources. Strong positive genetic correlations were found between fat deposition and weight traits. A promising locus close to COL12A1 on chromosome 3, associated with abdominal fat percentage, was found in the present study. A second region, located around HTR2A on chromosome 1, where allele substitution was predicted to be associated with body weight gain, accounted for 2.9% of phenotypic variance. A third region, also located on chromosome 1 but close to SOX5, was associated with egg production. These results may be used to inform balanced genetic selection for laying hens.
REVIEW | doi:10.20944/preprints202306.0515.v1
Subject: Biology And Life Sciences, Life Sciences Keywords: epigenetic; genomic imprinting; ovarian tissue cryopreservation; testicular tissue cryopreservation; Medically Assisted Reproduction
Online: 7 June 2023 (08:32:11 CEST)
Ovarian tissue cryopreservation (OTC) and testicular tissue cryopreservation (TTC) are effective and often the only options for fertility preservation in female or male patients for oncological, medical, or social reasons. While TTC and resumption of spermatogenesis, either in vivo or in vitro, is still considered an experimental approach in humans, OTC and autotransplantation have been applied increasingly to preserve fertility, with more than 200 live births worldwide. However, the cryopreservation of reproductive cells followed by the resumption of gametogenesis, either in vivo or in vitro, may interfere with sensitive and highly regulated cellular processes. In particular, the epigenetic profile, which includes not just reversible modifications of the DNA itself but also post-translational histone modifications, small non-coding RNAs, gene expression and availability, and storage of related proteins or transcripts, has to be considered in this context. Due to complex reprogramming and maintenance mechanisms of the epigenome in germ cells, growing embryos, and offspring, OTC and TTC are carried out at very critical moments early in the life cycle. Given this background, the safety of OTC and TTC with respect to the epigenetic profile has to be clarified. Cryopreservation of mature germ cells (including Metaphase II oocytes and mature spermatozoa collected via ejaculation or more invasively after testicular biopsy) or embryos has been used successfully for many years in Medically Assisted Reproduction (MAR). However, although tissue freezing followed by in vitro or in vivo gametogenesis has become more attractive, few human studies have analysed the epigenetic effects, with most data deriving from animal studies. In this review, we highlight the potential influence of the cryopreservation of immature germ cells and subsequent in vivo or in vitro growth and differentiation on the epigenetic profile in humans and animals.
ARTICLE | doi:10.20944/preprints202202.0164.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: rare variants; genome-wide association study; validation test; SNP chip; genomic selection
Online: 11 February 2022 (15:59:26 CET)
The experiments described in this research article were designed to test the effect of rare variants on genomic prediction in dairy cattle. Common polymorphisms are able to explain only a small proportion of the underlying genetic variation of complex phenotypes. Variants representing functional mutations with large effects on complex phenotypes are expected to be rare due to natural (humans) or artificial (livestock) selection pressure. Therefore, it is important to check whether the use of rare variants could increase the accuracy of ranking of animals by providing a tool for more precise differentiation among bulls with high additive genetic merit. The goal of our study was to verify whether including rare variants in a genomic selection model allows for a more accurate description of the additive genetic background of traits under selection in dairy cattle. We used a linear mixed model to compare SNP estimates for Holstein-Friesian cattle between two data sets: a set containing only single nucleotide polymorphisms defined by minor allele frequency ≥ 0.01, which is routinely used in the Polish genomic evaluation system (46,216 SNPs), and a set containing SNPs selected based only on the call rate (54,378 SNPs). Based on the SNP estimates we also calculated DGV and GEBV and compared them between the two data sets. In all the analyses we used production, fertility, conformation and udder health traits. We also compared, between the two data sets, the time required for the two most computationally demanding components of genomic selection: preparing genotype data and estimating SNP effects. The results of our study indicated that the analysis including rare variants resulted in changes in the individual ranking of the top 100 male and female candidates, but had no effect on the quality of EBV prediction as expressed by the Interbull validation test.
ARTICLE | doi:10.20944/preprints202112.0049.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: bacterial wilt; biological control; phage; microscopy; sequencing; molecular characterization; genomic characterization; depolymerase
Online: 3 December 2021 (10:36:47 CET)
Ralstonia solanacearum is the causative agent of bacterial wilt, one of the most destructive plant diseases. While chemical control has an environmental impact, biological control strategies can allow sustainable agrosystems. Three lytic bacteriophages (phages) of R. solanacearum with biocontrol capacity in environmental water and plants were isolated from river water in Europe but not fully characterized, although their genomic characterization is fundamental to understanding their biology. In this work, the phage genomes were sequenced and subjected to bioinformatic analysis. The morphology was also observed by electron microscopy. Phylogenetic analyses were performed with a selection of phages able to infect R. solanacearum and the closely related phytopathogenic species R. pseudosolanacearum. The results indicated that the genomes of vRsoP-WF2, vRsoP-WM2 and vRsoP-WR2 range from 40,688 to 41,158 bp with a GC content of almost 59%, with 52 ORFs in vRsoP-WF2 and vRsoP-WM2 and 53 in vRsoP-WR2, but with only 22 or 23 predicted proteins having functional homologs in databases. Among these were two lysins and one exopolysaccharide (EPS) depolymerase, a type of depolymerase identified in R. solanacearum phages for the first time. These three European phages belong to the same novel species within the Gyeongsanvirus, Autographiviridae family (formerly Podoviridae). These genomic data will contribute to a better understanding of the abilities of these phages to damage host cells and, consequently, to an improvement in the biological control of R. solanacearum.
Subject: Biology And Life Sciences, Virology Keywords: SARS-CoV-2; Sequence analysis; Comparative genomic variants; Alternating Series; Covid-19
Online: 7 April 2021 (12:59:50 CEST)
A signal analysis of the sequenced genomes of the coronavirus variants B.1.1.7, B.1.135, B.1.429-B.1.427, B.1.525 and P1 is presented. We deal with a certain type of finite alternating sum series having independently distributed terms associated with binary (0,1) indicators for the nucleotide bases A, C, G, T. This method provides information additional to conventional similarity comparisons and power spectrum approaches. It uncovers distinctive patterns in the intrinsic data organization of genomic sequences according to their progression along the nucleotide base positions. Hence, the method could be useful for surveillance of genomic variants.
REVIEW | doi:10.20944/preprints202103.0643.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Fragile X Syndrome; FMRP, RNA-binding protein; Physiopathology; Genomic; Transcriptomic; Proteomic; Metabolomic
Online: 25 March 2021 (16:22:35 CET)
Fragile X syndrome (FXS) is a neurodevelopmental disorder associated with a wide range of cognitive, behavioral and medical problems. It arises from the silencing of the fragile X mental retardation 1 (FMR1) gene and, consequently, the absence of its encoded protein, FMRP (Fragile X Mental Retardation Protein). FMRP is a ubiquitously expressed and multifunctional RNA-binding protein, primarily considered a translational regulator. Pre-clinical studies of the past two decades have therefore focused on this function to relate FMRP’s absence to the molecular mechanisms underlying FXS physiopathology. Based on these data, successful pharmacological strategies were developed to rescue the fragile X phenotype in animal models. Unfortunately, these results did not translate to humans, as clinical trials using the same therapeutic approaches did not reach the expected outcomes. These failures highlight the need to put the different functions of FMRP into perspective in order to get a more comprehensive understanding of FXS pathophysiology. In this review, we point out FMRP’s involvement in noteworthy molecular mechanisms that ultimately contribute to the various biochemical alterations composing the fragile X phenotype.
REVIEW | doi:10.20944/preprints202010.0634.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: genetic load; purging; drift load; pseudo-overdominance; heterozygosity-fitness correlation; genomic architecture
Online: 30 October 2020 (10:18:40 CET)
Upon inbreeding, the architecture of the inbreeding load shifts as selection purges strongly deleterious recessive mutations and drift fixes many milder ones. Most small inbred populations show limited genetic variation while crosses between such populations commonly express pronounced heterosis, confirming fixation. In contrast, purging appears to be limited in that inbred populations often retain substantial inbreeding depression. In addition we have the enigma Darwin noted: purely selfing taxa are unknown. Because both purging and fixation reduce inbreeding depression and load, another mechanism must exist to sustain these. Background selection and the associations that develop among alleles in small inbred populations will shift the architecture of the load potentially creating blocks of recessive mutations linked in repulsion. This would generate pseudo-overdominance that could sustain these “PODs” and inbreeding load. Recombination and crosses between lineages could erode PODs. Crosses between populations fixed for different mutations would generate high pseudo-overdominance, enhancing heterosis and potentially POD formation. New recessive mutations arising within PODs would reinforce overdominance. PODs should generate clear genetic signatures including genomic hotspots of heterozygosity and linkage disequilibrium containing alleles at intermediate frequency generating segregating load. Results from several simulation and empirical studies match these predictions. Further simulations and comparative genomic analyses are needed to rigorously test whether PODs exist in sufficient strength and number to generate persistent inbreeding depression and load in inbred lineages.
ARTICLE | doi:10.20944/preprints201805.0471.v1
Subject: Medicine And Pharmacology, Clinical Medicine Keywords: genomics; genomic medicine; health outcomes; evidence; standards; eMERGE; ClinGen; precision public health
Online: 31 May 2018 (11:27:23 CEST)
Genomic medicine is moving from research to the clinic. There is a lack of evidence about the impact of genomic medicine interventions on health outcomes. This is due in part to a lack of standardized outcome measures that can be used across different programs to evaluate the impact of interventions targeted to specific genetic conditions. The eMERGE Outcomes working group (OWG) developed measures to collect information on outcomes following the return of genomic results to participants for several genetic disorders. These outcomes were compared to outcome intervention pairs for genetic disorders developed independently by the ClinGen Actionability working group (AWG). In general, there was concordance between the defined outcomes between the two groups. The ClinGen outcomes tended to be higher level and the AWG scored outcomes represented a subset of outcomes referenced in the accompanying AWG evidence review. eMERGE OWG outcomes were more detailed and discrete, facilitating collection of relevant information from health records. This paper demonstrates that common outcomes for genomic medicine interventions can be identified. Further work is needed to standardize outcomes across genomic medicine implementation projects and make these publicly available to enhance dissemination and assist in making precision public health a reality.
ARTICLE | doi:10.20944/preprints201705.0206.v1
Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: Clostridium difficile; ST201; binary toxin-positive; whole genome sequencing; comparative genomic analysis
Online: 30 May 2017 (06:15:11 CEST)
A novel binary toxin-positive, non-027, non-078 Clostridium difficile strain, designated LC693, whose sequence type was ST201, was isolated from the fecal sample of a patient with severe diarrhea in China. To understand the basis of pathogenesis of C. difficile ST201, this recently recovered isolate was chosen for whole genome sequencing. The project generated an estimated genome size of approximately 4.07 Mbp. The genome sequence was then analyzed together with two other ST201 strains, VL-0104 and VL-0391, and compared to the epidemic 027/ST1 and 078/ST11 strains. Phylogenetic analysis demonstrated that the ST201 strains belonged to clade 3. The genome size of the three ST201 strains ranged from 4.07 to 4.16 Mb, with an average GC content between 28.5% and 28.9%. The ST201 genomes contained more than 40 antibiotic resistance genes, 15 of which were predicted to be associated with vancomycin resistance, suggesting that these strains may have strong antibiotic resistance. The ST201 strains contained a typical clade 3-specific PaLoc with a Tn6218 element inserted, and the PaLoc genes that participate in toxin expression and regulation were highly homologous to those of the epidemic 027 and 078 strains, with the exception of tcdC. A truncated TcdC was found in the ST201 strains, which may contribute to their toxin production. In addition, the ST201 strains contained intact binary toxin coding and regulation genes, which are also proposed to contribute to virulence. Genome comparison of the ST201 strains with the epidemic 027 and 078 strains identified 641 genes specific to C. difficile ST201, a number of which were predicted to be fitness- and virulence-associated genes. The identification of those genes also contributes to understanding the pathogenesis of the ST201 strain. To our knowledge, this is the first study in which the genome sequence of C. difficile ST201 is discussed in detail, and it contributes to understanding the basis of pathogenesis of C. difficile ST201.
ARTICLE | doi:10.20944/preprints202306.0010.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: SARS-CoV-2; dashboard; genomic variants; software; pipeline; virus genome assemblies; knowledge base
Online: 1 June 2023 (03:16:09 CEST)
Background: The outbreak of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) resulted in the global COVID-19 pandemic. The urgency for an effective SARS-CoV-2 vaccine has led to the development of a first series of vaccines at unprecedented speed. The discovery of SARS-CoV-2 spike-glycoprotein mutants, however, and consequently the potential to escape vaccine-induced protection and increased infectivity, demonstrates the persisting importance of monitoring SARS-CoV-2 mutations to enable early detection and tracking of genomic variants of concern. Results: We developed the CoVigator tool with three components: 1) a knowledge base that collects new SARS-CoV-2 genomic data, processes it and stores its results; 2) a comprehensive variant calling pipeline; 3) an interactive dashboard highlighting the most relevant findings. The knowledge base routinely downloads and processes virus genome assemblies or raw sequencing data from the COVID-19 Data Portal (C19DP) and the European Nucleotide Archive (ENA), respectively. The results of the variant calling pipeline are visualized through the dashboard in the form of tables and customizable graphs, making it a versatile tool for tracking SARS-CoV-2 variants. We put a special emphasis on the identification of intrahost mutations and make available to the community what is, to the best of our knowledge, the largest dataset on SARS-CoV-2 intrahost mutations. In the spirit of open data, all CoVigator results are available for download. The CoVigator dashboard is accessible via covigator.tron-mainz.de. Conclusion: With increasing demand worldwide for genome surveillance to track the spread of SARS-CoV-2, CoVigator will be a valuable resource providing an up-to-date list of mutations, which can be incorporated into global efforts to sustainably prevent or treat infections.
ARTICLE | doi:10.20944/preprints202110.0332.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: iPReditor-CMG; RNA editing site; Mitochondrial genomes; genomic sequence feature; support vector machine
Online: 22 October 2021 (15:11:40 CEST)
Cytosine (C) to uracil (U) RNA editing is one of the most important post-transcriptional processes; however, efficiently exploring C-to-U editing events within crop mitochondrial genomes remains a challenge. An improved predictive RNA editor for crop mitochondrial genomes, iPReditor-CMG, is proposed, based on a support vector machine (SVM), three common crop mitochondrial genomes and self-sequenced tobacco mitochondrial ATPase. After multi-combination feature extraction, high-dimension feature screening and multi-test independent prediction, the results showed that the average accuracy of intraspecific prediction was 0.85, with the highest value reaching 0.91, which outperformed previous reference models. In contrast, the prediction accuracies were 0.78 between dicotyledons and no more than 0.56 between dicotyledons and monocotyledons, implying a possible similarity in C-to-U editing mechanisms among close relatives. The best model was finally identified with an independent test accuracy of 0.91 and an area under the curve of 0.88, and further suggested that five unreported feature sequences, TGACA, ACAAC, GTAGA, CCGTT and TAACA, were closely associated with the editing phenomenon. Multiple evaluation findings supported that iPReditor-CMG could be effectively applied to predict crop mitochondrial editing sites, which may contribute to insight into their recognition mechanisms and even other post-transcriptional events in crop mitochondria.
REVIEW | doi:10.20944/preprints201805.0416.v1
Subject: Biology And Life Sciences, Virology Keywords: bacteriophage; Tevenvirinae; radiolabeling; genomic map; 2-D polyacrylamide gel electrophoresis; NEPHGE-SDS PAGE
Online: 29 May 2018 (08:31:59 CEST)
The mechanisms by which bacteriophage T4 converts the metabolism of its E. coli host to one dedicated to progeny phage production were the subject of decades of intense research in many labs from the 1950s through the 1980s. At this point, a wide range of phages are starting to be used therapeutically and in many other applications, and the range of available phage sequence data is skyrocketing. It is thus important to re-explore the extensive available data about the intricacies of the T4 infection process as summarized here, expand it to look much more broadly at other genera of phages, and explore phage infections using newly available modern techniques and a range of appropriate environmental conditions.
CONCEPT PAPER | doi:10.20944/preprints202310.0038.v1
Subject: Biology And Life Sciences, Life Sciences Keywords: candidate processes; disease; genomic variations; knowledge; medical genomics; ontology; pathways; prioritization; systems analysis; variome
Online: 2 October 2023 (04:26:02 CEST)
Understanding the genetic architecture of a disease is crucial for the development of valid diagnostic and therapeutic interventions. The analysis of genomic variations associated with pathological conditions is the starting point for uncovering disease-causing pathways (candidate processes). However, the complexity of intergenic and genetic-environmental interactions hinders the identification of the pathogenic value of genomic changes. Furthermore, heredity, epigenetics and somatic mosaicism make the interpretation of genomic data even more sophisticated. To succeed, a variety of bioinformatic techniques are applied. Here, reviewing our own and literature data, we describe knowledge-based prioritization of genomic variations. The theoretical basis of knowledge-based prioritization is given with special regard to gene ontology, heuristics, hermeneutics (genomic hermeneutics) and analytics. Practical and methodological issues of prioritization using ontology- or pathway-based systems analysis are considered in the light of optimistic and realistic scenarios of cumulative phenotypic effects of the variome (the whole set of genomic variations in an individual or a specific set of genomic variations for a phenotypic outcome). In the present communication, copy number variants (CNVs) in children with neurodevelopmental diseases are used as a practical foundation for the prioritization, inasmuch as these genome variations are systematically overlooked in the so-called NGS era. Nonetheless, it is highly likely that the prioritization is applicable to almost all types of genomic variations (e.g. chromosome abnormalities, gene mutations, functional synonymous variants etc.). The present methodology seems to be a valuable addition to current biomedical science, widening the opportunities for medical genomics and genetics.
REVIEW | doi:10.20944/preprints202309.1224.v1
Subject: Biology And Life Sciences, Virology Keywords: Influenza A virus; avian influenza virus; genomic surveillance; poultry farms; wild birds; HPAI H5N1.
Online: 19 September 2023 (03:54:36 CEST)
The Influenza A virus (IAV) is a highly infectious virus that poses a significant threat to global public health and food supplies. The current subtype of avian influenza virus (AIV), H5N1, is being closely monitored worldwide due to its unprecedented spread from Europe to North America and now to Central and South America. This review summarizes recent updates on the evolution of the different IAV subtypes in birds and mammals including humans, in Chile. The distribution and spread of AIV H5N1 in Chile indicated a complex interplay between ecological and human factors in that it was negatively correlated with distance to the closest urban center and precipitation and temperature seasonality. It is evident that highly pathogenic avian influenza (HPAI) H5N1 in Chile was introduced from North America via the Atlantic migratory flyways as opposed to local transmission from other countries in South America. The presence of these viruses in Chile underscores the need for increased biosecurity on poultry farms and continuous genomic surveillance approaches to understand and control AIVs in both wild and domestic bird populations in Chile.
ARTICLE | doi:10.20944/preprints202211.0059.v1
Subject: Biology And Life Sciences, Virology Keywords: COVID-19; SARS-CoV-2; Whole Genome Sequencing; Genomic epidemiology; West Africa, Burkina Faso
Online: 2 November 2022 (11:14:27 CET)
Background: After its initial detection in Wuhan, China, in December 2019, SARS-CoV-2 has spread rapidly, causing successive epidemic waves worldwide. This study aims to provide a genomic epidemiology of SARS-CoV-2 in Burkina Faso. Methods: Three hundred and seventy-seven SARS-CoV-2 genomes obtained from PCR-positive nasopharyngeal samples (PCR cycle threshold score <35) collected between May 5, 2020, and January 31, 2022 were analysed. Genomic sequences were assigned to phylogenetic clades using NextClade and to Pango lineages using pangolin. Phylogenetic and phylogeographic analyses were performed to determine the geographical sources and time of virus introduction in Burkina Faso. Results: The analyzed SARS-CoV-2 genomes could be assigned to 10 phylogenetic clades and 27 Pango lineages already described worldwide. Our analyses revealed the important role of cross-border human mobility in the successive SARS-CoV-2 introductions in Burkina Faso from neighboring countries. Conclusion: This study provides additional insights into the genomic epidemiology of SARS-CoV-2 in West Africa. It highlights the importance of land travel in the spread of the virus and the need to rapidly implement preventive policies. Regional cross-border collaborations and the adherence of the general population to government policies are key to prevent new epidemic waves.
ARTICLE | doi:10.20944/preprints201810.0505.v1
Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: Mucilaginibacter rubeus; Mucilaginibacter kameinonensis; genomic island; evolution; heavy metal resistance; draft genome sequence; CTnDOT
Online: 22 October 2018 (15:21:18 CEST)
Heavy metals are compounds that can be hazardous and impair growth of living organisms. Bacteria have evolved the capability not only to cope with heavy metals but also to detoxify polluted environments. Three heavy metal-resistant strains of Mucilaginibacter rubeus and one of Mucilaginibacter kameinonensis were isolated from the gold/copper Zijin mining site, Longyan, Fujian, China. These strains were shown to exhibit high resistance to heavy metals, with minimal inhibitory concentrations reaching up to 3.5 mM Cu(II), 21 mM Zn(II), 1.2 mM Cd(II), and 10.0 mM As(III). Genomes of the four strains were sequenced by Illumina. Sequence analyses revealed the presence of a high abundance of heavy metal resistance (HMR) determinants. One of the strains, M. rubeus P2, carried genes encoding 6 putative P1B-1-ATPases, 5 putative P1B-3-ATPases and 4 putative Zn(II)/Cd(II) P1B-4 type ATPases, and 16 putative RND-type metal transporter systems. Moreover, the four genomes carry a high abundance of genes coding for putative metal binding chaperones. Analysis of the close vicinity of these HMR determinants uncovered the presence of clusters of genes potentially associated with mobile genetic elements. These loci include genes coding for tyrosine recombinases (integrases) and subunits of the mating pore (type 4 secretion system), respectively allowing integration/excision and conjugative transfer of numerous genomic islands. Further in silico analyses revealed that their genetic organization and gene products resemble the Bacteroides integrative and conjugative element CTnDOT. These results highlight the pivotal role of genomic islands in the acquisition and dissemination of adaptive traits, allowing for rapid adaptation of bacteria and colonization of hostile environments.
ARTICLE | doi:10.20944/preprints201805.0463.v1
Subject: Medicine And Pharmacology, Other Keywords: undiagnosed rare diseases; diagnostic odyssey; NGS; deep phenotyping; genomic matchmaking; secondary findings; patient involvement
Online: 31 May 2018 (09:35:32 CEST)
The time required to reach a correct diagnosis is one of the most important problems for rare disease (RD) patients. Diagnostic delay can be intolerably long, to the point that it is usually described as a “diagnostic odyssey” and, sometimes, a diagnosis might never occur. The International Rare Disease Research Consortium proposed, as its ultimate goal for 2017-2027, to enable all people with a suspected RD to be diagnosed within one year if the disorder is known, and to enter a globally coordinated diagnostic and research pipeline for the unsolved cases. In-depth analysis of the genotype through next generation sequencing, together with a standardized in-depth phenotype description and sophisticated high-throughput approaches, has been applied as a diagnostic tool to increase the chance of a timely and accurate diagnosis. This approach is very fruitful as, according to the Orphanet database, from 2010 to March 2017 more than 600 new RDs have been described and about 3600 genes linked to RDs have been identified. However, the combination of -omics and phenotype data and the international sharing of this information raise ethical concerns. Values to be assessed include not only patient autonomy but also family implications, beneficence, non-maleficence, justice, solidarity and reciprocity, which must be respected and promoted and, at the same time, balanced against each other. In this work we suggest that, to maximise patients' involvement in the search for a diagnosis and identification of new causative genes, undiagnosed patients should have the possibility to: 1) actively participate in the description of their phenotype; 2) choose the level of visibility of their profile in matchmaking databases; 3) express their preferences regarding return of new findings, in particular which level of significance of a Variant of Unknown Significance (VUS) should be considered relevant to them.
The quality of the relation between individual patients and physicians, and between the patient community and the scientific community is critically important for making the best use of available data and combining efforts in the search for matches with similar cases worldwide that will help to solve unsolved cases. The contribution of patients to collecting and coding data comprehensively is critical for efficient use of data downstream of data collection.
ARTICLE | doi:10.20944/preprints202106.0558.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: Chronic obstructive pulmonary disease (COPD); Pulmonary Rehabilitation; Vegetables; DNA damage; Genomic Instability; Oxidative stress; Inflammation.
Online: 23 June 2021 (10:08:20 CEST)
Chronic obstructive pulmonary disease (COPD) is a respiratory disease associated with airways inflammation and lung parenchyma fibrosis. The primary goals of COPD treatment are to reduce symptoms and risk of exacerbations, therefore pulmonary rehabilitation is considered the key component of managing COPD patients. Oxidative airway damage, inflammation and reduction of endogenous antioxidant enzymes are known to play a crucial role in the pathogenesis of COPD. Natural antioxidants have also recently been considered as they play an important role in metabolism, DNA repair and fighting the effects of oxidative stress. In this paper we evaluated the response of 105 elderly COPD patients to pulmonary rehabilitation (PR), based on high or low vegetable consumption, by analyzing clinical parameters and biological measurements at baseline and after completion of the three weeks PR. We found that high vegetable intake in normal diet, without any specific intervention, can increase the probability to successfully respond to rehabilitation (65.4% of responders ate vegetables daily vs. 40.0% of Non-Responders, p=0.033). Three weeks of pulmonary rehabilitation are probably too short to reveal a reduction of the oxidative stress and DNA damage, but are enough to show an improvement in the patient's inflammatory state.
ARTICLE | doi:10.20944/preprints202106.0112.v1
Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Chronic obstructive pulmonary disease (COPD); Pulmonary Rehabilitation; Diet; DNA damage; Genomic Instability; Oxidative stress; Inflammation
Online: 3 June 2021 (12:21:13 CEST)
Chronic obstructive pulmonary disease (COPD) is a respiratory disease associated with airways inflammation and lung parenchyma fibrosis. The primary goals of COPD treatment are to reduce symptoms and risk of exacerbations, therefore pulmonary rehabilitation is considered the key component of managing COPD patients. Oxidative airway damage, inflammation and reduction of endogenous antioxidant enzymes are known to play a crucial role in the pathogenesis of COPD. Natural antioxidants have also recently been considered as they play an important role in metabolism, DNA repair and fighting the effects of oxidative stress. In this paper we evaluated the response of 105 elderly COPD patients to pulmonary rehabilitation (PR), based on high or low vegetable consumption, by analyzing clinical parameters and biological measurements at baseline and after completion of the three weeks PR. We found that high vegetable intake in normal diet, without any specific intervention, can increase the probability to successfully respond to rehabilitation (65.4% of responders ate vegetables daily vs. 40.0% of Non-Responders, p=0.033). Three weeks of pulmonary rehabilitation are probably too short to reveal a reduction of the oxidative stress and DNA damage, but are enough to show an improvement in the patient's inflammatory state.
ARTICLE | doi:10.20944/preprints202010.0131.v1
Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: ALG13; apoptosis; cell cycle; chemokine signaling; FAM27C; genomic medicine; oxidative phosphorylation; TASOR; VEGF signaling; VHL.
Online: 6 October 2020 (14:47:30 CEST)
Published transcriptomic data from surgically removed metastatic clear cell renal cell carcinoma (ccRCC) samples were re-analyzed from the Genomic Fabric Perspective that considers the transcriptome a multi-dimensional mathematical object, constrained by a dynamic set of expression correlations among genes. Every gene in the chest wall metastasis (MET), two primary tumors (PTA, PTB) and the surrounding normal tissue (NOR) of the right kidney was characterized by three independent measures: average expression level (AVE), relative expression variation (REV) and expression correlation (COR) with each other gene. AVE was used to determine the regulation of the genomic fabrics of ccRCC, apoptosis, chemokine and VEGF signaling pathways. REV quantified the alteration of the control of transcript abundances, while COR determined the remodeling of the transcriptomic networks of chemokine signaling and oxidative phosphorylation genes. The gene hierarchy was established based on Gene Commanding Height, and the Gene Master Regulators (GMR) TASOR (PTA), FAM27C (PTB), ALG13 (MET) and DAPK3 (NOR) were identified in each profiled region. We predict that TASOR overexpression would block transcription in PTA but not in PTB, while slightly stimulating it in NOR. Silencing of ALG3 would slow down the cell cycle in all three cancer regions with practically no effect in NOR.
COMMUNICATION | doi:10.20944/preprints202004.0253.v1
Subject: Biology And Life Sciences, Virology Keywords: Genomic Epidemiology; GenomeTrakr; microbial pathogen surveillance, NCBI submission; whole genome sequencing; QA/QC; One Health
Online: 16 April 2020 (05:26:42 CEST)
The holistic approach of One Health, which sees human, animal, plant, and environmental health as a unit, rather than discrete parts, requires not only interdisciplinary cooperation, but standardized methods for communicating and archiving data, enabling participants to easily share what they have learned and allow others to build upon their findings. Ongoing work by NCBI and the GenomeTrakr project illustrates how open data platforms can help meet the needs of federal and state regulators, public health laboratories, departments of agriculture, and universities. Here we describe how microbial pathogen surveillance can be transformed by having an open access database along with Best Practices for contributors to follow. First, we describe the open pathogen surveillance framework, hosted on the NCBI platform. We cover the current community standards for WGS quality, provide an SOP for assessing your own sequence quality and recommend QC thresholds for all submitters to follow. We then provide an overview of NCBI data submission along with step by step details. And finally, we provide curation guidance and an SOP for keeping your public data current within the database. These Best Practices can be models for other open data projects, thereby advancing the One Health goals of Findable, Accessible, Interoperable and Re-usable (FAIR) data.
ARTICLE | doi:10.20944/preprints201908.0288.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: array-comparative genomic; gliomas; Cell culture; Cancer genomics; Cancer Transcriptomics; brain tumors; cell line; glioblastoma
Online: 27 August 2019 (16:34:22 CEST)
Cancer cell lines are widely used as in vitro models of tumorigenesis, facilitating fundamental discoveries in cancer biology and translational medicine. Currently, there are few options for glioblastoma (GBM) treatment and limited in vitro models with accurate genomic and transcriptomic characterization. Here, a detailed characterization of a new GBM cell line, namely AHOL1, was conducted in order to fully characterize its molecular composition based on its copy number alteration (CNA) and transcriptome profiling, followed by the validation of key elements associated with GBM tumorigenesis. Large numbers of CNAs and differentially expressed genes (DEGs) were identified. CNAs were distributed throughout the genome, including gains at Xq11.1-q28, Xp22.33-p11.1, Xq21.1-q21.33, 4p15.1-p14, 8q23.2-q23.3 and losses at Yq11.21-q12, Yp11.31-p11.2 and 12p13.31 positions. Nine druggable genes were identified, including HCRTR2, ETV1, PTPRD, PRKX, STS, RPS6KA6, ZFY, USP9Y and KDM5D. By integrating DEGs and CNAs, we identified 57 overlapping genes enriched in fourteen pathways. Altered expression of several cancer-related candidates found in the DEGs-CNA dataset was confirmed by RT-qPCR. Taken together, this first comprehensive genomic and transcriptomic landscape of AHOL1 provides unique resources for further studies and identifies several druggable targets that may be useful for therapeutics and biologic and molecular investigation of GBM.
ARTICLE | doi:10.20944/preprints202201.0348.v1
Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: Data Science; Genomic Data Science; Machine Learning; Network Analysis; RNA-Seq; Precision Medicine; Subtyping; Parkinson’s Disease
Online: 24 January 2022 (11:36:51 CET)
Precision medicine emphasizes fine-grained diagnostics, taking individual variability into account to enhance treatment effectiveness. Parkinson's Disease (PD) heterogeneity among individuals is a proof that disease subtypes exist, and assigning individuals to subgroups is necessary for a better understanding of disease mechanisms and designing precise treatment approaches. The purpose of this study was to identify PD subtypes using RNA-Seq data in a combined pipeline including unsupervised machine learning, bioinformatics, and network analysis. 210 post mortem brain RNA-Seq samples from PD (n = 115) and Normal Controls (NC, n = 95) were obtained with a systematic data retrieval following PRISMA statements and a fully data-driven clustering pipeline was performed to identify PD subtypes. Bioinformatics and Network analyses were performed to characterize the disease mechanisms of the identified PD subtypes and to identify target genes for drug repurposing. Two PD clusters were identified and 42 DEGs were found (p.adjusted ≤ 0.01). PD clusters had significantly different gene network structures (p < 0.0001) and phenotype-specific disease mechanisms, highlighting the differential involvement of the Wnt/β-catenin pathway regulating adult neurogenesis. NEUROD1 was identified as a key regulator of gene networks and ISX9 and PD98059 were identified as NEUROD1-interacting compounds with disease-modifying potential, reducing the effects of dopaminergic neurodegeneration. This hybrid data analysis approach could enable precision medicine applications by providing insights for the identification and characterization of pathological subtypes. This workflow has proven useful on PD brain RNA-Seq, but its application to other neurodegenerative diseases is encouraged.
BRIEF REPORT | doi:10.20944/preprints202004.0024.v1
Subject: Biology And Life Sciences, Virology Keywords: COVID-19; SARS-nCoV-2; vaccine; antibody; immune escape; variant; spike protein; genomic drift; convalescent plasma
Online: 3 April 2020 (04:24:52 CEST)
New coronavirus (SARS-CoV-2) treatments and vaccines are under development to combat the COVID-19 disease. Several approaches are being used by scientists for investigation including 1) various small molecule approaches targeting RNA polymerase, 3C-like protease, and RNA endonuclease and 2) exploration of antibodies obtained from convalescent plasma from patients who have recovered from COVID-19. The coronavirus genome is highly prone to mutations that lead to genetic drift and escape from immune recognition; thus, it is imperative that sub-strains with different mutations are also accounted for during vaccine development. As the disease has grown to become a pandemic, new B-cell and T-cell epitopes predicted from SARS coronavirus have been reported. Using the epitope information along with variants of the virus, we have found several variants which might cause drifts. Among such variants, 23403A>G variant (p.D614G) in spike protein B-cell epitope is observed frequently in European countries such as the Netherlands, Switzerland and France.
REVIEW | doi:10.20944/preprints201911.0085.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: mendelian disease; diagnostics; variant interpretation; variant prioritization; rna splicing; bioinformatics; machine learning; genomic medicine; effect prediction
Online: 8 November 2019 (04:07:16 CET)
Defects in pre-mRNA splicing are frequently a cause of Mendelian disease. Despite the advent of next-generation sequencing, allowing a deeper insight into a patient’s variant landscape, the ability to characterize variants causing splicing defects has not progressed with the same speed. To address this, recent years have seen a sharp spike in the number of splice prediction tools leveraging machine learning approaches, leaving clinical geneticists with a plethora of choices for in silico analysis. In this Review, some basic principles of machine learning are introduced in the context of genomics and splicing analysis. A critical comparative approach is then used to describe seven recent machine learning-based splice prediction tools, revealing highly diverse approaches and common caveats. We find that, although great progress has been made in producing specific and sensitive tools, there is still much scope for personalized approaches to prediction of variant impact on splicing. Such approaches may increase diagnostic yields and underpin improvements to patient care.
ARTICLE | doi:10.20944/preprints202203.0100.v2
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: genomic DNA; probability; matrices; tensor product; Hadamard product; antenna arrays; photonic crystals; liquid crystals; biophotonics; quantum informatics
Online: 26 April 2022 (10:30:56 CEST)
The article continues the author's publications about the matrix-tensor study of universal rules of stochastic (probabilistic) organization of long single-stranded DNA sequences in eukaryotic and prokaryotic genomes. The author reveals that the corresponding matrices of probabilities of n-plets in n-textual representations of each genomic DNA are numerically interrelated with each other in an algebraic form that has analogies with formalisms of the known tensor-matrix theory of digital antenna arrays. These arrays combine many separate antennas into a single coordinated ensemble with unique emergent properties, due to which antenna arrays are widely used in devices of medicine, astrophysics, avionics, etc. The noted analogies allow putting forward the author's hypothesis that the stochastic organization of genomic DNAs is connected with bio-antenna arrays. From the point of view of this hypothesis, many known facts about using principles of antenna arrays in inherited physiological phenomena are collected in a single grouping with genomic DNAs. This new topic about the biological meaning of profitable properties of antenna arrays includes problems of biological evolution, the origin of the genetic code, regenerative medicine, and the development of algebraic biology. These issues are discussed jointly with the author's results of quantum information analysis of stochastic features of genomic DNAs.
REVIEW | doi:10.20944/preprints202308.1688.v1
Subject: Biology And Life Sciences, Cell And Developmental Biology Keywords: cancer; asymmetric and symmetric cell cycling (ACD, SCD); loss of function; DNA-damaged cells; genomic repair; reprogramming; oncogenesis
Online: 24 August 2023 (09:43:20 CEST)
The life cycle of cancer follows the life cycle of the common ancestor of amoebozoans, metazoans, and fungi (AMF) and its systemic germline, which serves as a blueprint for all germlines capable of asymmetric cell division (ACD) and stem cell differentiation. Consequently, the oxygen sensitivity of the ancient non-gametogenic germline (Urgermline) was inherited by all germ and stem cell lines including the cancer germline. They all respond to Urgermline’s hyperoxia with loss of stemness and ACD ability and a dysregulated phenotype with irreparable DNA defects and defective symmetric cell divisions (DSCD). In protists, defective DSCD cells undergo an ancient MGRS repair program involving cell and nuclear fusion and hyperploid giant nuclei that restores the damaged genome to its former pre-DSCD state, with ACD potential and stemness. Human and metazoan DSCD use the same MGRS repair program inherited from the AMF ancestor. Ectopic DSCDs and DSCD-like phenotypes can survive in humans for many years in suitable niches. Under favorable environmental conditions, they also have access to the ancient MGRS repair mechanism including the ancient gene regulatory network (aGRN) and all other AMF genes. The aGRN takes control of cancer’s hybrid genome and represses human genes. It installs a G+S cancer life cycle of AMF imprinting that shapes the differentiation of naïve cancer stem cells (CSCs). CSCs are deeply homologous to the aGSCs of the AMF ancestor. Reprogramming of the DSCD genome by MGRS paves the way for oncogenesis. In this light, cancer is not a mutational or genetic disease, but a non-mutational genome-altering disease.
COMMUNICATION | doi:10.20944/preprints202011.0720.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: Papillary thyroid cancer; noninvasive follicular thyroid neoplasm with papillary-like nuclear features; follicular adenoma; telomere-related genomic instability
Online: 30 November 2020 (11:37:22 CET)
Papillary thyroid carcinoma (PTC) has two main histologic variants: classical-PTC (CL-PTC) and follicular variant PTC (FV-PTC). Recently, due to its similar features to benign lesions, the encapsulated FV-PTC variant was reclassified as noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP). Nonetheless, specific molecular signatures are not yet available. It is well known that telomere-related genome instability is caused by inappropriate DNA repair of dysfunctional telomeres and that mechanisms involved in the repair of damaged telomeres may lead to detrimental outcomes, altering the 3D nuclear telomere and genome organization in cancer cells. This pilot study aimed to evaluate whether a specific nuclear telomere architecture might characterize NIFTP, potentially distinguishing it from other PTC histologic variants. Our findings demonstrate that the 3D telomere profiles of CL-PTC and FV-PTC were different from that of NIFTP and that NIFTP more closely resembles follicular thyroid adenoma (FTA). NIFTP has longer telomeres than CL-PTC and FV-PTC samples, and telomere length overlaps in NIFTP and FTA. There was no association between BRAF expression and telomere length in any of the tested samples. Our data, showing that 3D nuclear telomere organization is altered differently in thyroid cancer variants, suggest that this parameter might guide clinical management of NIFTP. Although further investigations in a larger cohort of patients are necessary to corroborate our observations, telomere-related genomic instability might be of value in the diagnosis of NIFTP and allow for a more appropriate selection of the correct treatment.
ARTICLE | doi:10.20944/preprints201803.0059.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: polybrominated diphenyl ether; PBDE; BDE-47; adipose; transcriptomic; genomic; obesogen; complement and coagulation cascade; de novo lipogenesis; metabolism
Online: 8 March 2018 (03:10:13 CET)
For the majority of lipophilic compounds, adipose tissue is traditionally considered a storage depot and only rarely a target organ. Meanwhile, abnormalities in adipose tissue physiology induced by chemical exposures may contribute to the current epidemic of obesity and metabolic diseases. Polybrominated diphenyl ethers (PBDEs) are a group of lipophilic flame retardants found in the majority of human samples in North America. Their ability to alter the physiology of adipose tissue is unknown. We exposed pregnant mice to 0.2 mg/kg body weight/day of BDE-47 perinatally. Transcriptomic changes in gonadal adipose tissue were analyzed in male offspring using an RNA-seq approach with subsequent bioinformatic analysis. Genes of the coagulation and complement cascade, de novo lipogenesis, and xenobiotic metabolism were altered in expression in response to BDE-47 exposure. The affected molecular network included the following hubs: PPARα, HNF1A and HNF4. These findings suggest that adipose tissue should be considered a target tissue for BDE-47, in addition to its role as a storage depot. This study also builds a background for a targeted search of sensitive phenotypic endpoints of BDE-47 exposure, including lipid profile parameters and coagulation factors in circulation. Additional studies are needed to investigate the role of PBDEs as obesogens.
REVIEW | doi:10.20944/preprints202209.0238.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Free-living amoebas; FLA; genotypes; molecular epidemiology; Genomic epidemiology; Balamuthia mandrillaris; Naegleria fowleri; Acanthamoeba spp.; Vermamoeba vermiformis; Sappinia pedata
Online: 16 September 2022 (07:09:01 CEST)
Free-living amoebae (FLA) are protozoa widely distributed in the environment, found in a great diversity of terrestrial biomes. However, few genera of FLA are linked to human infections. Among these genera, Acanthamoeba spp. are classified by genotype (T1-T23): genotypes T1, T2, T4, T5, T10, T12, and T18 are capable of causing granulomatous amoebic encephalitis (GAE), mostly in immunocompromised patients, while Acanthamoeba keratitis in apparently healthy patients is related to genotypes T2, T3, T4, T5, T6, T10, T11, T12 and T15. Meanwhile, Naegleria fowleri is the causative agent of an acute infection called primary amoebic meningoencephalitis (PAM), while Balamuthia mandrillaris, like some Acanthamoeba genotypes, causes GAE, but differs from them in that numerous cases have been described in immunocompetent patients. Finally, other FLA related to the pathologies mentioned above have been reported: Sappinia pedata is responsible for one case of amoebic encephalitis, and Vermamoeba vermiformis has been found in cases of ocular damage and is notable for its extraordinary capacity as an endo-cytobiont for microorganisms of public health importance such as Legionella pneumophila, Bacillus anthracis, and Pseudomonas aeruginosa, among others. In this review, issues related to the epidemiology of each one are addressed, updating their geographic distribution and cases reported in recent years for pathogenic FLA.
ARTICLE | doi:10.20944/preprints202003.0098.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: genetic variation; brine shrimp Artemia; invasive species; mt-DNA COI; Inter-Simple Sequence Repeats (ISSRs) genomic fingerprinting; Western Asia
Online: 6 March 2020 (02:43:06 CET)
Due to rapid developments in the aquaculture industry, Artemia franciscana, originally an American species, has been intentionally introduced to Eurasia, Africa and Australia. In the present study, we used a partial sequence of the mitochondrial DNA Cytochrome Oxidase subunit I (mt-DNA COI) gene and genomic fingerprinting by Inter-Simple Sequence Repeats (ISSRs) to determine the genetic variability and population structure of Artemia populations (indigenous and introduced) from 14 different geographical locations in Western Asia. Based on the haplotype spanning network, Artemia urmiana exhibited higher genetic variation than native parthenogenetic populations. Although A. urmiana represented a completely private haplotype distribution, no apparent genetic structure was recognized among the native parthenogenetic and invasive A. franciscana populations. Our ISSR findings show that, although invasive populations have lower variation than the source population in Great Salt Lake (Utah, USA), they have significantly higher genetic variability compared to the native populations in Western Asia. According to the ISSR results, the native populations were not fully differentiated by the PCoA analysis, but the exotic A. franciscana populations were geographically divided into four genetic groups. We believe that during colonization, invasive populations have experienced substantial genetic divergence under new ecological conditions in the non-indigenous regions.
ARTICLE | doi:10.20944/preprints202305.1165.v1
Subject: Biology And Life Sciences, Ecology, Evolution, Behavior And Systematics Keywords: non-destructive DNA sampling; DNA collection methods; Louisiana Pigtoe; visceral swabbing; freshwater mussels; Fusconaia askewi; genotyping-by-sequencing; population genetic structure; genomic coverage; sequencing depth
Online: 16 May 2023 (14:15:33 CEST)
Limiting harm to organisms via genetic sampling is an important consideration for rare species. Nondestructive sampling techniques have been developed to address this issue in freshwater mussels. Two methods, visceral swabbing and tissue biopsies, have proven to be effective for DNA sampling, though it is unclear as to which method is preferable for genotyping-by-sequencing (GBS). Tissue biopsies may cause undue stress and damage to organisms, while visceral swabbing potentially reduces the chance of such harm. Our study compared the efficacy of these two DNA sampling methods for generating GBS data for the Unionid freshwater mussel, Texas Pigtoe (Fusconaia askewi). Our results find both methods generate quality sequence data, though some considerations are in order. Tissue biopsies produced significantly higher DNA concentrations and larger numbers of reads when compared to swabs, though there was no significant association between starting DNA concentration and number of reads generated. Swabbing produced greater sequence depth (more reads per sequence) while tissue biopsies revealed greater coverage across the genome (at lower sequence depth). Patterns of genomic variation as characterized in principal component analyses were similar regardless of the sampling method, suggesting that the less invasive swabbing is a viable option for producing quality GBS data in these organisms.
ARTICLE | doi:10.20944/preprints201711.0008.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: WNT pathway; porcupine inhibitor ETC-1922159; sensitivity analysis; colorectal cancer; unknown biological hypotheses; combinatorial search space; support vector ranking; DNA repair and genomic stability factor RAD51
Online: 1 November 2017 (05:08:04 CET)
DNA repair helps maintain the proper and healthy functioning of cells in the human body. Failure of the DNA repair process can lead to aberrations as well as tumorous stages. DNA can undergo various types of damage, one of which is the DNA double strand break (DSB), which can be repaired via homologous recombination (HR). RAD51 plays a central role in HR and has been implicated as a negative/poor prognostic marker for colorectal adenocarcinoma, with high expression in colorectal cancer. Mechanistically, RAD51AP1 facilitates RAD51 during the repair process by binding with RAD51 via two DNA binding sites, thus helping in D-loop formation in the HR process. Often, in biology, we are faced with the problem of exploring relevant unknown biological hypotheses in the form of myriad combinations of factors that might be affecting the pathway under certain conditions. For example, RAD51AP1-XRCC2 is one such 2nd order combination whose relation needs to be tested under the influence of the recently developed Porcupine-WNT inhibitor ETC-1922159. The x-ray repair cross complementing (XRCC) family is known to work as a mediator or stabilizer for RAD51 during the HR process. The inhibitor is known to suppress Porcupine and thus inhibit a range of oncogenes known to be directly or indirectly affected by the Wnts. In a recent unpublished work in bioRxiv, we had the opportunity to rank these unknown biological hypotheses for down regulated genes at the 2nd order level after the drug was administered. The in silico observations showed that the combination of RAD51AP1-XRCC2 was assigned a relatively lower rank, thus validating the pipeline's efficacy against confirmed wet lab experiments that indicate that both RAD51AP1 and XRCC2 were down regulated after treatment in cancer cells.
Here, we take one step further with an in silico analysis of the 3rd order combinations RAD51-X-X & RAD51AP1-X-X (X can be a known or unknown factor), from a range of 100 randomly picked genes down regulated after ETC-1922159 treatment. The pipeline uses the density based HSIC (Hilbert-Schmidt Independence Criterion) sensitivity index with an rbf (radial basis function) kernel, which is known to be highly effective in sensitivity analysis. Various unknown/unexplored/untested RAD51/RAD51AP1 related 3rd order biological hypotheses emerge, some of which are confirmed in wet lab, while others need to be tested.
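As an illustration of the HSIC sensitivity index with an rbf kernel mentioned above, the following is a minimal sketch of the standard biased empirical HSIC estimator, trace(KHLH)/(n-1)^2, for two scalar variables. It is not the authors' pipeline: the function names (`rbf_gram`, `center`, `hsic`), the scalar inputs, and the default kernel widths are assumptions made for this example only.

```python
import math


def rbf_gram(values, sigma):
    """Gram matrix of the rbf kernel exp(-(a - b)^2 / (2 sigma^2)) for scalars."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def center(K):
    """Double-center a symmetric Gram matrix, i.e. compute H K H with H = I - 11^T/n."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples x and y (hypothetical helper).

    Larger values indicate stronger statistical dependence; the value is
    close to zero when x and y are independent.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired samples with at least 2 points")
    Kc = center(rbf_gram(x, sigma_x))
    L = rbf_gram(y, sigma_y)
    # trace(K H L H) equals the elementwise sum of (H K H) * L for symmetric L.
    total = sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2
```

In a sensitivity-analysis setting, `x` would hold the values of one input factor (or a combination of factors) and `y` the model output; ranking factors by `hsic(x, y)` is one common way to order them, though the ranking scheme used in the preprint (support vector ranking) is a separate step not shown here.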
ARTICLE | doi:10.20944/preprints201812.0185.v1
Subject: Medicine And Pharmacology, Obstetrics And Gynaecology Keywords: Fluorescence in situ hybridization (FISH), Karyotype, array comparative genomic hybridization (aCGH), amniotic fluid (AF), chorionic villus sampling (CVS), aneuploidies, pathogenic copy number variants (pCNV), confined placental mosaicism (CPM), true fetal mosaicism (TFM), pseudo-mosaicism.
Online: 17 December 2018 (09:58:43 CET)
Current prenatal genetic evaluation has shown a significant increase in non-invasive screening and a reduction of invasive diagnostic procedures. To evaluate the diagnostic efficacy in detecting common aneuploidies, structural chromosomal rearrangements and pathogenic copy number variants (pCNV), we performed a retrospective analysis of a case series initially analyzed by AneuVysion fluorescence in situ hybridization (FISH) and karyotyping, then followed by array comparative genomic hybridization (aCGH). Of the 386 cases retrieved from the past decade, common aneuploidies were detected in 137 cases (35.5%), other chromosomal structural rearrangements were detected in four cases (1%), and pCNV were detected in five cases (1.3%). The relative frequencies for common aneuploidies suggested an underdetection of sex chromosome aneuploidies. Approximately 9.5% of cases with common aneuploidies showed a mosaic pattern. Inconsistent results between FISH and karyotyping were noted in cases with pseudo-mosaicism introduced by culture artifact or variable cellular proliferation from cells with mosaic karyotypic complements under in vitro cell culture. Based on findings from this case series, cell-based FISH and karyotyping should be performed to detect common aneuploidies, structural chromosomal abnormalities, and mosaic patterns. DNA-based aCGH and reflex FISH should be performed to detect and confirm genomic imbalances and pCNV. Practice points to ensure diagnostic accuracy and efficacy are summarized.
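The detection rates quoted above follow directly from the case counts. As a small arithmetic check, the sketch below recomputes them from the numbers stated in the abstract (386 cases; 137, 4 and 5 positives); the helper name `detection_rate` is hypothetical and introduced only for this example.

```python
def detection_rate(positives, total, digits=1):
    """Return positives/total as a percentage rounded to `digits` decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * positives / total, digits)


TOTAL_CASES = 386  # case series size stated in the abstract

rates = {
    "common aneuploidies": detection_rate(137, TOTAL_CASES),
    "other structural rearrangements": detection_rate(4, TOTAL_CASES),
    "pCNV": detection_rate(5, TOTAL_CASES),
}

for finding, pct in rates.items():
    print(f"{finding}: {pct}%")
```

The recomputed values agree with the reported 35.5%, 1% and 1.3%.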