ARTICLE | doi:10.20944/preprints201911.0202.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Circ-RNA; CLIP-Seq; RBP
Online: 17 November 2019 (11:01:25 CET)
Circular RNAs (circRNAs) are a special type of RNA whose formation and function have recently attracted considerable research interest. RNA binding proteins (RBPs) that bind circRNAs are important in these processes but remain relatively understudied. CLIP-Seq technology was developed to profile RBP-RNA interactions on a genome-wide scale. While mRNAs are usually the focus of CLIP-Seq experiments, RBP-circRNA interactions can also be identified through specialized analysis of CLIP-Seq datasets. However, this process involves many technical difficulties, such as the usually short length of CLIP-Seq reads. In this study, we created a pipeline called Clirc, specialized for profiling circRNAs in CLIP-Seq data and analyzing the characteristics of RBP-circRNA interactions. To our knowledge, this is one of the first studies to investigate circRNAs and their binding partners by repurposing CLIP-Seq datasets, and we hope our work will become a valuable resource for future studies of the biogenesis and function of circRNAs. Clirc software is available at https://github.com/Minzhe/Clirc
ARTICLE | doi:10.20944/preprints201903.0157.v1
Subject: Life Sciences, Molecular Biology Keywords: long non-coding RNA; hESC; cardiomyocyte; RNA-seq
Online: 15 March 2019 (02:11:52 CET)
Long non-coding RNAs (lncRNAs) have been found to be involved in many biological processes, including the regulation of cell differentiation, but a complete characterization of lncRNA is still lacking. Additionally, there is evidence that lncRNAs interact with ribosomes, raising questions about their functions in cells. Here, we used a developmentally staged protocol to induce cardiogenic commitment of hESCs and then investigated the differential association of lncRNAs with polysomes. Our results identified lncRNAs in both the ribosome-free and polysome-bound fractions during cardiogenesis and showed a very well-defined temporal lncRNA association with polysomes. Clustering of lncRNAs was performed according to the gene expression patterns during the five timepoints analyzed. In addition, differential lncRNA recruitment to polysomes was observed when comparing the differentially expressed lncRNAs in the ribosome-free and polysome-bound fractions or when calculating the polysome-bound vs ribosome-free ratio. The association of lncRNAs with polysomes could represent an additional cytoplasmic role of lncRNAs, e.g., in translational regulation of mRNA expression.
REVIEW | doi:10.20944/preprints202109.0253.v1
Subject: Biology, Other Keywords: Mycobacteria; Mycobacterium tuberculosis; non-coding RNA; RNA-seq; transcriptome
Online: 15 September 2021 (11:00:59 CEST)
A definitive transcriptome atlas for the non-coding expressed elements of pathogenic mycobacteria does not exist. Incomplete lists of non-coding transcripts can be obtained for some of the reference genomes (e.g. Mycobacterium tuberculosis H37Rv) but to what extent these transcripts have homologues in closely related species or even strains is not clear. This has implications for the analysis of transcriptomic data; non-coding parts of the transcriptome are often ignored in the absence of formal, reliable annotation. Here, we review the state of our knowledge of non-coding RNAs in pathogenic mycobacteria, emphasising the disparities in the information included in commonly used databases. We then proceed to review ways of combining computational solutions for predicting the non-coding transcriptome with experiments that can help refine and confirm these predictions.
ARTICLE | doi:10.20944/preprints202204.0220.v1
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: scRNA-seq; single cell; RNA-seq; DEG; differential expression; DE; benchmarking; scRNA-seq simulator
Online: 25 April 2022 (06:18:45 CEST)
To guide analysts to select the right tool and parameters in differential gene expression analysis of single-cell RNA sequencing (scRNA-seq) data, we developed a novel simulator that recapitulates the data characteristics of real scRNA-seq datasets while accounting for all the relevant sources of variation in a multi-subject, multi-condition scRNA-seq experiment: the cell-to-cell variation within a subject, the variation across subjects, the variability across cell types, the mean/variance relationship of gene expression across genes, library size effects, group effects, and covariate effects. By applying it to benchmark 12 differential gene expression analysis methods (including cell-level and pseudo-bulk methods) on simulated multi-condition, multi-subject data of the 10x Genomics platform, we demonstrated that methods originating from the negative binomial mixed model such as glmmTMB and NEBULA-HL outperformed other methods. Utilizing NEBULA-HL in a statistical analysis pipeline (https://github.com/interactivereport/scRNAseq_DE) for single cell analysis will enable scientists to better understand cell-type specific transcriptomic response to disease or treatment effects and to discover new drug targets. Further, application to two real datasets showed the outperformance of our differential expression (DE) pipeline, with unified findings of differentially expressed genes (DEG) and a pseudo-time trajectory transcriptomic result. In the end, we made recommendations of filtering strategies of cells and genes based on simulation results to achieve optimal experimental goals.
ARTICLE | doi:10.20944/preprints201705.0070.v1
Subject: Biology, Plant Sciences Keywords: Chromatin and transcription dynamics; reproductive development; differentiation; ChIP-seq; RNA-seq
Online: 8 May 2017 (18:25:10 CEST)
Plant life-long organogenesis involves sequential, time- and tissue-specific expression of developmental genes. This requires the activities of Polycomb Group (PcG) and trithorax Group complexes, respectively responsible for repressive Histone 3 trimethylation at lysine 27 (H3K27me3) and activation-related H3K4me3. However, the genome-wide dynamics in histone modifications that occur during developmental processes have remained elusive. Here, we report the distributions of H3K27me3 and H3K4me3, along with transcriptional changes, in a developmental series including Arabidopsis leaf and three stages of flower development. We found that chromatin mark levels are highly dynamic over the time series on nearly half of all Arabidopsis genes. Moreover, during early flower morphogenesis, changes in H3K4me3 take precedence over changes in H3K27me3 and quantitatively correlate with transcription changes, while H3K27me3 changes occur after prolonged expression changes. Notably, early activation of PcG target genes is dominated by increases in H3K4me3 while H3K27me3 remains present at the locus. Our results reveal H3K4me3 as a better predictor of transcription dynamics than H3K27me3, unveil unexpected chromatin mechanisms at gene activation and underline the relevance of tissue-specific temporal epigenomics.
ARTICLE | doi:10.20944/preprints202102.0234.v1
Subject: Biology, Anatomy & Morphology Keywords: Principal Component Analysis, RNA-seq, prostate cancer, biomarkers, RNA genes
Online: 9 February 2021 (10:26:47 CET)
Prostate cancer (Pca) is a highly heterogeneous disease and the second most common tumor in males. Molecular and genetic profiles have been used to identify subtypes and guide therapeutic intervention. However, roughly 26% of primary Pca are driven by unknown molecular lesions. We use Principal Component Analysis (PCA) and custom RNAseq-data normalization to identify a gene expression signature which segregates primary PRAD from normal tissues. This Core-Expression Signature (PRAD-CES) includes 33 genes and accounts for 39% of data complexity along the PC1-cancer axis. The PRAD-CES is populated by protein-coding (AMACR, TP63, HPN) and RNA-genes (PCA3, ARLN1) sparsely found in previous studies, validated/predicted biomarkers (HOXC6, TDRD1, DLX1), and/or cancer drivers (PCA3, ARLN1, PCAT-14). Of note, the PRAD-CES also comprises six over-expressed lncRNAs without previous Pca association, four of them potentially modulating the driver genes TMPRSS2, PRUNE2 and AMACR. Overall, our PCA captures 57% of data complexity within PC1-3. GO enrichment and correlation analysis involving major clinical features (i.e., Gleason Score, AR Score, TMPRSS2-ERG fusion and Tumor Cellularity) suggest that PC2 and PC3 gene signatures might describe more aggressive and inflammation-prone transitional forms of PRAD. Of note, the genes that surfaced may represent novel prognostic biomarkers and molecular alterations to target. In particular, our work uncovered RNA genes with appealing implications for Pca biology and progression.
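The share of "data complexity" attributed to each principal component in the abstract above corresponds to what is usually called the explained variance ratio. The following is a minimal sketch of how such percentages can be computed, assuming a samples-by-genes matrix of already normalized expression values; the function name and the toy matrix are illustrative assumptions and are not taken from the preprint, whose custom normalization is not described here.

```python
import numpy as np

def explained_variance_ratio(expr):
    """Return the fraction of total variance carried by each principal component.

    expr: 2-D array, rows = samples, columns = genes (assumed already normalized).
    """
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=0)  # center each gene across samples
    # Singular values of the centered matrix; their squares are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Toy data (hypothetical): 6 samples x 4 genes.
rng = np.random.default_rng(0)
toy = rng.normal(size=(6, 4))
ratios = explained_variance_ratio(toy)
cumulative_pc1_to_pc3 = ratios[:3].sum()  # analogous to "variance within PC1-3"
```

Summing the first three entries of the returned vector gives the kind of cumulative figure quoted for PC1-3.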
ARTICLE | doi:10.20944/preprints202112.0149.v2
Online: 23 December 2021 (11:34:00 CET)
Research Highlights: This study identified the cell cycle genes in birch that likely play important roles during plant growth and development. This analysis provides a basis for understanding the regulatory mechanism of various cell cycles in Betula pendula. Background and Objectives: Cell cycle factors not only jointly influence cell cycle progression, but also regulate the accretion, division and differentiation of cells, and thereby the growth and development of the plant. In this study, we identified the putative cell cycle genes in the B. pendula genome, based on the annotated cell cycle genes in A. thaliana. This could serve as a foundation for further functional studies. Materials and Methods: The transcript abundance was determined for all the cell cycle genes in xylem, root, leaf and flower tissues using RNA-seq technology. Results: We identified 59 cell cycle gene models in the genome of B. pendula, of which 17 were highly expressed. These genes were BpCDKA.1, BpCDKB1.1, BpCDKB2.1, BpCKS1.2, BpCYCB1.1, BpCYCB1.2, BpCYCB2.1, BpCYCD3.1, BpCYCD3.5, BpDEL1, BpDpa2, BpE2Fa, BpE2Fb, BpKRP1, BpKRP2, BpRb1 and BpWEE1. Conclusions: We identified 17 core cell cycle genes in the genome of birch by combining phylogenetic analysis and tissue-specific expression data.
ARTICLE | doi:10.20944/preprints201903.0124.v1
Subject: Life Sciences, Molecular Biology Keywords: RNA-Seq, htseq-count, HISAT2, bioinformatics, strandedness
Online: 11 March 2019 (09:06:40 CET)
RNA sequencing (RNA-Seq) is a complicated protocol, both in the laboratory in generation of data and at the computer in analysis of results. Several decisions during RNA-Seq library construction have important implications for analysis, most notably strandedness during complementary DNA (cDNA) library construction. Here we clarify bioinformatic decisions related to strandedness in both alignment of DNA sequencing reads to reference genomes and subsequent determination of transcript abundance.
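As an illustration of the kind of decision this abstract refers to, the sketch below maps a library strandedness type to the corresponding HISAT2 `--rna-strandness` value and htseq-count `--stranded` value. The library-type labels (`fr-firststrand`, `fr-secondstrand`), the lookup table and the function name are assumptions chosen for this example; the preprint itself should be consulted for the authors' actual recommendations.

```python
# Hypothetical lookup table: library type -> aligner / counter strand settings.
# "fr-firststrand" is used here for dUTP-style libraries, where read 1 maps to
# the strand opposite the transcript.
STRAND_SETTINGS = {
    "unstranded":      {"hisat2": {"single": None, "paired": None}, "htseq": "no"},
    "fr-firststrand":  {"hisat2": {"single": "R",  "paired": "RF"}, "htseq": "reverse"},
    "fr-secondstrand": {"hisat2": {"single": "F",  "paired": "FR"}, "htseq": "yes"},
}

def strand_options(library_type, paired):
    """Return (hisat2_args, htseq_count_args) for a given library type."""
    settings = STRAND_SETTINGS[library_type]
    layout = "paired" if paired else "single"
    hisat2_value = settings["hisat2"][layout]
    # HISAT2 treats reads as unstranded when --rna-strandness is omitted.
    hisat2_args = [] if hisat2_value is None else ["--rna-strandness", hisat2_value]
    htseq_args = ["--stranded=" + settings["htseq"]]
    return hisat2_args, htseq_args

hisat2_args, htseq_args = strand_options("fr-firststrand", paired=True)
```

Keeping both settings in one table makes it harder to align with one strandedness assumption and count with another, which is a common source of near-empty count tables.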
REVIEW | doi:10.20944/preprints202102.0230.v1
Subject: Life Sciences, Biochemistry Keywords: Astrocyte, Alzheimer's disease, neurodegeneration, transcriptomics, RNA sequencing (RNA-seq), cellular states
Online: 9 February 2021 (10:04:24 CET)
Astrocytes perform a wide variety of essential functions that define the normal operation of the nervous system, and are active contributors to the pathogenesis of neurodegenerative disorders such as Alzheimer's disease. Recent data provide compelling evidence that distinct reactive astrocyte states are associated with specific stages of Alzheimer's disease. The advent of transcriptomics technologies enables rapid progress in the characterisation of such pathological astrocyte states. In this review, we provide an overview of the origin, main functions, and molecular and morphological features of astrocytes in physiological as well as pathological conditions related to Alzheimer's disease. We also explore the main roles of astrocytes in the pathogenesis of Alzheimer's disease and summarize the main transcriptional changes and altered molecular pathways observed in astrocytes during the course of the disease.
ARTICLE | doi:10.20944/preprints202201.0464.v1
Online: 31 January 2022 (13:25:38 CET)
The mosaic disease in maize is caused by Sugarcane mosaic virus (SCMV), a member of the Potyviridae family. The best strategy to cope with viral infections is the use of disease-resistant maize lines. To better understand the resistance response to SCMV, we analyzed differentially expressed genes among a resistant line (CI-RL1), a susceptible line (B73), and the F1 progeny from a cross between both lines using RNA-Seq data. We also analyzed transcript expression pattern clustering to allocate previously reported resistance candidate genes. GO enrichment analysis of biological processes highlighted a strong regulation in ROS detoxification in both the susceptible and resistant lines. The enrichment of cellular components led to the identification of an integral component of the plasma membrane in the RL line. Transcript expression patterns provide evidence of the importance of host translation in virus response, showing the diverse and complex behavior of eIF4E homologs and the presence of eleven eEF1α factors in maize. In addition, we identified two genes putatively implied in long-distance movement: ZmPiezo and ZmPVIP1. Finally, we propose an ABC transporter to be associated with viral resistance.
ARTICLE | doi:10.20944/preprints202109.0224.v1
Online: 14 September 2021 (08:19:04 CEST)
The major threats to the sustainable supply of forest tree products are adverse climate, pests and diseases. Climate change, exemplified by increased drought, poses a unique threat to global forest health. This is attributed to the unpredictable behavior of forest pathosystems, which can favor fungal pathogens over the host under persistent drought stress conditions in the future. Currently, the effects of drought on tree resistance against pathogens are hypothetical, thus research is needed to identify these correlations. Norway spruce (Picea abies) is one of the most economically important tree species in Europe, and is considered highly vulnerable to changes in climate. Dedicated experiments to investigate how disturbances will affect the Norway spruce - Heterobasidion sp. pathosystem are important, in order to develop different strategies to limit the spread of H. annosum s.l. under the predicted climate change. Here, we report a transcriptional study to compare Norway spruce gene expressions to evaluate the effects of water availability and the infection of Heterobasidion parviporum. We performed inoculation studies of three-year-old saplings in a greenhouse (purchased from a nursery). Norway spruce saplings were treated in either high (+) or low (-) water groups: high water group received double the water amount than the low water group. RNA was extracted and sequenced. Similarly, we quantified gene expression levels of candidate genes in biotic stress and jasmonic acid (JA) signaling pathways using qRT-PCR, through which we discovered a unique preferential defense response of H. parviporum-infected Norway spruce under drought stress at the molecular level. Disturbances related to water availability, especially low water conditions can have negative effects on the tree host and benefit the infection ability of the pathogens in the host. 
From our RNA-seq analysis, 114 differentially expressed gene regions were identified between high (+) and low (-) water groups under pathogen attack. None of these gene pathways were differentially expressed between high (+) and low (-) water groups in either the non-treated or the mock-control treatments. Finally, only four genes were found to be associated with drought in all treatments.
ARTICLE | doi:10.20944/preprints202007.0711.v1
Subject: Life Sciences, Molecular Biology Keywords: co-expression network; residual feed intake; RNA-Seq
Online: 30 July 2020 (09:39:36 CEST)
Long non-coding RNA (lncRNA) can regulate several aspects of gene expression, being associated with complex phenotypes in humans and livestock species. In taurine beef cattle, recent evidence points to the involvement of lncRNA in feed efficiency (FE), a proxy for increased productivity and sustainability. Here, we hypothesized specific regulatory roles of lncRNA in FE of indicine cattle. Using RNA-Seq data from liver, muscle, hypothalamus, pituitary and adrenal gland from Nellore bulls with divergent FE, we submitted new transcripts to a series of filters to confidently predict lncRNA. Then, we identified lncRNA that were differentially expressed (DE) and/or key regulators of FE. Finally, we explored lncRNA genomic location and interactions with miRNA and mRNA to infer potential function. We were able to identify 126 relevant lncRNA for FE in Bos indicus, some with high homology to previously identified lncRNA in Bos taurus and some possible specific regulators of FE in indicine cattle. Moreover, lncRNA identified here were linked to previously described mechanisms related to FE in hypothalamus-pituitary-adrenal axis and are expected to help elucidate this complex phenotype. This study contributes to expanding the catalogue of lncRNA, particularly in indicine cattle, and identifies candidates for further studies in animal selection and management.
ARTICLE | doi:10.20944/preprints201902.0042.v1
Subject: Life Sciences, Molecular Biology Keywords: RNA-Seq; Oncology; DNA repair; Survival; PCNA metagene
Online: 4 February 2019 (16:55:20 CET)
Removal of the proliferation component of gene expression by PCNA adjustment has been addressed in numerous survival prediction studies for breast cancer and all cancers in the TCGA. These studies indicate that widespread co-regulation of proliferation upwardly biases survival prediction when gene selection is performed on a genome-wide basis. In addition, removal of the correlative effects of proliferation does not reduce the random bias associated with survival prediction using random gene selection. Since most cancers become addicted to DNA repair as a result of forced cellular replication, increased oxidation, and repair deficiencies from oncogenic loss or genetic polymorphisms, we pursued an investigation to remove the proliferation component of expression in DNA repair genes to determine survival prediction. This translational hypothesis-driven focus on DNA repair genes is directly amenable to finding new sets of DNA repair genes that could potentially be studied for inhibition therapy. Overall survival (OS) prediction was evaluated in 18 cancers by using normalized RNA-Seq data for 126 DNA repair genes with expression available in TCGA. Transformations for normality and adjustments for age at diagnosis, stage, and PCNA metagene expression were performed for all DNA repair genes. We also analyzed genomic event rates (GER) for somatic mutations, deletions, and amplification in driver genes and DNA repair genes. After performing empirical p-value testing with use of randomly selected gene sets, it was observed that OS could be predicted significantly by sets of DNA repair genes for 61% (11/18) of the cancers. Interestingly, PARP1 was not a significant predictor of survival for any of the 11 cancers. 
Results from cluster analysis of GERs indicate that the most opportunistic cancers for inhibition therapy may be AML, colorectal, and renal papillary, because of potentially less confounding due to lower GERs for mutations, deletions, and amplifications in DNA repair genes. However, the most opportunistic cancer for inhibition therapy is likely to be AML, since it showed the lowest GERs for mutations, deletions, and amplifications in DNA repair genes. In conclusion, our hypothesis-driven focus on DNA repair gene expression adjusted for the PCNA metagene as a means of predicting OS in various cancers resulted in statistically significant sets of genes.
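The empirical p-value testing with randomly selected gene sets mentioned in this abstract can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: `score_fn` stands in for whatever survival-prediction statistic is computed for a gene set, and the add-one correction in the final line is a common convention chosen here, not something stated in the abstract.

```python
import random

def empirical_p_value(observed_score, score_fn, all_genes, set_size,
                      n_random=1000, seed=0):
    """Estimate how often a random gene set scores at least as well as the observed set.

    observed_score: statistic obtained for the gene set of interest.
    score_fn:       hypothetical callable mapping a list of genes to a statistic
                    (larger = better prediction).
    all_genes:      list of candidate genes to sample from.
    set_size:       number of genes in each random set.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_random):
        random_set = rng.sample(all_genes, set_size)
        if score_fn(random_set) >= observed_score:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_random + 1)

# Toy usage with a stand-in statistic: genes are integers, the score is their sum.
toy_genes = list(range(10))
p_extreme = empirical_p_value(100, sum, toy_genes, set_size=3)
```

A small value indicates that few random gene sets of the same size match the observed statistic, which is the sense in which a DNA repair gene set would be called a significant predictor.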
ARTICLE | doi:10.20944/preprints201809.0486.v1
Subject: Biology, Plant Sciences Keywords: Histone deacetylase, metabolism, peanut, hairy roots, RNA-seq
Online: 25 September 2018 (12:40:05 CEST)
Peanut (Arachis hypogaea) is a crop plant with high economic value, but the epigenetic regulation of its growth and development has only rarely been studied. The peanut histone deacetylase 1 gene (AhHDA1) has been isolated and is known to be ABA- and drought-responsive. In this paper, we investigate the role of AhHDA1 in more detail, focussing on the effect of altered AhHDA1 expression in hairy roots at both the phenotypic and transcriptional levels. Agrobacterium rhizogenes-mediated transformation of A. hypogaea hairy roots was used to analyse how overexpression or RNA interference of AhHDA1 affects this tissue. In both types of transgenic hairy root, RNA sequencing was used to identify genes that were differentially expressed, and these genes were assigned to specific metabolic pathways. AhHDA1-overexpressing hairy roots were growth-retarded after 20 d in vitro cultivation, and superoxide anions and hydrogen peroxide accumulated to a greater extent than in control or RNAi groups. Overexpression of AhHDA1 is likely to accelerate flux through various secondary synthetic metabolic pathways in hairy roots, as well as reduce photosynthesis and oxidative phosphorylation. Genes encoding the critical enzymes caffeoyl-CoA O-methyltransferase (Araip.XGB85) and caffeic acid 3-O-methyltransferase (Araip.Z3XZX) in the phenylpropanoid biosynthesis pathway, chalcone synthase (Araip.B8TJ0) and polyketide reductase (Araip.MKZ27) in the flavonoid biosynthesis pathway, and hydroxyisoflavanone synthase (Araip.0P3RJ) and isoflavone 2'-hydroxylase (Araip.S5EJ7) in the isoflavonoid biosynthesis pathway were significantly upregulated by AhHDA1 overexpression, while their expression in AhHDA1-RNAi and control hairy roots remained at a lower level or was unchanged. Our results suggest that alteration of secondary metabolism activities is related to overexpression of AhHDA1, which is mainly reflected in phenylpropanoid, flavonoid and isoflavonoid biosynthesis.
Future studies will focus on the function of AhHDA1 interacting proteins and their action on cell growth and stress responses.
ARTICLE | doi:10.20944/preprints201803.0257.v1
Online: 30 March 2018 (06:02:33 CEST)
Recently, selection in pigs has been focused on improving the lean meat content in carcasses; this focus has been most evident in breeds constituting a paternal component in breeding. Such sire-breeds are used to improve the meat quantity of cross-breed pig lines. However, even in one breed, a significant variation in the meatiness level can be observed. In the present study, a comprehensive analysis of gene and microRNA expression profiles in porcine muscle tissue was applied to identify the genetic background of meat content. The comparison was performed between whole gene expression and miRNA profiles of muscle tissue collected from two sire-line pig breeds (Pietrain, Hampshire). The RNA-seq approach allowed the identification of 627 and 416 differentially expressed genes (DEGs) between pig groups differing in terms of loin weight in the Pietrain and Hampshire breeds, respectively. The comparison of miRNA profiles showed differential expression of 57 microRNAs for Hampshire and 34 miRNAs for Pietrain pigs. Next, 43 genes and 18 miRNAs were selected as differentially expressed in both breeds and potentially related to muscle development. According to Gene Ontology analysis, identified DEGs and microRNAs were involved in the regulation of the cell cycle, fatty acid biosynthesis and regulation of the actin cytoskeleton. The most deregulated pathways dependent on muscle mass were the Hippo signalling pathway connected with the TGF-beta signalling pathway and controlling organ size via the regulation of ubiquitin-mediated proteolysis, cell proliferation and apoptosis. The identified target genes were also involved in pathways such as the FoxO signalling pathway, signalling pathways regulating pluripotency of stem cells and the PI3K-Akt signalling pathway. The obtained results indicate molecular mechanisms controlling porcine muscle growth and development.
Identified genes (SOX2, SIRT1, KLF4, PAX6 and genes belonging to the transforming growth factor beta superfamily) could be considered candidate genes for determining muscle mass in pigs.
ARTICLE | doi:10.20944/preprints202202.0149.v1
Subject: Life Sciences, Molecular Biology Keywords: immune response; fatty acid; lipid metabolism; RNA-Seq; transcriptome
Online: 10 February 2022 (10:57:03 CET)
The objective of this study was to identify key transcription factors involved in lipid metabolism and immune response related to the differentially expressed genes (DEG) from liver samples of 35 pigs, used as a model for metabolic diseases, fed diets containing either 1.5 or 3.0% soybean oil (SOY1.5 or SOY3.0). A total of 281 DEG between the SOY1.5 and SOY3.0 diets (log2fold-change ≥ 1 or ≤ −1; FDR-corrected p-value < 0.1) were identified, of which 129 were down-regulated and 152 were up-regulated in the SOY1.5 group. The functional annotation analysis detected transcription factors linked to lipid homeostasis and immune response, such as RXRA, EGFR, and SREBP2 precursor. These findings demonstrated that key transcription factors related to lipid metabolism could be modulated by dietary inclusion of soybean oil. This could contribute to the nutrigenomics research field, which aims to elucidate dietary interventions in animal and human health, as well as to drive food technology and science.
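The DEG criteria quoted above (log2 fold-change ≥ 1 or ≤ −1, FDR-corrected p-value < 0.1) amount to a simple two-sided threshold filter. The sketch below shows that filter on a list of (gene, log2 fold-change, FDR) tuples; the function name, the input layout and the toy gene labels are assumptions for illustration, and the statistical test that produced the values in the study is not reproduced here.

```python
def split_deg(results, lfc_cutoff=1.0, fdr_cutoff=0.1):
    """Split test results into up- and down-regulated genes.

    results: iterable of (gene, log2_fold_change, fdr) tuples.
    A gene passes if fdr < fdr_cutoff and |log2_fold_change| >= lfc_cutoff.
    """
    up, down = [], []
    for gene, log2_fc, fdr in results:
        if fdr >= fdr_cutoff:
            continue  # not significant after multiple-testing correction
        if log2_fc >= lfc_cutoff:
            up.append(gene)
        elif log2_fc <= -lfc_cutoff:
            down.append(gene)
    return up, down

# Toy results with hypothetical gene labels.
toy_results = [
    ("geneA", 1.4, 0.02),   # up-regulated
    ("geneB", -2.1, 0.05),  # down-regulated
    ("geneC", 0.6, 0.01),   # significant but fold-change too small
    ("geneD", 3.0, 0.30),   # large fold-change but not significant
    ("geneE", -1.0, 0.09),  # exactly at the fold-change cutoff: kept
]
up_genes, down_genes = split_deg(toy_results)
```

Note that the sign convention (which diet is the numerator of the fold-change) determines which group a gene is reported as up-regulated in, so it should be fixed before applying the filter.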
ARTICLE | doi:10.20944/preprints202111.0565.v1
Subject: Biology, Plant Sciences Keywords: Salt stress; Jerusalem artichoke; Time series analysis; RNA-seq
Online: 30 November 2021 (11:55:51 CET)
Background: Jerusalem artichoke (Helianthus tuberosus L.) is tolerant to salinity stress and has high economic value, but its salt tolerance mechanisms are still unclear. In the early stage of exposure to salt stress in particular, the plant's physiology, biochemistry and gene transcription are likely to undergo large changes. Elucidating these changes may be of great significance for understanding its salt tolerance mechanisms. Results: We obtained high-quality transcriptomes from leaves and roots of Jerusalem artichoke exposed to salinity (300 mM NaCl) for 0 h, 6 h, 12 h, 24 h and 48 h, with 150,129 unigenes and 9023 DEGs (differentially expressed genes). The RNA-seq data were clustered into time-dependent groups (nine clusters each in leaves and roots); gene functions were distributed evenly among the groups. KEGG enrichment analysis showed that genes related to plant hormone signal transduction were enriched in almost all treatment comparisons. Under salt stress, genes belonging to PYL (abscisic acid receptor PYR/PYL family), PP2C (type 2C protein phosphatases), GH3 (Gretchen Hagen3), ETR (ethylene receptor), EIN2/3 (ethylene-insensitive protein 2/3), JAZ (jasmonate ZIM-domain genes) and MYC2 (transcription factor MYC2) had extremely similar expression patterns. qPCR of 12 randomly selected genes confirmed the accuracy of the RNA-seq results. Conclusions: In a high-salinity (300 mM NaCl) environment, Jerusalem artichoke seedlings struggled to survive for long, and the phenotype was severe in the short term. Based on gene expression over time, we found that the distribution of gene functions in time is relatively even. Upregulation of phytohormone signal transduction played a crucial role in the response of Jerusalem artichoke seedlings to salt stress; genes related to abscisic acid, auxin, ethylene, and jasmonic acid showed the most obvious changes.
ARTICLE | doi:10.20944/preprints202103.0196.v1
Subject: Biology, Anatomy & Morphology Keywords: Single cell RNA-seq; spatial reconstruction; development; coalescent embedding
Online: 5 March 2021 (21:21:59 CET)
Single cell RNA-seq (scRNA-seq) profiles conceal temporal and spatial tissue developmental information. De novo reconstruction of single cell temporal trajectory has been fairly addressed, but reverse engineering single cell 3D spatial tissue localization is hitherto landmark based, and de novo spatial reconstruction is a compelling computational open problem. Here we show that a new algorithm - named D-CE - for coalescent embedding of single cell transcriptomic networks can address this open problem. We rely merely on the spatial information encoded in the expression patterns of developmental signal transcription factor (DST) genes, and we find that D-CE of cell-cell association DST-transcriptomic networks reliably reconstructs the Geo-seq or single cell samples’ 3D spatial tissue distribution. Comparison to the novoSpaRC and CSOmap (recent and only available de novo 3D spatial reconstruction methods) on 16 datasets and 681 reconstructions, reveals a significantly distinctive superior performance of D-CE.
Subject: Life Sciences, Molecular Biology Keywords: lncRNA; breast cancer; alternative splicing; estrogen receptor; RNA-Seq
Online: 19 April 2020 (04:29:31 CEST)
Background: DSCAM-AS1 is a cancer-related long noncoding RNA with higher expression levels in Luminal A, B and HER2-positive Breast Cancer (BC), where its expression is strongly dependent on Estrogen Receptor Alpha (ERα). Methods: To decipher its function, DSCAM-AS1 expression was measured by qRT-PCR in tissue samples from 93 BC patients in addition to a meta-analysis of 30 gene expression datasets, together with the evaluation of its association with clinical data. By computational analyses of our RNA-Seq in MCF-7 cells, we investigated the DSCAM-AS1 knock-down effects at both gene and isoform levels. Results: We confirmed DSCAM-AS1 overexpression in high grade Luminal A, B and HER2+ BCs and found a significant correlation with disease relapse. 908 genes were regulated by DSCAM-AS1-silencing, primarily involved in cell cycle and inflammatory response. Noteworthy, the analysis of alternative splicing and isoform regulation revealed 2,085 splicing events regulated by DSCAM-AS1, enriched in differential polyadenylation sites and 3’UTR shortening events. Finally, the DSCAM-AS1-interacting splicing factor hnRNPL was predicted as the most enriched RBP for exon skipping and 3’UTR events. Conclusion: The relevance of DSCAM-AS1 overexpression in BC is confirmed by clinical data and further enhanced by its possible involvement in the regulation of RNA processing, which is emerging as one of the most important dysfunctions in cancer.
ARTICLE | doi:10.20944/preprints201912.0322.v1
Subject: Biology, Plant Sciences Keywords: pm57; physical mapping; rna-seq; common wheat; molecular markers
Online: 24 December 2019 (11:30:27 CET)
Powdery mildew caused by Blumeria graminis f. sp. tritici (Bgt) is one of many severe diseases that threaten bread wheat (Triticum aestivum L.) yield and quality worldwide. The discovery and deployment of powdery mildew resistance genes (Pm) can prevent epidemics of this disease in wheat. In a previous study, we transferred the powdery mildew resistance gene Pm57 from Aegilops searsii into common wheat and cytogenetically mapped the gene to a chromosome region with the fraction length (FL) 0.75-0.87, which represents 12% of the long arm of chromosome 2Ss#1. In this study, we performed RNA-Seq on three infected and mock-infected wheat-Ae. searsii 2Ss#1 introgression lines at 0, 12, 24, and 48 hours after inoculation with Bgt isolates. We then designed 79 molecular markers based on transcriptome sequences and physically mapped them to Ae. searsii chromosome 2Ss#1 in seven intervals. We used these markers to identify 46 wheat-Ae. searsii 2Ss#1 recombinants induced by ph1b, a deletion mutant of pairing homoeologous (Ph) genes. By analyzing the 46 ph1b-induced 2Ss#1L recombinants with different Bgt responses using 28 2Ss#1L-specific molecular markers in the interval FL0.72-0.87, where Pm57 is located, and the flanking intervals, we physically mapped the Pm57 gene on the long arm of 2Ss#1 to a 5.13 Mb genomic region flanked by markers X67593 (773.72 Mb) and X62492 (778.85 Mb). By comparative synteny analysis of the corresponding region on chromosome 2B in Chinese Spring (T. aestivum L.) with other model species, we identified ten putative plant defense-related (R) genes, which include six encoding coiled-coil nucleotide-binding site-leucine-rich repeat (CNL) proteins, three encoding nucleotide-binding site-leucine-rich repeat (NL) proteins and one encoding a leucine-rich receptor-like repeat (RLP) protein.
This study will lay a foundation for further cloning of Pm57, and benefit the understanding of interactions between resistance genes of wheat and powdery mildew pathogens.
ARTICLE | doi:10.20944/preprints202209.0362.v1
Subject: Life Sciences, Genetics Keywords: RNA-Seq; Vitamin K; Comorbidities; Differential Expressed Genes; Variant analysis
Online: 23 September 2022 (09:13:29 CEST)
Systems genetics is key for integrating a large number of variants associated with diseases. Vitamin K (VK) is scarcely studied with respect to ascertaining either the differentially expressed genes (DEGs) or the variants in individual subpopulations of diseased phenotypes associated with VK, viz. myocardial infarction, renal failure, prostate cancer, thrombosis, thrombocytopenia and coagulation-related diseases, to name a few. In this work, we screened characteristic DEGs common to three VK-related diseases, viz. myocardial infarction, renal failure and prostate cancer, and asked whether any DEGs, in addition to pathogenic variants, are common to these conditions. We attempt to bridge the gap in finding characteristic biomarkers and discuss the role of long noncoding RNAs (lncRNAs) in the biogenesis of VK deficiencies.
REVIEW | doi:10.20944/preprints202209.0327.v1
Subject: Medicine & Pharmacology, Behavioral Neuroscience Keywords: scRNA-seq; bioinformatics; subpopulations; analysis methods; single-cell RNA sequencing
Online: 21 September 2022 (11:22:50 CEST)
Single-cell RNA sequencing data facilitate investigation of cell heterogeneity and subpopulations as well as differentially abundant states; however, modern single-cell RNA sequencing datasets are growing in size and complexity, requiring advances in the bioinformatic methods that analyze them. Many methods exist for each step of analysis, including read alignment, normalization, quality control, batch effect correction, imputation and dimensionality reduction. With so many options to choose from at each step of the analysis, benchmarking and a synthesis of the literature on the available methods are necessary to inform biological researchers of the optimal workflow for their data. Here, recent key methods of analysis are highlighted with a focus on methods that facilitate identification of cell subpopulations and differentially abundant cell states. With a constantly expanding toolset for each step in single-cell RNA sequencing dataset analysis, biological researchers should stay informed to utilize the most applicable methods for their own analyses.
ARTICLE | doi:10.20944/preprints202202.0320.v1
Subject: Medicine & Pharmacology, General Medical Research Keywords: Neurodegenerative disease; DJ-1; RNA-seq; Nrf2 signaling; lncRNA; MALAT1
Online: 25 February 2022 (02:40:02 CET)
Microglia activation causes neuroinflammation, which is a hallmark of neurodegenerative disorders, brain injury, and aging. Ladostigil, a bifunctional reagent with antioxidant and anti-inflammatory properties, reduced microglial activation and enhanced brain functioning in elderly rats. In this study, we used SH-SY5Y, a human neuroblastoma cell line, and tested viability in the presence of hydrogen peroxide and Sin1 (3-morpholinosydnonimine), which generates reactive oxygen and nitrogen species (ROS/RNS). Both stressors caused significant apoptosis and necrotic cell death that was attenuated by ladostigil. Our results from RNA-seq experiments show that long non-coding RNAs (lncRNAs) account for 30% of all transcripts in SH-SY5Y cells treated with Sin1 for 24 hours. Altogether, we identify 94 differentially expressed lncRNAs in the presence of Sin1, including MALAT1, a highly expressed lncRNA with anti-inflammatory and anti-apoptotic functions. Additional activities of Sin1-upregulated lncRNAs include redox homeostasis (e.g., MIAT, GABPB1-AS1), energy metabolism (HAND2-AS1), and neurodegeneration (e.g., MIAT, GABPB1-AS1, NEAT1). Four lncRNAs implicated as enhancers were significantly upregulated in cells exposed to Sin1 and ladostigil. Finally, we show that H2O2 and Sin1 increased the expression of DJ-1, a redox sensor and modulator of Nrf2 (nuclear factor erythroid 2–related factor 2). Nrf2 (NFE2L2 gene) is a major transcription factor regulating antioxidant genes. In the presence of ladostigil, DJ-1 expression is restored to its baseline. The mechanisms governing SH-SY5Y cell survival and homeostasis are highlighted by the beneficial role of ladostigil in the crosstalk involving Nrf2, antioxidant transcription factor DJ-1, and lncRNAs. Stress-dependent induction of lncRNAs represents an underappreciated regulatory level that contributes to cellular homeostasis and the capacity of SH-SY5Y to cope with oxidative stress.
REVIEW | doi:10.20944/preprints202202.0004.v1
Subject: Life Sciences, Biotechnology Keywords: Spatial transcriptomics; Molecular imaging; single-cell RNA-seq; intratumoral heterogeneity
Online: 1 February 2022 (11:08:51 CET)
Intratumoral heterogeneity is associated with more aggressive disease progression and worse patient outcomes. Our understanding of the reasons enabling the emergence of such heterogeneity remains incomplete, which restricts our ability to manage it from a therapeutic perspective. Technological advancements such as high-throughput molecular imaging, single-cell omics and spatial transcriptomics now allow recording the patterns of spatiotemporal heterogeneity in a longitudinal manner, thus offering insights into the multi-scale dynamics of its evolution. Here, we review the latest technological trends and biological insights from molecular diagnostics as well as spatial transcriptomics, both of which have witnessed burgeoning growth in the recent past in terms of mapping heterogeneity within tumor cell types as well as stromal constitution. We also discuss ongoing challenges, indicating possible ways to integrate insights across these methods to obtain a systems-level spatiotemporal map of heterogeneity in each tumor, and a more systematic investigation of the implications of heterogeneity for patient outcomes.
ARTICLE | doi:10.20944/preprints202008.0103.v1
Subject: Biology, Animal Sciences & Zoology Keywords: chicken; Newcastle disease; spleen; immune response; gene expression; RNA-seq
Online: 4 August 2020 (16:09:52 CEST)
As a major infectious disease in chickens, Newcastle disease causes considerable economic losses in the poultry industry, especially in developing countries where there is limited access to effective vaccination. Therefore, enhancing resistance to the virus in commercial chickens through breeding is a promising way to promote poultry production. In this study, we used RNA sequencing analysis to investigate gene expression changes at 2 and 6 days post-infection (dpi) in a commercial egg-laying chicken hybrid infected at day 21 with a lentogenic Newcastle disease virus (NDV). By comparing NDV-challenged and nonchallenged groups, 526 differentially expressed genes (DEGs) (FDR < 0.05) were identified at 2 dpi, and only 36 at 6 dpi. For the DEGs at 2 dpi, IPA analysis predicted inhibition of multiple signaling pathways in response to NDV that regulate immune cell development and activity, neurogenesis and angiogenesis. Upregulation of Interferon Induced Protein with Tetratricopeptide Repeats 5 (IFIT5) in response to NDV was consistent between the current and most previous studies. Sprouty RTK Signaling Antagonist 1 (SPRY1), a DEG in the current study, is located in a significant QTL associated with virus load at 6 dpi in the same population. These identified pathways and DEGs provide potential targets for further study of breeding strategies to enhance NDV resistance in chickens.
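The "FDR < 0.05" criterion used above to call DEGs can be illustrated with a small sketch. The abstract does not state which FDR procedure was applied, so the Benjamini-Hochberg adjustment below is an assumption, and the p-values are toy numbers, not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_degs(p_values, fdr=0.05):
    """Count genes whose adjusted p-value falls below the FDR threshold."""
    return sum(q < fdr for q in benjamini_hochberg(p_values))


# Hypothetical per-gene p-values (illustration only).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_q = benjamini_hochberg(toy_p)
n_degs = count_degs(toy_p)
```

In a real analysis the p-values would come from a differential expression tool, which normally reports adjusted values directly.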
ARTICLE | doi:10.20944/preprints202002.0307.v1
Online: 21 February 2020 (08:09:25 CET)
An outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) occurred in China towards the end of 2019, and has spread rapidly ever since. Previous studies showed that some viruses could affect the reproductive system and cause long-term complications. Recent studies exploring the source of SARS-CoV-2 using genomic sequencing have revealed that SARS-CoV-2 enters the host cells via the angiotensin-converting enzyme II (ACE2), the receptor that recognizes SARS-CoV. To investigate the expression of ACE2 and to explore the potential risk of infection in the reproductive system, we performed a thorough bioinformatic analysis on data from public databases involving RNA expression, protein expression, and single-cell RNA expression studies. The analyzed data showed high levels of ACE2 mRNA and protein expression in the testis and spermatids and equal levels of ACE2 expression in the uterus and lung. Comprehensive single-cell analysis identified ACE2 expression in the lung, testis, spermatids, and uterus. In conclusion, this study revealed the potential risk associated with SARS-CoV-2 infection in the reproductive system and predicted that long-term complications might have a significant impact on the prevention and management of COVID-19, the disease caused by infection with SARS-CoV-2.
ARTICLE | doi:10.20944/preprints202208.0340.v1
Online: 18 August 2022 (10:45:51 CEST)
Numerous proteomic and transcriptomic studies have been carried out to better understand the current multi-variant SARS-CoV-2 virus mechanisms of action and effects. However, they are mostly centered on mRNAs and proteins. The effect of the virus on human post-transcriptional regulatory agents such as microRNAs (miRNAs), which are involved in the regulation of 60% of human gene activity, remains poorly explored. Similar to what we have previously done with other viruses such as Ebola and HIV, in this study we investigated the miRNA profile of lung epithelial cells following infection with SARS-CoV-2. At 24 and 72 hours post-infection, SARS-CoV-2 did not drastically alter the miRNome. About 90% of the miRNAs remained non-differentially expressed. The results revealed that miR-1246, miR-1290 and miR-4728-5p were the most upregulated over time. miR-196b-5p and miR-196a-5p were the most downregulated at 24 h, while at 72 h, miR-3924, miR-30e-5p and miR-145-3p showed the highest level of downregulation. Among the top significantly enriched KEGG pathways of genes targeted by differentially expressed miRNAs we found, among others, the MAPK, RAS, PI3K-Akt and renin secretion signaling pathways. By RT-qPCR, we also showed that SARS-CoV-2 may regulate several predicted host mRNA targets involved in the entry of the virus into host cells (ACE2, TMPRSS2, ADAM17 and FURIN), in the renin–angiotensin system (RAS) (Renin, Angiotensinogen, ACE), innate immune response (IL-6, IFN1β, CXCL10, SOCS4) and fundamental cellular processes (AKT, NOTCH, WNT). Finally, we demonstrated by dual luciferase assay a direct interaction between miR-1246 and ACE2 mRNA. This study highlights the modulatory role of miRNAs in the pathogenesis of SARS-CoV-2.
ARTICLE | doi:10.20944/preprints201903.0286.v1
Subject: Life Sciences, Molecular Biology Keywords: lung adenocarcinoma; KRAS; MYC; ERBB; mouse models of cancer; RNA-SEQ
Online: 30 March 2019 (06:41:07 CET)
Inducible genetically defined mouse models of cancer uniquely facilitate the investigation of early events in cancer progression; however, there are valid concerns about the ability of such models to faithfully recapitulate human disease. We developed an inducible mouse model of progressive lung adenocarcinoma (LuAd) that combines sporadic activation of oncogenic KRasG12D with modest overexpression of c-MYC (KM model). Histological examination revealed a highly reproducible transition from adenoma to locally invasive adenocarcinoma within 6 weeks of oncogene activation. Laser-capture microdissection coupled with RNA-SEQ was employed to determine transcriptional changes associated with tumour progression. Upregulated genes were triaged for relevance to human LuAd using datasets from Oncomine and cBioportal. Selected genes were validated by RNAi screening in human lung cancer cell lines and examined for association with lung cancer patient overall survival using KMplot.com. Depletion of progression-associated genes resulted in pronounced viability and/or cell migration defects in human lung cancer cells. Progression-associated genes moreover exhibited strong associations with overall survival, specifically in human lung adenocarcinoma, but not in squamous cell carcinoma. The KM mouse model faithfully recapitulates key molecular events in human lung cancer and is a useful tool for mechanistic interrogation of LuAd progression.
Subject: Biology, Horticulture Keywords: transcriptome; Solanum lycopersicum; RNA-seq; light intensity distributions; differentially expressed genes
Online: 19 March 2019 (10:42:26 CET)
Fluctuating light affects plant development relative to non-fluctuating light conditions. However, our knowledge of the underlying regulatory mechanisms is still quite limited, particularly from the transcriptional perspective. In order to investigate the influence of different light intensity distributions on tomato plant development, we designed three fluctuating light intensity distributions with the non-fluctuating light intensity as control and compared the transcriptional differences after five weeks of treatment. We found that plant height and aerial/root weight were significantly reduced under all fluctuating light treatments. Transcriptome analysis revealed that the numbers of up- and down-regulated genes had a distinct distribution pattern between the different treatments and the control. The largest difference between the numbers of down- and up-regulated genes was found between treatments 1 and 3, reaching a total of 416 genes. The number and type of the top 20 enriched pathways differed between treatments and control. The largest number of enriched genes was involved in the biosynthesis of secondary metabolites. These results provide insights into the transcriptional regulation of tomato under different light intensity distributions.
ARTICLE | doi:10.20944/preprints201803.0145.v1
Subject: Life Sciences, Genetics Keywords: repetitive elements; RNA-Seq; genomics; evolution; cytogenetics; supernumerary elements; extra chromosomes
Online: 19 March 2018 (08:33:48 CET)
B chromosomes (B) are supernumerary elements found in many taxonomic groups. Most B chromosomes are rich in heterochromatin and composed of abundant repetitive sequences, especially transposable elements (TEs). The origin of Bs is generally linked to the A chromosome complement (A). The first report of a B chromosome in African cichlids was in Astatotilapia latifasciata, which can harbor 0, 1 or 2 B chromosomes. Classical cytogenetic studies found high TE content on the B chromosome of this species. In this study, we aim to understand TE composition and expression in the A. latifasciata genome and its relation to the B chromosome. We use bioinformatics analysis to explore TE genome organization and also TE composition on the B chromosome. Bioinformatics findings were validated by fluorescent in situ hybridization (FISH) and real-time PCR (qPCR). A. latifasciata has a TE content similar to other cichlid fishes and several expanded elements on its B chromosome. With RNA sequencing data (RNA-seq) we showed that all major TE classes are transcribed in brain, muscle and male/female gonads. The evaluation of TE expression between B- and B+ individuals showed that few elements have differential expression among groups and that expanded B elements were not highly transcribed. Putative silencing mechanisms may be acting on the B chromosome of A. latifasciata to prevent adverse consequences of repeat transcription and mobilization in the genome.
ARTICLE | doi:10.20944/preprints201609.0062.v1
Subject: Biology, Plant Sciences Keywords: Nicotiana tabacum; solanesol; RNA-seq; solanesyl diphosphate synthase; gene expression; chlorophyll
Online: 18 September 2016 (10:45:27 CEST)
Solanesol is a noncyclic terpene alcohol composed of nine isoprene units and it mainly accumulates in solanaceous plants, especially tobacco (Nicotiana tabacum L.). Here, RNA-seq analyses of tobacco leaves, stems, and roots were used to identify solanesol biosynthesis genes. Six 1-deoxy-d-xylulose 5-phosphate synthase, two 1-deoxy-d-xylulose 5-phosphate reductoisomerase, two 2-C-methyl-d-erythritol 4-phosphate cytidylyltransferase, four 4-diphosphocytidyl-2-C-methyl-d-erythritol kinase, two 2-C-methyl-d-erythritol 2,4-cyclodiphosphate synthase, four 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase, two 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase, six isopentenyl diphosphate isomerase, and two solanesyl diphosphate synthase (SPS) genes were identified to be involved in solanesol biosynthesis. Furthermore, the two N. tabacum SPS (NtSPS1 and NtSPS2), which had two conserved aspartate-rich DDxxD domains, were highly homologous with SPS enzymes from other solanaceous plant species. In addition, the solanesol contents of three organs, and leaves from four growing stages, corresponded with the distribution of chlorophyll. Our findings provide a comprehensive evaluation of the correlation between the expression of different biosynthetic genes and the accumulation of solanesol in tobacco.
BRIEF REPORT | doi:10.20944/preprints202109.0349.v1
Subject: Medicine & Pharmacology, Gastroenterology Keywords: RNA-Seq; bioinformatics; web application; gene expression; alternative splicing; visualization; molecular epidemiology
Online: 20 September 2021 (16:56:32 CEST)
Gene expression data is key for the functional annotation of single nucleotide polymorphisms (SNPs) identified in genome-wide association studies (GWAS). Expression and splicing quantitative trait loci (e/sQTLs) in normal colon tissue, such as those from the University of Barcelona and University of Virginia RNA sequencing project (BarcUVa-Seq) and the Genotype-Tissue Expression project (GTEx), are required to gain biological insight into risk loci for colon-related diseases. Moreover, transcriptome-wide association studies (TWAS) rely on reference gene expression imputation panels in the tissue of interest to nominate susceptibility genes. It is also of high interest to study the relationships between genes in a network framework. To facilitate these analyses, we have updated and expanded the scope of the Colon Transcriptome Explorer (CoTrEx) to version 2.0. This web-based resource provides exhaustive visualization and analysis of transcriptome-wide gene expression profiles of normal colon tissue from BarcUVa-Seq and GTEx. In addition to the integration of new datasets, CoTrEx 2.0 provides additional e/sQTL sets, as well as gene expression prediction models and regulatory and co-expression networks. It is freely available at https://barcuvaseq.org/cotrex/. Overall, it is of high interest for researchers aiming to investigate the genetic susceptibility to colon-related complex traits and diseases.
ARTICLE | doi:10.20944/preprints202012.0496.v1
Subject: Life Sciences, Biochemistry Keywords: Hungateiclostridium thermocellum; adaptive laboratory evolution; RNA-seq; cellulosomal genes; EMP pathway; monosaccharides
Online: 21 December 2020 (10:36:00 CET)
Hungateiclostridium thermocellum ATCC 27405 is a promising bacterium with a robust ability to degrade lignocellulosic biomass complexes, including crystalline cellulose components, through a multienzyme cellulosomal system. In contrast, it exhibits poor growth on simple monosaccharides such as fructose and glucose. This phenomenon raises many important questions concerning its glycolytic pathways and sugar transport systems. Until now, the detailed mechanisms of H. thermocellum adaptation to growth on monosaccharides have been poorly explored. In this study, adaptive laboratory evolution was applied to train the bacterium on monosaccharides, and genome resequencing was used to detect the genes that had mutated during adaptation. RNA-seq data of the 1st-generation culture growing on either fructose or glucose revealed that several glycolytic genes in the EMP pathway were expressed at lower levels in these cells than in cellobiose-grown cells. After 8 generations of culture on fructose and glucose, the evolved H. thermocellum strains grew faster and yielded greater biomass than the nonadapted strains. Genomic screening also revealed several mutation events in the genomes of the evolved strains, especially in genes responsible for sugar transport and central carbon metabolism. Consequently, these genes could be applied as targets for further metabolic engineering to improve this bacterium for bioindustrial usage.
ARTICLE | doi:10.20944/preprints202103.0187.v1
Subject: Keywords: Transcriptome analysis; Capra hircus; Differential gene expression; Pashmina goat; Barbari goat; RNA-seq
Online: 5 March 2021 (11:50:26 CET)
The Pashmina and Barbari are two famous goat breeds found across wide areas of the Indo-Pak region. Pashmina is famous for its long hair-fiber (Cashmere) production, while Barbari is not selected for this trait. mRNA expression profiling of skin samples from both breeds is therefore an attractive and judicious approach for detecting putative genes involved in this valued trait. Here, we performed differential gene expression analysis on publicly available RNA-Seq data from both breeds. Of the 44,617,994 filtered reads from Pashmina and 55,995,999 from Barbari, 76.48% and 73.69%, respectively, mapped to the ARS1 reference transcriptome assembly. A pairwise comparison of both breeds resulted in 47,159 normalized expressed transcripts, while 8,414 transcripts are differentially expressed above the significance threshold. Among these, 4,788 are upregulated in Pashmina while 3,626 transcripts are upregulated in Barbari. Fifty-nine transcripts, selected on the basis of TMM counts > 500, harbor 57 genes, including 32 LOC genes and 24 annotated genes. Genes with ectopic expression, other than uncharacterized and LOC symbol genes, are Keratins (KRT) and Keratin Associated Proteins (KRTAPs), CystatinA&6, TCHH, SPRR4, PPIA, SLC25A4, S100A11, DMKN, LOR, ANXA2, PRR9 and SFN. All of these genes are likely to be involved in keratinocyte differentiation, sulfur matrix proteins, dermal papilla cells, hair follicle proliferation, hair curvature, wool fiber diameter, hair transition, and hair shaft differentiation and keratinization. These differentially expressed genes are valuable for enhancing the quality and quantity of the pashmina fiber and for overall breed improvement. This study will also provide important information on hair follicle differentiation for further enrichment analyses and for introducing this valued trait to other goat breeds.
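The read and transcript counts quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures reported in the abstract; the variable and function names are illustrative, and the mapped-read counts it derives are approximations from rounded percentages, not values reported by the authors.

```python
# Figures quoted in the abstract.
filtered_reads = {"Pashmina": 44_617_994, "Barbari": 55_995_999}
mapped_percent = {"Pashmina": 76.48, "Barbari": 73.69}


def approx_mapped(breed):
    """Approximate number of mapped reads implied by the reported percentage."""
    return round(filtered_reads[breed] * mapped_percent[breed] / 100)


# Differentially expressed transcripts, split by direction.
up_in_pashmina = 4_788
up_in_barbari = 3_626
total_de = up_in_pashmina + up_in_barbari  # should match the reported 8,414

for breed in filtered_reads:
    print(f"{breed}: ~{approx_mapped(breed):,} mapped reads")
print(f"Differentially expressed transcripts: {total_de:,}")
```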
ARTICLE | doi:10.20944/preprints202006.0144.v1
Subject: Life Sciences, Other Keywords: Corynebacterium pseudotuberculosis; RNA-Seq; co-expression networks; influence genes; stress condition; causal genes
Online: 12 June 2020 (08:46:02 CEST)
Corynebacterium pseudotuberculosis is a Gram-positive bacterium that causes caseous lymphadenitis, a disease that predominantly affects sheep, goats, cattle, buffalo, and horses, but has also been recognized in other animals. This bacterium has a severe economic impact on meat-producing countries. RNA-seq is one of the most commonly used techniques for gene expression studies. Computational analysis of such data through reverse-engineering algorithms leads to a better understanding of the genome-wide complexity of gene interactomes, enabling the identification of genes having the most significant functions inferred by the activated stress response pathways. In this study, we identified the influential or causal genes from four RNA-seq datasets from different stress conditions (high iron, low iron, acid, osmosis, and pH) in C. pseudotuberculosis, using a consensus-based network inference algorithm called miRsig, and identified the causal genes in the network using the miRinfluence tool, which is based on the influence diffusion model. We found that over 50% of the genes identified as influential have some essential cellular functions in the genomes. In the strains analyzed, most of the causal genes have crucial roles or participate in processes associated with response to extracellular stresses, pathogenicity, membrane components, and essential genes. This research brings new insight into the understanding of virulence and infection by C. pseudotuberculosis.
ARTICLE | doi:10.20944/preprints202209.0388.v1
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: spatial single-cell analysis; intratumor heterogeneity; kriging; spatial entropy; Wasserstein distance; cancer; RNA-seq
Online: 26 September 2022 (08:57:58 CEST)
Intratumor heterogeneity (ITH) is associated with therapeutic resistance and poor prognosis in cancer patients, and attributed to genetic, epigenetic, and microenvironmental factors. We developed a new computational platform, GATHER, for geostatistical modeling of single cell RNA-seq data to synthesize high-resolution and continuous gene expression landscapes of a given tumor sample. Such landscapes allow GATHER to map the enriched regions of pathways of interest in the tumor space and identify genes that have spatial differential expressions at locations representing specific phenotypic contexts using measures based on optimal transport. GATHER provides new applications of spatial entropy measures for quantification and objective characterization of ITH. It includes new tools for insightful visualization of spatial transcriptomic phenomena. We illustrate the capabilities of GATHER using real data from breast cancer tumor to study hallmarks of cancer in the phenotypic contexts defined by cancer associated fibroblasts.
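Two of the ingredients named in this abstract and its keywords, entropy and optimal transport, can be illustrated generically. The sketch below is not GATHER's implementation: it computes plain Shannon entropy of expression spread over grid cells (ignoring their spatial arrangement) and the one-dimensional Wasserstein-1 distance between two equal-sized samples. All inputs are hypothetical toy values.

```python
import math


def shannon_entropy(values):
    """Shannon entropy (bits) of non-negative values normalized to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized empirical samples.

    In one dimension with uniform weights, the optimal transport plan
    pairs the sorted samples, so the distance is the mean absolute gap.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy expression of one gene over four grid cells (hypothetical).
uniform_entropy = shannon_entropy([1, 1, 1, 1])       # evenly spread
concentrated_entropy = shannon_entropy([4, 0, 0, 0])  # single hotspot

# Toy expression samples from two phenotypic contexts (hypothetical).
shift = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Evenly spread expression gives the maximum entropy for the grid, while a single hotspot gives zero; a spatially aware entropy measure would additionally account for where the cells lie.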
ARTICLE | doi:10.20944/preprints202202.0357.v1
Subject: Life Sciences, Molecular Biology Keywords: granulosa cells; heat stress; apoptosis; oxidative stress; RNA-seq; transcriptomics; differentially expressed genes; signaling pathways
Online: 28 February 2022 (11:08:42 CET)
Heat stress affects the granulosa cells (GCs) and ovarian follicular microenvironment, causing poor oocyte developmental competence and fertility. This study aimed to investigate the physical responses and global transcriptomic changes in bovine GCs subjected to acute heat stress (43 ℃ for 2 h) in vitro and to provide essential insights into the general interaction at the cell–stress nexus. Heat-stressed GCs exhibited transient proliferation senescence and resumed proliferation at 48 h post-stress, while an immediate post-stress change of culture media had a relatively positive effect on proliferation resumption. Increased accumulation of reactive oxygen species and apoptosis was observed in the heat stress group. In spite of the upregulation of pro-apoptotic and caspase executioner genes, antioxidant and anti-apoptotic genes were also upregulated in heat-stressed GCs. Progesterone and estrogen levels, along with steroidogenic gene expression, declined significantly, in spite of the upregulation of genes involved in cholesterol synthesis. Out of 12385 differentially expressed genes (DEGs), 330 significant DEGs (75 upregulated, 225 downregulated) were subjected to KEGG functional pathway annotation, gene ontology enrichment, and STRING network analyses. Based on the manual query of DEGs and on pathway and enrichment analyses, the extensive interplay observed among all major signaling pathways provides strong evidence for the repression of cellular transcriptional and proliferation activity, averting the effects of heat stress through remodeling of cellular structural proteins and energetic homeostasis. This study presents detailed responses of acute heat-stressed GCs at physical, transcriptional, and pathway levels and offers insights for future studies regarding GC adaptation and their interaction with the oocyte and reproductive system at the ovarian level.
REVIEW | doi:10.20944/preprints202003.0290.v1
Subject: Life Sciences, Molecular Biology Keywords: Histone PTM; RNA Polymerase II; ChIP-seq; chromatin; epigenetics; transcriptional interference; plant; Transcription Cycle; Transcription
Online: 18 March 2020 (17:14:28 CET)
Post-translational modifications (PTMs) of histone residues shape the landscape of gene expression by modulating the dynamic process of RNAPII transcription. The contribution of particular histone modifications to the definition of distinct RNAPII transcription stages remains poorly characterized in plants. Chromatin Immuno-precipitation combined with next-generation sequencing (ChIP-seq) resolves the genomic distribution of histone modifications. Here, we review histone PTM ChIP-seq data in Arabidopsis thaliana and find support for a Genomic Positioning System (GPS) that guides RNAPII transcription. We review the roles of histone PTM “readers”, “writers” and “erasers”, with a focus on the regulation of gene expression and biological functions in plants. The distinct functions of RNAPII transcription during the plant transcription cycle may in part rely on the characteristic histone PTMs profiles that distinguish transcription stages.
ARTICLE | doi:10.20944/preprints201808.0244.v1
Subject: Life Sciences, Molecular Biology Keywords: osteoarthritis; RNA-seq; STR/ort; C57BL/6; MRL/MpJ; ACL injury; PTOA; regeneration; inflammation; B4galnt2
Online: 14 August 2018 (05:47:38 CEST)
Injuries to the anterior cruciate ligament (ACL) often result in post-traumatic osteoarthritis (PTOA). To better understand the molecular mechanisms behind PTOA development following ACL injury, we profiled ACL injury-induced gene expression changes in knee joints of three mouse strains with varying susceptibility to OA: STR/ort (highly susceptible), C57BL/6 (moderately susceptible) and super-healer MRL/MpJ (not susceptible). Right knee joints of the mice were injured using a non-invasive tibial compression injury model that closely mimics ACL rupture in humans and global gene expression was quantified before and at 1-day, 1-week, and 2-weeks post-injury using RNA-seq. Following injury, STR/ort displayed severe cartilage degeneration while MRL/MpJ had little cartilage damage. Gene expression analysis suggested that prolonged inflammation and elevated catabolic activity in STR/ort injured joints, compared to the other two strains may be responsible for the severe PTOA phenotype observed in this strain. MRL/MpJ had the lowest expression values for several inflammatory cytokines and catabolic enzymes activated in response to ACL injury. Furthermore, we identified several genes highly expressed in MRL/MpJ compared to the other two strains including B4galnt2 and Tpsab1 which may contribute to enhanced healing in the MRL/MpJ. Overall, this study has increased our knowledge of early molecular changes associated with PTOA development.
ARTICLE | doi:10.20944/preprints202203.0110.v1
Subject: Biology, Other Keywords: benchmarking; bioinformatics; defective viral genomes; gradient boosting; machine learning; RNA-seq; SARS-CoV-2; virus replication
Online: 7 March 2022 (16:25:18 CET)
The generation of different types of defective viral genomes (DVGs) is an unavoidable consequence of the error-prone replication of RNA viruses. In recent years, a particular class of DVGs, those containing long deletions or genome rearrangements, has gained interest due to their potential therapeutic and biotechnological applications. Identifying such DVGs in high-throughput sequencing data has become an interesting computational problem. To date, several algorithms have been proposed, though all incur false positives, a problem of practical interest if such DVGs have to be synthesized and tested in the laboratory. Here we develop a novel software, DVGfinder, that wraps the two most commonly used algorithms into a pipeline that predicts DVGs. Using a gradient boosting classifier machine learning algorithm, we evaluate the performance of DVGfinder compared to previous algorithms and find that it outperforms them in precision and sensitivity on simulated datasets. DVGfinder generates user-friendly output files in HTML format that can assist users in identifying DVGs based on their associated probability of being true positives.
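The precision and sensitivity comparison mentioned above can be sketched on a simulated dataset where the true DVGs are known. The code below does not use DVGfinder or its output format; representing each DVG as a hypothetical (breakpoint, rejoin) coordinate pair and requiring exact matches are both assumptions made for illustration.

```python
def precision_and_sensitivity(predicted, truth):
    """Compare predicted DVGs with the simulated ground truth.

    Both arguments are iterables of hashable DVG identifiers; here each
    DVG is a hypothetical (breakpoint, rejoin) coordinate pair.
    """
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # predicted and real
    fp = len(predicted - truth)   # predicted but not real
    fn = len(truth - predicted)   # real but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity


# Toy example: three predictions against four simulated DVGs.
predicted_dvgs = [(100, 900), (250, 700), (400, 1200)]
simulated_dvgs = [(100, 900), (250, 700), (50, 1500), (300, 800)]
precision, sensitivity = precision_and_sensitivity(predicted_dvgs, simulated_dvgs)
```

A practical benchmark would likely allow a tolerance of a few nucleotides when matching junction coordinates rather than requiring exact equality.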
ARTICLE | doi:10.20944/preprints202201.0348.v1
Subject: Medicine & Pharmacology, Other Keywords: Data Science; Genomic Data Science; Machine Learning; Network Analysis; RNA-Seq; Precision Medicine; Subtyping; Parkinson’s Disease
Online: 24 January 2022 (11:36:51 CET)
Precision medicine emphasizes fine-grained diagnostics, taking individual variability into account to enhance treatment effectiveness. The heterogeneity of Parkinson's Disease (PD) among individuals is evidence that disease subtypes exist, and assigning individuals to subgroups is necessary for a better understanding of disease mechanisms and for designing precise treatment approaches. The purpose of this study was to identify PD subtypes using RNA-Seq data in a combined pipeline including unsupervised machine learning, bioinformatics, and network analysis. A total of 210 post-mortem brain RNA-Seq samples from PD (n = 115) and Normal Controls (NC, n = 95) were obtained through systematic data retrieval following the PRISMA statement, and a fully data-driven clustering pipeline was applied to identify PD subtypes. Bioinformatics and network analyses were performed to characterize the disease mechanisms of the identified PD subtypes and to identify target genes for drug repurposing. Two PD clusters were identified and 42 DEGs were found (p.adjusted ≤ 0.01). PD clusters had significantly different gene network structures (p < 0.0001) and phenotype-specific disease mechanisms, highlighting the differential involvement of the Wnt/β-catenin pathway regulating adult neurogenesis. NEUROD1 was identified as a key regulator of gene networks, and ISX9 and PD98059 were identified as NEUROD1-interacting compounds with disease-modifying potential, reducing the effects of dopaminergic neurodegeneration. This hybrid data analysis approach could enable precision medicine applications by providing insights for the identification and characterization of pathological subtypes. This workflow has proven useful on PD brain RNA-Seq, but its application to other neurodegenerative diseases is encouraged.
ARTICLE | doi:10.20944/preprints202112.0111.v1
Subject: Biology, Plant Sciences Keywords: Durum wheat; heat stress; grain weight; grain quality; RNA-seq; gene regulatory network; DOF transcription factor
Online: 7 December 2021 (23:38:32 CET)
In a changing climate, extreme weather events such as heat waves will be more frequent and could affect grain weight and the quality of crops such as wheat, one of the most significant crops in terms of global food security. In this work, we characterized the response of Triticum turgidum ssp. durum wheat to a short-term heat-stress (HS) treatment at transcriptomic and physiological levels during early grain filling in glasshouse experiments. We found a significant reduction in grain weight and size from HS treatment. Grain quality was also affected, showing a decrease in starch content in addition to increases in grain protein levels. Moreover, an RNA-seq analysis of durum wheat grains allowed us to identify 1590 differentially expressed genes related to photosynthesis, response to heat, and carbohydrate metabolic process. A gene regulatory network analysis of HS-responsive genes uncovered novel transcription factors (TFs) controlling the expression of genes involved in abiotic stress response and grain quality, such as a member of the DOF family predicted to regulate glycogen and starch biosynthetic processes in response to HS in grains. In summary, our results provide new insights into the extensive transcriptome reprogramming that occurs during short-term HS in durum wheat grains.
REVIEW | doi:10.20944/preprints202007.0466.v1
Subject: Life Sciences, Genetics Keywords: Alternative Splicing; RNA-Seq; Machine Learning; Deep Learning; Recommender Systems; Multiple Instance Learning; mRNA Isoforms; Gene Ontology
Online: 20 July 2020 (10:53:23 CEST)
Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequences, but they can have significantly diverse effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as in the response to environmental stress, the regulation of gene expression, and human heritable and plant diseases. The mRNA isoforms of the same gene, such as the apoptosis-associated CASP3 gene, can have dramatically different functions. The shorter mRNA isoform product CASP3-S inhibits apoptosis, while the longer CASP3-L mRNA isoform promotes apoptosis. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. Recent years have, however, seen the development of several computational methods aimed at predicting biological functions at the mRNA isoform level. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting biological function at the individual mRNA isoform level.
ARTICLE | doi:10.20944/preprints202004.0108.v1
Subject: Medicine & Pharmacology, Nutrition Keywords: prebiotics; oligosaccharides; GOS; FOS; RNA-seq; transcriptome; differential gene expression; functional pathway analysis; Caco-2; polarized monolayers
Online: 7 April 2020 (13:37:18 CEST)
Prebiotic oligosaccharides are widely used as human and animal feed additives for their beneficial effects on the gut microbiota. However, there are limited data to assess the direct effect of such functional foods on the transcriptome of intestinal epithelial cells. The purpose of this study is to describe the differential transcriptomes and cellular pathways of colonic cells directly exposed to galacto-oligosaccharides (GOS) and fructo-oligosaccharides (FOS). We have examined the differential gene expression of polarized Caco-2 cells treated with GOS or FOS and their respective mock-treated cells using mRNA sequencing (RNA-seq). A total of 89 significantly differentially expressed genes were identified between GOS and mock-treated groups. For FOS treatment, only 12 genes were significantly differentially expressed relative to the control group. KEGG and Gene Ontology functional analysis revealed that genes up-regulated in the presence of GOS were involved in digestion and absorption processes, fatty acid and steroid metabolism, potential antimicrobial proteins, and energy-dependent and -independent transmembrane trafficking of solutes and amino acids. Using our data, we have established complementary non-prebiotic modes of action for these frequently used dietary fibers.
ARTICLE | doi:10.20944/preprints201811.0183.v2
Subject: Life Sciences, Molecular Biology Keywords: sequencing technologies; NGS; genome research; genome assembly; variant calling; RNA-Seq; transcriptome assembly; bioinformatics; molecular biology; education
Online: 13 November 2018 (10:22:06 CET)
Combined awareness about the power and limitations of bioinformatics and molecular biology enables advanced research based on high-throughput data. Despite an increasing demand for scientists with a combined background in both fields, the education in dry lab and wet lab is often separated. This work describes an example of integrated education with focus on genomics and transcriptomics. Participants learn computational and molecular biology methods in the same practical course. Peer-review is applied as a teaching method to foster cooperative learning of students with heterogeneous backgrounds. Evaluation results indicate acceptance and appreciation of this approach.
ARTICLE | doi:10.20944/preprints202205.0378.v1
Subject: Life Sciences, Genetics Keywords: Drosophila; leg imaginal disc; lncRNA; development; scRNA-seq; scATAC-seq
Online: 27 May 2022 (09:48:47 CEST)
The Drosophila imaginal disc has been an excellent model for the study of developmental gene regulation. In particular, long non-coding RNAs (lncRNAs) have gained widespread attention in recent years due to their important role in gene regulation. Their specific spatiotemporal expressions further support their role in developmental processes and diseases. In this study, we explored the role of a novel lncRNA in Drosophila leg development by dissecting and dissociating w1118 third-instar larval third leg (L3) discs into single cells and single nuclei, and performing single-cell RNA-sequencing (scRNA-seq) and single-cell assays for transposase-accessible chromatin (scATAC-seq). Single-cell transcriptomics analysis of the L3 discs across three developmental timepoints revealed different cell types and identified lncRNA:CR33938 as a distal specific gene with high expression in late development. This was further validated by fluorescence in-situ hybridization (FISH). The scATAC-seq results reproduced the single-cell transcriptomics landscape and elucidated the distal cell functions at different timepoints. Furthermore, overexpression of lncRNA:CR33938 in the S2 cell line increased the expression of leg development genes, further confirming its important role in development.
ARTICLE | doi:10.20944/preprints202104.0344.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Lysine; Rice; Amino Acids; Saline Stress; Abiotic Stress; Gene Regulatory Network; Bayesian Network; Parameter Estimation; Inference; RNA Seq
Online: 13 April 2021 (10:52:26 CEST)
Lysine is the first limiting essential amino acid in rice because it is present in the lowest quantity compared to all the other amino acids. Amino acids are the building blocks of proteins and play an essential role in maintaining the human body’s healthy functioning. Rice is a staple food for a large proportion of the global population, so increasing the lysine content in rice will improve its nutritional value. In this paper, we studied the lysine biosynthesis pathway in rice (Oryza sativa) to identify the regulators of the lysine reporter gene LYSA (LOC_Os02g24354). Genetically intervening at the regulators has the potential to increase the overall lysine content in rice. We modeled the lysine biosynthesis pathway in rice seedlings under normal and saline (NaCl) stress conditions using Bayesian networks. We estimated the model parameters using experimental data and identified the gene DAPF (LOC_Os12g37960) as a positive regulator of the lysine reporter gene LYSA under both normal and saline stress conditions. Based on this analysis, we conclude that the gene DAPF is a potent candidate for genetic intervention. Upregulating DAPF using methods such as CRISPR-Cas9 has the potential to upregulate the lysine reporter gene LYSA and increase the overall lysine content in rice.
ARTICLE | doi:10.20944/preprints201907.0140.v1
Subject: Life Sciences, Molecular Biology Keywords: PlGF; PGF; blood-retinal barrier; RNA Seq; HREC; gene ontology; fastQC; Trimmomatic; KEGG; pentose phosphate pathway; TGF-β
Online: 10 July 2019 (07:48:20 CEST)
Placental growth factor (PlGF or PGF) is a member of the VEGF family, which is known to play a critical role in pathological angiogenesis, inflammation, and endothelial cell barrier function. However, the molecular mechanisms by which PlGF mediates its effects in non-proliferative diabetic retinopathy (DR) remain elusive. In this study, we performed transcriptome-wide profiling of differential gene expression for human retinal endothelial cells (HRECs) treated with PlGF antibody. The effect of antibody treatment on the samples was validated using trans-endothelial electrical resistance (TEER) and western blot. A total of 3760 genes (1750 upregulated and 2010 downregulated) were found to be differentially expressed between the control and PlGF antibody treatment groups. These differentially expressed genes (DEGs) were used for gene ontology and enrichment analysis to identify gene functions, signaling pathways, and interaction networks. The gene ontology results revealed that catalytic activity (GO:0003824) of molecular function, cell (GO:0005623) of cellular component, and cellular process (GO:0009987) of biological process were among the most enriched terms. Pathways such as TGF-β, VEGF-VEGFR2, p53, apoptosis, pentose phosphate pathway, and ubiquitin-proteasome pathway were among the most enriched, and TGF-β1 was identified as a primary upstream regulator. These data provide new insights into the underlying molecular mechanisms of PlGF in mediating biological functions in relation to DR.
ARTICLE | doi:10.20944/preprints202101.0443.v1
Subject: Life Sciences, Biochemistry Keywords: Trichome; type IV; K-seq; QTLs mapping; QTL-seq; tomato; Solanum pimpinellifolium
Online: 22 January 2021 (12:11:59 CET)
Trichomes are a common morphological defense against pests; in particular, type IV glandular trichomes have been associated with resistance against different invertebrates. Cultivated tomatoes usually lack type IV trichomes or have them at very low density. Thus, specific breeding programs to incorporate these natural defences, which are common within the Solanum genus, might enable more sustainable pest management. We have identified an S. pimpinellifolium accession with a very high density of this type of trichome. Two F2 mapping populations using two different parents have been developed, characterized and genotyped using a new genotyping methodology, K-seq. We have been able to build an ultra-dense genetic map with 147,326 markers and an average distance between markers of 0.2 cM, which has allowed us to perform a detailed mapping. We have used two different families and two different approaches, QTL mapping and QTL-seq, to identify several QTLs on chromosomes 5, 6, 9 and 11 implicated in the control of type IV trichome development in this accession. The QTL located on chromosome 9 is a major QTL, not previously reported in S. pimpinellifolium, that increases trichome density by a factor of 9.
ARTICLE | doi:10.20944/preprints201610.0041.v1
Subject: Life Sciences, Microbiology Keywords: RNA; DNA; Repetitive sequences; RNA stem loops; RNA group identities
Online: 12 October 2016 (10:58:59 CEST)
Current knowledge of the RNA world indicates that two different genetic codes are present throughout the living world. In contrast to non-coding RNAs, which are built of repetitive nucleotide syntax, the sequences that serve as templates for proteins share, as their main characteristic, a non-repetitive syntax. The difference in their syntax structure is consistent with the difference in the functions they represent. Whereas non-coding RNAs build groups that serve as regulatory tools in nearly all genetic processes, the coding sections represent the evolutionarily successful function of the genetic information storage medium. The DNA genomes themselves are rather inactive, whereas the non-coding RNA domain is highly active, even as non-random genetic innovation operators. This indicates that repetitive syntax is the essential prerequisite for RNA interactions to install variable RNA-group identities, whereas the non-repetitive syntax serves as a stable conservation tool for successful selection processes arising from the cooperation and competition of RNA groups. The interaction opportunities of RNA loops with repetitive syntax are higher than those of loops with non-repetitive syntax. Interestingly, these two genetic codes resemble the function of all natural languages, i.e., (a) everyday language use for organization and coordination of biotic group behavior, and (b) artificial (instrumental) language use for conservation of blueprints for complex protein-body constructions.
ARTICLE | doi:10.20944/preprints201902.0172.v4
Subject: Life Sciences, Molecular Biology Keywords: RNA-dependent amplification of mammalian mRNA; physiologically occurring intracellular PCR, iPCR; RNA-dependent RNA polymerase, RdRp; chimeric RNA; sense-strand RNA; antisense-strand RNA
Online: 12 June 2019 (12:21:59 CEST)
The transfer of protein-encoding genetic information from DNA to RNA to protein, a process formalized as the “Central Dogma of Molecular Biology”, has undergone a significant evolution since its inception. It was amended to account for the information flow from RNA to DNA, the reverse transcription, and for the information transfer from RNA to RNA, the RNA-dependent RNA synthesis. These processes, both potentially leading to protein production, were initially described only in viral systems, and although RNA-dependent RNA polymerase activity was shown to be present, and RNA-dependent RNA synthesis found to occur, in mammalian cells, its function was presumed to be restricted to regulatory roles. However, recent results, obtained with multiple mRNA species in several mammalian systems, strongly indicate the occurrence of protein-encoding RNA to RNA information transfer in mammalian cells. It can result in the rapid production of extraordinary quantities of specific proteins, as was seen in cases of terminal cellular differentiation and during cellular deposition of extracellular matrix molecules. A malfunction of this process may be involved in pathologies associated either with the deficiency of a protein normally produced by this mechanism or with the abnormal abundance of a protein or of its C-terminal fragment. It seems to be responsible for some types of familial thalassemia and may underlie the overproduction of beta amyloid in sporadic Alzheimer’s disease. The aim of the present article is to systematize the current knowledge and understanding of this pathway. The outlined framework introduces unexpected features of mRNA amplification such as its ability to generate polypeptides non-contiguously encoded in the genome, its second Tier, a physiologically occurring intracellular polymerase chain reaction, iPCR, a Two-Tier Paradox and RNA Dark Matter.
RNA-dependent mRNA amplification represents a new mode of genomic protein-encoding information transfer in mammalian cells. Its potential physiological impact is substantial and it appears relevant to multiple pathologies; its understanding opens new avenues of therapeutic interference and suggests powerful novel bioengineering approaches, and its further rigorous investigation is highly warranted.
REVIEW | doi:10.20944/preprints202104.0484.v1
Subject: Life Sciences, Biochemistry Keywords: RNA world theory; Viral RNA; Genome stability; Viral evolution; Mutational signatures; RNA dependent RNA polymerase, RdRp; RNA recombination; RNA damage; Hypermutation; APOBEC; ADAR; RNA editing; SARS-CoV-2; rubella virus
Online: 19 April 2021 (13:22:01 CEST)
The current SARS-CoV-2 pandemic underscores the importance of understanding the evolution of RNA genomes. While RNA is subject to the formation of similar lesions as DNA, the evolutionary and physiological impacts RNA lesions have on viral genomes are yet to be characterized. Lesions that may drive the evolution of RNA genomes can induce breaks that are repaired by recombination or can cause base substitution mutagenesis, also known as base editing. Over the past decade or so, base editing mutagenesis of DNA genomes has been the subject of many studies, revealing that exposed ssDNA is subject to hypermutation that is involved in the etiology of cancer. However, base editing of RNA genomes has not been studied to the same extent. Recently, hypermutation of single-stranded RNA viral genomes has also been documented, though its role in evolution and population dynamics remains unclear. Here, we will summarize the current knowledge of key mechanisms and causes of RNA genome instability, covering areas from the RNA world theory to the SARS-CoV-2 pandemic of today. We will also highlight the key questions that remain as they pertain to RNA genome instability, mutation accumulation, and experimental strategies for addressing these questions.
ARTICLE | doi:10.20944/preprints202112.0071.v1
Subject: Biology, Plant Sciences Keywords: Phaseolus vulgaris; Colletotrichum lindemuthianum; RNA silencing; Argonaute; double-stranded RNA binding (DRB); RNA-dependent RNA polymerase (RDR); Pol IV
Online: 6 December 2021 (12:42:51 CET)
RNA silencing serves key roles in a multitude of cellular processes, including development, stress responses, metabolism, and maintenance of genome integrity. Dicer, Argonaute (AGO), double-stranded RNA binding (DRB) proteins, RNA-dependent RNA polymerase (RDR) and the DNA-dependent RNA polymerases known as Pol IV and Pol V form the core components that trigger RNA silencing. Common bean (Phaseolus vulgaris) is an important staple crop worldwide. In this study, we aimed to unravel the components of the RNA-guided silencing pathway in this non-model plant, taking advantage of the availability of two genome assemblies of Andean and Meso-American origin. We identified six PvDCLs, thirteen PvAGOs, ten PvDRBs, and five PvRDRs in both genotypes, suggesting no recent gene amplification or deletion after the gene pool separation. In addition, we identified one PvNRPD1 and one PvNRPE1 encoding the largest subunits of Pol IV and Pol V, respectively. These genes were categorized into subgroups based on phylogenetic analyses. Comprehensive analyses of gene structure, genomic localization and similarity among these genes were performed. Their expression patterns were investigated by means of expression models in different organs using online data and quantitative RT-PCR after pathogen infection. Several of the candidate genes were up-regulated after infection with the fungus Colletotrichum lindemuthianum.
ARTICLE | doi:10.20944/preprints201809.0082.v1
Subject: Medicine & Pharmacology, Cardiology Keywords: atherosclerosis; coronary aortic disease; gene set enrichment analysis; heart disease; Apoe mouse; transcriptomics; RNA-seq analysis; pathway enrichment analysis; mouse; precision medicine; New Zealand White rabbit
Online: 5 September 2018 (04:49:40 CEST)
The central promise of personalized medicine is individualized treatments that target molecular mechanisms underlying the physiological changes and symptoms arising from disease. We demonstrate a bioinformatics analysis pipeline as a proof-of-principle to test the feasibility and practicality of comparative transcriptomics to classify two of the most popular in vivo diet-induced models of coronary atherosclerosis, apolipoprotein E null mice and New Zealand White rabbits. Transcriptomics analyses indicate that the two models extensively share dysregulated genes, albeit with some unique pathways. For instance, while both models have alterations in the mitochondrion, the biochemical pathway analysis revealed that Complex IV in the electron transfer chain is higher in mice, whereas the rest of the electron transfer chain components are higher in the rabbits. Several fatty acid anabolic pathways are expressed higher in mice, whereas fatty acid and lipid degradation pathways are higher in rabbits. This reflects the differences between two translational models of atherosclerosis. This study validates transcriptome analysis as a potential method to precisely identify altered cellular and molecular pathways in atherosclerotic disease, which can be used to individualize treatment even in the absence of genetic data.
ARTICLE | doi:10.20944/preprints202112.0225.v1
Subject: Chemistry, Medicinal Chemistry Keywords: RNA targeting; RNA-based interactions; bis-3-chloropiperidines
Online: 14 December 2021 (11:13:29 CET)
After a long limbo, RNA has gained its credibility as a druggable target, fully earning its deserved role in the next-generation area of pharmaceutical R&D. We have recently probed the Trans-Activation Response element (TAR), an RNA stem–bulge–loop domain of the HIV-1 genome, with bis-3-chloropiperidines (B-CePs), and revealed the compounds' unique behavior in stabilizing TAR structure, thus impairing in vitro the chaperone activity of the HIV-1 nucleocapsid (NC) protein. Seeking to elucidate the determinants of B-CePs inhibition, we have further characterized here their effects on the target TAR and its NC recognition, while developing quantitative analytical approaches for the study of multicomponent RNA-based interactions.
REVIEW | doi:10.20944/preprints202012.0452.v1
Subject: Life Sciences, Biochemistry Keywords: RNA; self-amplifying RNA; replicon; vaccine; drug delivery
Online: 18 December 2020 (11:12:44 CET)
This review will explore the four major pillars required for the design and development of an saRNA vaccine: antigen design, vector design, non-viral delivery systems, and manufacturing (both saRNA and lipid nanoparticles (LNP)). It will report on the major innovations and the preclinical and clinical data reported in the last five years, and will discuss future prospects.
Subject: Biology, Other Keywords: endosome; exosome; extracellular vesicles; fungal RNA biology; membrane trafficking; RNA transport; RNA recognition motif
Online: 21 January 2020 (03:26:40 CET)
Membrane-coupled RNA transport is an emerging theme in fungal biology. This review focuses on the RNA cargo and mechanistic details of transport via two inter-related sets of organelles: endosomes and extracellular vesicles for intra- and intercellular RNA transfer. Simultaneous transport and translation of messenger RNAs (mRNAs) on the surface of shuttling endosomes is a conserved process pertinent to highly polarised eukaryotic cells, such as hyphae or neurons. Here we detail the endosomal mRNA transport machinery components and mRNA targets of the core RNA-binding protein Rrm4. Extracellular vesicles (EVs) are newly garnering interest as mediators of intercellular communication, especially between pathogenic fungi and their hosts. Landmark studies in plant-fungus interactions indicate EVs as a means of delivering various cargos, most notably small RNAs (sRNAs), for cross-kingdom RNA interference. Recent advances and implications of the nascent field of fungal EVs are discussed and potential links between endosomal and EV-mediated RNA transport are proposed.
ARTICLE | doi:10.20944/preprints202002.0299.v1
Subject: Life Sciences, Cell & Developmental Biology Keywords: SARS-CoV-2; infection; scRNA-Seq; ACE2; spermatogonia
Online: 21 February 2020 (02:42:15 CET)
In December 2019, a novel coronavirus (SARS-CoV-2) was identified in patients with pneumonia (called COVID-19) in Wuhan, Hubei Province, China. SARS-CoV-2 shares high sequence similarity with severe acute respiratory syndrome coronavirus (SARS-CoV) and uses the same cell entry receptor, angiotensin-converting enzyme 2 (ACE2). Several studies have provided bioinformatic evidence of potential routes for SARS-CoV-2 infection in the respiratory, cardiovascular, digestive and urinary systems. However, whether the reproductive system is a potential target of SARS-CoV-2 infection has not been determined. Here, we investigate the expression pattern of ACE2 in adult human testis at the level of single-cell transcriptomes. The results indicate that ACE2 is predominantly enriched in spermatogonia, Leydig and Sertoli cells. Gene ontology analyses indicate that GO categories associated with viral reproduction and transmission are highly enriched in ACE2-positive spermatogonia, while terms related to male gamete generation are down-regulated. GO terms related to cell-cell junctions and immunity are increased in ACE2-positive Leydig and Sertoli cells, but GO terms related to mitochondria and reproduction are decreased. These findings provide evidence that human testes are a potential target of SARS-CoV-2 infection, which may have a significant impact on our understanding of the pathophysiology of this rapidly spreading disease.
ARTICLE | doi:10.20944/preprints202107.0531.v1
Subject: Biology, Anatomy & Morphology Keywords: A.thaliana; HaloTag; RNA-binding proteins; RNA pulldown assay; RNA-protein complexes; cold shock domain protein
Online: 23 July 2021 (09:32:28 CEST)
The study of RNA-protein interactions and the identification of RNA targets are among the key aspects of understanding RNA biology. Currently, various methods are available to investigate these interactions, in particular the RNA pulldown assay. In the present paper, a method based on the HaloTag technology is presented, called Halo-RPD (HaloTag RNA PullDown). The proposed protocol uses plants with stable fusion protein expression and MagneBeads magnetic beads to capture RNA-protein complexes directly from the cytoplasmic lysate of transgenic A. thaliana plants. The key stages described in the paper are as follows: 1) preparation of the magnetic beads; 2) tissue homogenization and collection of control samples; 3) precipitation and washing of RNA-protein complexes; 4) evaluation of protein binding efficacy; 5) RNA isolation; 6) analysis of the obtained RNA. Recommendations for better NGS assay designs are provided.
ARTICLE | doi:10.20944/preprints202103.0179.v1
Subject: Life Sciences, Biochemistry Keywords: RNA interference; dsRNA delivery; small RNA production; dsRNA formulation
Online: 5 March 2021 (10:01:04 CET)
Plant pathogenic fungi are the largest group of disease-causing agents on crop plants and represent a persistent and significant threat to agriculture worldwide. Conventional approaches based on the use of pesticides raise societal concern about their impact on the environment and human health, and alternative control methods are urgently needed. The rapid improvement and extensive implementation of RNAi technology for various model and non-model organisms has provided the initial framework to adapt this post-transcriptional gene silencing technology for the management of fungal pathogens. In this review, we describe exogenous RNAi in plant pathogenic fungi and discuss small RNA production, formulation, and RNAi delivery methods. We explore some challenges and possible solutions. Overall, exogenous RNAi holds great potential for the control of plant diseases caused by pathogenic fungi.
ARTICLE | doi:10.20944/preprints202011.0213.v1
Subject: Biology, Anatomy & Morphology Keywords: Gekkota; reptiles; DNA-seq; sex chromosomes; sex determination; qPCR
Online: 5 November 2020 (14:14:53 CET)
Geckos demonstrate a remarkable variability in sex determination systems, but our limited knowledge prohibits accurate conclusions on the evolution of sex determination in this group. Eyelid geckos (Eublepharidae) are of particular interest, as they encompass species with both environmental and genotypic sex determination. We identified for the first time the X-specific gene content in the Yucatán banded gecko, Coleonyx elegans, possessing X1X1X2X2/X1X2Y multiple sex chromosomes by comparative genome coverage analysis between sexes. The X-specific gene content of Coleonyx elegans was revealed to be partially homologous to genomic regions linked to the chicken autosomes 1, 6 and 11. A qPCR-based test was applied to validate a subset of X-specific genes by comparing the difference in gene copy numbers between sexes, and to explore the homology of sex chromosomes across 11 eublepharid, two phyllodactylid and one sphaerodactylid species. Homologous sex chromosomes are shared between Coleonyx elegans and Coleonyx mitratus, two species diverged approximately 34 million years ago, but not with other tested species. As far as we know, the X-specific gene content of Coleonyx elegans / Coleonyx mitratus was never involved in the sex chromosomes of other gecko lineages, indicating that the sex chromosomes in this clade of eublepharid geckos evolved independently.
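The comparative coverage analysis between sexes mentioned in the abstract above rests on simple arithmetic: in a male-heterogametic system, X-specific genes are present in two copies in females and one copy in males, so their normalized female-to-male read coverage ratio should be close to 2, while autosomal genes should be close to 1. The following is a minimal sketch of that idea only, not the authors' pipeline; the gene names, coverage values, thresholds, and the median normalization (which assumes most genes are autosomal) are all hypothetical.

```python
from statistics import median


def x_specific_candidates(female_cov, male_cov, low=1.5, high=2.5):
    """Return genes whose normalized female/male coverage ratio is near 2.

    female_cov, male_cov: dicts mapping gene name -> mean read coverage.
    low, high: hypothetical thresholds bracketing the expected ratio of 2.
    """
    # Normalize each sample by its median coverage to correct for depth.
    f_med = median(female_cov.values())
    m_med = median(male_cov.values())
    candidates = {}
    for gene, f_cov in female_cov.items():
        m_cov = male_cov.get(gene, 0)
        if m_cov == 0:
            continue  # ratio undefined; skip rather than guess
        ratio = (f_cov / f_med) / (m_cov / m_med)
        if low <= ratio <= high:
            candidates[gene] = ratio
    return candidates


# Hypothetical coverage values for four genes.
female = {"geneA": 100, "geneB": 98, "geneC": 102, "geneX": 100}
male = {"geneA": 100, "geneB": 100, "geneC": 100, "geneX": 50}

candidates = x_specific_candidates(female, male)
print(candidates)
```

The same two-to-one expectation underlies the qPCR validation described in the abstract, where gene copy number differences between sexes are measured directly.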
REVIEW | doi:10.20944/preprints202102.0496.v1
Subject: Medicine & Pharmacology, Allergology Keywords: non-coding; leukemia; B-cell; RNA-sequencing; small RNA-sequencing
Online: 22 February 2021 (16:33:30 CET)
Non-coding RNAs (ncRNAs) comprise a diverse class of non-protein-coding transcripts that regulate critical cellular processes associated with cancer. Advances in RNA-sequencing (RNA-Seq) have led to the characterization of non-coding RNA expression across different types of human cancers. Through comprehensive RNA-Seq profiling, a growing number of studies demonstrate that ncRNAs, including long non-coding RNAs (lncRNAs) and microRNAs (miRNAs), play central roles in progenitor B-cell Acute Lymphoblastic Leukemia (B-ALL) pathogenesis. Furthermore, due to their central roles in cellular homeostasis and their potential as biomarkers, the study of ncRNAs continues to provide new insight into the molecular mechanisms of B-ALL. This article reviews the ncRNA signatures reported for all B-ALL subtypes, focusing on technological developments in transcriptome profiling and recently discovered examples of ncRNAs with biologic and therapeutic relevance in B-ALL.
HYPOTHESIS | doi:10.20944/preprints202105.0520.v1
Subject: Life Sciences, Biochemistry Keywords: genome evolution; ribozymes; RNA ligase; early Earth; autocatalytic sets; RNA world
Online: 21 May 2021 (10:16:35 CEST)
The evolutionary origin of the genome remains elusive. Here, I hypothesize that its first iteration, the protogenome, was a multi-ribozyme RNA. It evolved, likely within liposomes (the protocells) forming in dry-wet cycling environments, through the random fusion of ribozymes by a ligase and was amplified by a polymerase. The protogenome thereby linked, in one molecule, the information required to seed the protometabolism (a combination of RNA-based autocatalytic sets) in newly forming protocells. If this combination of autocatalytic sets was evolutionarily advantageous, the protogenome would have amplified in a population of multiplying protocells. It likely was a quasispecies with redundant information, e.g., multiple copies of one ribozyme. As such, new functionalities could evolve, including a genetic code. Once one or more components of the protometabolism were templated by the protogenome (e.g., when a ribozyme was replaced by a protein enzyme), and/or addiction modules evolved, the protometabolism became dependent on the protogenome. Along with increasing fidelity of the RNA polymerase, the protogenome could grow, e.g., by incorporating additional ribozyme domains. Finally, the protogenome could have evolved into a DNA genome with increased stability and storage capacity. I will provide suggestions for experiments to test some aspects of this hypothesis.
ARTICLE | doi:10.20944/preprints202105.0322.v1
Subject: Biology, Anatomy & Morphology Keywords: Virus; plant virus; long noncoding RNA; replication; positive sense RNA virus
Online: 14 May 2021 (11:01:56 CEST)
Long noncoding RNAs (lncRNAs) of virus origin accumulate in cells infected by many positive strand (+) RNA viruses to bolster viral infectivity. Their biogenesis mostly utilizes exoribonucleases of host cells that degrade viral genomic or subgenomic RNAs in the 5’-to-3’ direction until being stalled by well-defined RNA structures. Here we report a viral lncRNA that is produced by a novel replication-dependent mechanism. This lncRNA corresponds to the last 283 nucleotides of the turnip crinkle virus (TCV) genome, hence is designated tiny TCV subgenomic RNA (ttsgR). ttsgR accumulated to high levels in TCV-infected Nicotiana benthamiana cells when the TCV-encoded RNA-dependent RNA polymerase (RdRp), also known as p88, was overexpressed. Both (+) and (-) strand forms of ttsgR were produced in these cells in a manner dependent on the RdRp functionality. Strikingly, templates as short as ttsgR itself were sufficient to program ttsgR amplification, as long as the TCV-encoded replication proteins, p28 and p88, were provided in trans. Consistent with its replicational origin, ttsgR accumulation required a 5’ terminal G3(A/U)4 motif shown by others to be crucial for the replication of a TCV satellite RNA. More importantly, introducing a new G3(A/U)4 motif elsewhere in the TCV genome was alone sufficient to cause the emergence of another lncRNA. Collectively our results unveil a replication-dependent mechanism for the biogenesis of viral lncRNAs, thus suggesting that multiple mechanisms, individually or in combination, may be responsible for viral lncRNA production.
ARTICLE | doi:10.20944/preprints202003.0347.v1
Online: 23 March 2020 (07:46:54 CET)
SARS-CoV-2, the causative agent of the ongoing COVID-19 pandemic, belongs to the Coronaviridae family. Like other members of this family, the virus possesses a positive-sense single-stranded RNA genome. The genome encodes the nsp12 protein, which houses the RNA-dependent RNA polymerase (RdRP) activity responsible for replication of the viral genome. A homology model of nsp12 was prepared using the structure of the SARS nsp12 (6NUR) as a template. The model was used to carry out in silico screening to identify molecules among natural products or FDA-approved drugs that can potentially inhibit the activity of nsp12. This exercise showed that vitamin B12 (methylcobalamin) may bind to the active site of the nsp12 protein. A model of nsp12 in complex with substrate RNA and incoming NTP showed that the vitamin B12 binding site overlaps with that of the incoming nucleotide. A comparison of the calculated binding energies for RNA plus NTP and for methylcobalamin suggested that the vitamin may bind to the active site of nsp12 with significant affinity. It is, therefore, possible that methylcobalamin binding may prevent association with RNA and NTP and thus inhibit the RdRP activity of nsp12. Overall, our computational studies suggest that the methylcobalamin form of vitamin B12 may serve as an effective inhibitor of the nsp12 protein.
ARTICLE | doi:10.20944/preprints201805.0234.v1
Subject: Biology, Other Keywords: non-coding RNA; telomerase RNA; secondary structure; synteny; homology search; yeast
Online: 16 May 2018 (11:58:28 CEST)
The telomerase RNA in yeasts is large, usually >1000 nt, and contains functional elements that have been extensively studied experimentally in several disparate species. Nevertheless, telomerase RNAs are very difficult to detect by homology-based methods and so far have escaped annotation in the majority of the genomes of Saccharomycotina. This is a consequence of sequences that evolve rapidly at the nucleotide level, are subject to large variations in size, and are highly plastic with respect to their secondary structures. Here we report on a survey that was aimed at closing this gap in RNA annotation. Despite considerable efforts and the combination of a variety of different methods, it was only partially successful. While 27 new telomerase RNAs were identified, we had to restrict our efforts to the subgroup Saccharomycetaceae because even this narrow subgroup was diverse enough to require different search models for different phylogenetic subgroups. More distant branches of the Saccharomycotina still remain without annotated telomerase RNA.
ARTICLE | doi:10.20944/preprints202012.0421.v1
Online: 17 December 2020 (09:13:29 CET)
Whole-genome pooled sequence data from 12 Pakistani Teddy goats were analyzed for signatures of positive selection underlying the breed's defining characteristics. Selection imprints in the Teddy genome were revealed by genomic differentiation after successful paired-end alignment of 635,357,043 reads to the ARS1 reference genome assembly. Pooled heterozygosity and Tajima’s D (TD) were applied to validate and refine selection signals, while pairwise FST statistics were computed for Teddy vs. Bezoar (the wild goat ancestor) to assess genomic differentiation. Annotation of regions under positive selection revealed 59 genes underlying production and adaptive traits. First, a pooled-heterozygosity score ≥ 5 detected six windows with the highest scores on Chr. 29, 9, 25, 15 and 14, which harbor the HRASLS5, LACE1 and AXIN1 genes, candidates for embryonic development, lactation and body height. Secondly, a TD value ≤ -2.2 identified four windows with very strong hits on Chr. 5 and 9, which harbor the STIM1 and ADM genes related to body mass and weight. Lastly, FST analysis generated three strong signals with a threshold of ≤ 0.42 on Chr. 12 and 5, which harbor the ITGB1 gene associated with milk production and lactation traits. Other significant selection signatures encompass genes associated with wool production, prolificacy, immunity and coat color. In brief, this study identified genes under selection in this Pakistani goat breed, which will help to refine future breeding policies, to converge desired productive traits within and across goat breeds, and to explore the full genetic potential of this valued livestock species.
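The abstract above does not state how pooled heterozygosity was computed. The sketch below uses the definition commonly applied to pooled sequencing data (often denoted Hp), in which major- and minor-allele read counts are summed over all SNPs in a window, followed by a Z-transformation across windows. The function names, the input layout and the example counts are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(snp_counts):
    """Return Hp for one window.

    snp_counts: list of (allele_a_reads, allele_b_reads) per SNP.
    Hp = 2 * sum(major) * sum(minor) / (sum(major) + sum(minor)) ** 2
    """
    sum_major = sum(max(a, b) for a, b in snp_counts)
    sum_minor = sum(min(a, b) for a, b in snp_counts)
    total = sum_major + sum_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2.0 * sum_major * sum_minor / total ** 2


def z_transform(values):
    """Standardize per-window values to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all windows have the same value")
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Hypothetical read counts for three windows of two SNPs each.
    windows = [
        [(10, 0), (8, 0)],   # nearly fixed: Hp close to 0
        [(5, 5), (10, 10)],  # balanced alleles: Hp = 0.5
        [(9, 3), (7, 5)],
    ]
    hp_values = [pooled_heterozygosity(w) for w in windows]
    print(hp_values)
    print(z_transform(hp_values))
```

Windows with strongly negative standardized Hp have unusually low heterozygosity and are the usual candidates for selective sweeps; the sign convention and cutoff used for the score reported in the abstract are not specified there.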
REVIEW | doi:10.20944/preprints202206.0005.v1
Subject: Life Sciences, Genetics Keywords: cancer; gene regulation; small nucleolar RNA (snoRNA); small nucleolar derived RNA (sdRNA); microRNA (miRNA); RNA; snoRNA; sdRNA; miRNA; genetics
Online: 1 June 2022 (05:58:58 CEST)
In the past decade, RNA fragments derived from full length small nucleolar RNAs (snoRNAs) have been shown to be specifically excised and functional. These sno-derived RNAs (sdRNAs) have been implicated as gene regulators in a multitude of cancers, controlling a variety of genes post-transcriptionally via association with the RNA-induced silencing complex (RISC). In this review, we have summarized the literature connecting sdRNAs to cancer gene regulation. SdRNAs possess miRNA-like functions, and are able to fill the role of tumor-suppressor or tumor-promoter in a tissue context-dependent manner. Indeed, there are many miRNAs that are actually derived from snoRNA transcripts, meaning that they are truly sdRNAs and as such are included in this review. As sdRNAs are frequently discarded from ncRNA analyses, we emphasize that sdRNAs are functionally relevant gene regulators and likely represent an overlooked subclass of miRNAs. Based on the evidence provided by the papers reviewed here, we propose that sdRNAs deserve more extensive study to better understand their underlying biology and to identify previously overlooked biomarkers and therapeutic targets for a multitude of human cancers.
REVIEW | doi:10.20944/preprints202109.0322.v1
Subject: Life Sciences, Virology Keywords: Grapevine; Viral Disease; Diagnostic Methods; RNA Sequencing; Nanopore Sequencing Technology; RNA modifications
Online: 20 September 2021 (10:43:01 CEST)
Among all economically important plant species in the world, grapevine (Vitis vinifera L.) is the most cultivated fruit plant. It has a significant impact on the economies of many countries through wine and fresh and dried fruit production. In recent years, the grape and wine industry has been facing outbreaks of known and emerging viral diseases across the world. Although high-throughput sequencing (HTS) has been used extensively in grapevine virology, the application and potential of third-generation sequencing have not been explored in understanding grapevine viruses and their impact on the grapevine. Nanopore sequencing, a third-generation technology, can be used for direct sequencing of both RNA and DNA with minimal infrastructure. Compared to other HTS methods, the MinION nanopore platform is faster and more cost-effective and allows for long-read sequencing. Due to the size of the MinION device, it can be easily carried for field viral disease surveillance. This review article discusses grapevine viruses and their diagnostic methods, the principle of nanopore sequencing technology and its application in grapevine virus detection, virus–plant interactions, as well as the characterization of viral RNA modifications.
ARTICLE | doi:10.20944/preprints202003.0393.v1
Subject: Life Sciences, Biophysics Keywords: SARS-CoV2; RNA-dependent RNA polymerase; Valproic acid Co-A; drug repurposing
Online: 26 March 2020 (15:04:22 CET)
SARS-CoV2 RNA-dependent RNA polymerase is an essential enzyme for the survival of the virus in hosts, as it helps replicate the viral RNA. No human polymerase shares either sequence or structural homology with the viral RNA-dependent RNA polymerase. This makes it a good target for inhibitor discovery, as a specific inhibitor should not cross-react with human polymerases. We used virtual screening, docking, binding energy calculation and simulation to show that valproic acid Co-A, a metabolite of the prodrug valproic acid, forms a stable interaction with nsP12 of CoV. Our results suggest that valproic acid Co-A could be a potential inhibitor of nsP12 of SARS-CoV2.
REVIEW | doi:10.20944/preprints202201.0073.v1
Subject: Medicine & Pharmacology, Other Keywords: Messenger RNA • Hospital-based mRNA therapeutics • circular mRNA • self-amplifying mRNA • RNA-based CAR T-cell • RNA-based gene-editing tools
Online: 6 January 2022 (11:20:59 CET)
Hospital-based programs democratize mRNA therapeutics by facilitating the translation of a novel RNA idea from the bench to the clinic. Because mRNA is essentially biological software, therapeutic RNA constructs can be rapidly developed. The generation of small batches of clinical-grade mRNA to support IND applications and first-in-man clinical trials, as well as personalized mRNA therapeutics delivered at the point of care, is feasible at a modest scale of cGMP manufacturing. Advances in mRNA manufacturing science and innovations in mRNA biology are increasing the scope of mRNA clinical applications.
ARTICLE | doi:10.20944/preprints202111.0539.v1
Subject: Life Sciences, Molecular Biology Keywords: Replication fork trap; Tus-Ter; dif; ChIP-Seq; GC-skew; Enterobacterales
Online: 29 November 2021 (12:52:31 CET)
In Escherichia coli, DNA replication termination is orchestrated by two clusters of Ter sites forming a DNA replication fork trap when bound by Tus proteins. The formation of a ‘locked’ Tus-Ter complex is essential for halting incoming DNA replication forks. However, the absence of replication fork arrest at some Ter sites raised questions about their significance. In this study, we examined the genome-wide distribution of Tus and found that only the six innermost Ter sites (TerA-E and G) were significantly bound by Tus. We also found that a single ectopic insertion of TerB in its non-permissive orientation could not be achieved, advocating against a need for ‘back-up’ Ter sites. Finally, examination of the genomes of a variety of Enterobacterales revealed a new replication fork trap architecture mostly found outside the Enterobacteriaceae family. Taken together, our data enabled the delineation of a narrow ancestral Tus-dependent DNA replication fork trap consisting of only two Ter sites.
ARTICLE | doi:10.20944/preprints201906.0259.v1
Subject: Life Sciences, Molecular Biology Keywords: long non-coding RNA; cell type specific; alternative splicing; functional enrichment; RNA-binding proteins; protein binding lncRNA sponges; secondary RNA structure; cancer
Online: 26 June 2019 (05:23:29 CEST)
Background: Recent developments in our understanding of the interactions between long non-coding RNA (lncRNA) and cellular components have improved treatment approaches for various human diseases including cancer, vascular diseases, and neurological diseases. Although investigation of specific lncRNAs revealed their role in the metabolism of cellular RNA, our understanding of their contribution to post-transcriptional regulation is relatively limited. In this study, we explore the role of lncRNAs in modulating alternative splicing and their impact on downstream protein-RNA interaction networks. Results: Analysis of alternative splicing events across 39 lncRNA wildtype and knockout RNA-sequencing datasets from three human cell lines, HeLa (Cervical Cancer), K562 (Myeloid Leukemia), and U87 (Glioblastoma), resulted in high-confidence (FDR < 0.01) identification of 4432 skipped exon events and 2474 retained intron events, implicating 759 genes as impacted at the post-transcriptional level by the loss of lncRNAs. We observed that a majority of the alternatively spliced genes in a lncRNA knockout were specific to the cell type, in agreement with the finding that genes affected by alternative splicing also displayed enriched functions in a cell-type-specific manner. To understand the mechanism behind these cell-type-specific alternative splicing patterns, we analyzed RNA binding protein (RBP)-RNA interaction profiles across the spliced regions. Conclusions: Despite limited RBP binding data across cell lines, alternatively spliced events detected in lncRNA perturbation experiments were associated with RBPs binding in proximal intron-exon junctions, in a cell-type-specific manner. The cellular functions affected by alternative splicing were also affected in a cell-type-specific manner.
Based on the RBP binding profiles in HeLa and K562 cells, we hypothesize that several lncRNAs are likely to exhibit a sponge effect in disease contexts, resulting in the functional disruption of RBPs due to altered titration of the RBPs from their target loci. We propose that such lncRNA sponges can extensively rewire the post-transcriptional gene regulatory networks by altering the protein-RNA interaction landscape in a cell-type specific manner.
REVIEW | doi:10.20944/preprints201807.0596.v1
Online: 30 July 2018 (15:36:54 CEST)
We are currently witnessing an explosion of interest in epitranscriptomics, which studies the functional role of chemical modifications of RNA molecules. Among more than 100 RNA modifications, N6-methyladenosine (m6A) in particular has attracted the interest of researchers all around the world. m6A is the most abundant internal chemical modification in mRNA and can control any aspect of mRNA post-transcriptional regulation. m6A is installed by “writers”, removed by “erasers”, and recognized by “readers”; thus, it can be compared to the reversible and dynamic epigenetic modifications of histones and DNA. Given its fundamental role in determining the way mRNAs are expressed, it comes as no surprise that alterations to m6A modifications have a deep impact on cell differentiation, normal development and human diseases. Here, we review the proteins involved in m6A modification in mammals, the role of m6A in gene expression and its contribution to cancer development. In particular, we focus on AML, which was among the first diseases to indicate how alterations in m6A modification can disrupt normal cellular differentiation and lead to cancer.
ARTICLE | doi:10.20944/preprints202105.0492.v1
Subject: Life Sciences, Biochemistry Keywords: Drug resistance; nsp12; protein design; fitness; RNA-dependent RNA polymerase; resistance mutations; SARS-CoV-2.
Online: 20 May 2021 (13:18:14 CEST)
Favipiravir is a broad-spectrum inhibitor of viral RNA-dependent RNA polymerase (RdRp) currently being used to manage COVID-19 in several countries. By acting as a substrate for RdRp, favipiravir is incorporated into the nascent viral RNA and prevents strand extension. A high mutation rate of SARS-CoV-2 RdRp may facilitate antigenic drift in response to the host immune response, thereby generating viral resistance to favipiravir. Therefore, it is crucial to predict potential mutational sites in the RdRp and the emergence of structural modifications contributing to drug resistance. Here, we used high-throughput interface-based protein design to generate >100,000 designs and identify mutation hotspot residues in the favipiravir-binding site of RdRp. Several mutants had lower binding affinities for favipiravir, among which hotspot residues with a high propensity to undergo positive selection were identified. The results showed that the designs retained an average of 97 to 98% sequence identity, suggesting that SARS-CoV-2 can develop favipiravir resistance with just a few mutations. Notably, we observed that of the 134 mutations predicted by the designs, 63 were already present in the CoV-GLUE database, a ~47% match with the clinical sequencing data. The findings improve our understanding of the potential signatures of adaptation in SARS-CoV-2 against favipiravir and of the management of COVID-19. Furthermore, they can help develop exhaustive strategies for robust antiviral design and discovery.
BRIEF REPORT | doi:10.20944/preprints202005.0084.v1
Subject: Life Sciences, Virology Keywords: SARS-CoV-2; Vitamin D; Ivermectin; RNA-dependent-RNA polymerase; Spike glycoprotein; Knowledge based docking
Online: 5 May 2020 (15:18:30 CEST)
COVID-19 has emerged as a deadly pandemic worldwide, with no vaccine or suitable antiviral drugs to prevent or cure the disease. Because developing new vaccines or antiviral agents is time-consuming, there has been growing interest in repurposing existing drugs to combat SARS-CoV-2. Vitamin D is known to be protective against acute respiratory distress syndrome (ARDS), pneumonia and cytokine storm. Recently it has been used as a repurposed drug for the treatment of H5N1 virus-induced lung injury. Circumstantial evidence indicates that people with low levels of vitamin D are more susceptible to SARS-CoV-2. Although vitamin D has been suggested to interfere with viral replication, its interaction with any SARS-CoV-2 protein remains unexplored. In addition, ivermectin, a well-known anti-parasitic agent, exhibits potent antiviral activity in vitro against viruses such as HIV-1 and dengue. Very recently, ivermectin was found to reduce the viral load of SARS-CoV-2 in vitro. We analyzed available structures of SARS-CoV-2 proteins to identify probable binding partner(s) of vitamin D and ivermectin through knowledge-based docking studies and assessed the possible implications of their binding for SARS-CoV-2 infection. Our observations suggest that the non-structural protein nsp7 possesses a potential site to house 25-hydroxyvitamin D3 (VDY) or the active form of vitamin D, calcitriol. Binding of vitamin D to nsp7 is likely to hamper formation of the nsp7-nsp8 complex, which must bind the RNA-dependent RNA polymerase (RdRP), nsp12, for optimal function. On the other hand, a potential binding site for ivermectin has been identified in the S2 subunit of the trimeric spike (S) glycoprotein of SARS-CoV-2. We propose that the deeply inserted mode of ivermectin binding at three inter-subunit junctions may restrict the large-scale conformational changes of the S2 helices that are necessary for efficient fusion of viral and host membranes.
Our study, therefore, opens up avenues for further investigations to consider vitamin D and ivermectin as potential drugs against SARS-CoV-2.
REVIEW | doi:10.20944/preprints201803.0187.v1
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: noncoding RNA; miRNA; lncRNA; circRNA; ncRNA network in cancer; cancer biomarkers; RNA aided cancer therapy
Online: 21 March 2018 (07:28:25 CET)
The past decade has witnessed enormous progress, during which noncoding RNAs (ncRNAs) have turned from so-called dark matter RNA into critical functional molecules that influence most physiological processes in development and disease contexts. Many ncRNAs interact with each other and are part of networks that influence the cell transcriptome and proteome and consequently the outcome of biological processes. The regulatory circuits controlled by ncRNAs have become increasingly relevant in cancer. Further understanding of these complex network interactions, and of how ncRNAs are regulated, is paving the way for the identification of better therapeutic strategies in cancer.
ARTICLE | doi:10.20944/preprints202207.0358.v1
Subject: Life Sciences, Virology Keywords: Foot-and-mouth disease virus; safe sample transport; single-stranded positive-sense RNA; TRIzol extraction; naked RNA; infectivity; RNA transfection; lipofectamine; self-transfection; BHK cells
Online: 25 July 2022 (08:14:51 CEST)
Safe sample transport is of great importance for infectious diseases diagnostics. Various treatments and buffers are used to inactivate pathogens in diagnostic samples. At the same time, adequate sample preservation, particularly of nucleic acids, is essential to allow an accurate laboratory diagnosis. For viruses with single-stranded RNA genomes of positive polarity, such as foot-and-mouth disease virus (FMDV), however, naked full-length viral RNA can itself be infectious. In order to assess the risk of infection from inactivated FMDV samples, two animal experiments were performed. In the first trial, six cattle were injected with FMDV RNA (isolate A22/IRQ/24/64) into the tongue epithelium. All animals developed clinical disease within two days and FMDV was reisolated from serum and saliva samples. In the second trial, another group of six cattle was exposed to FMDV RNA by instilling it on the tongue and spraying it into the nose. The animals were observed for 10 days after exposure. All animals remained clinically unremarkable and virus isolation as well as FMDV genome detection in serum and saliva were negative. No transfection reagent was used for any of the animal inoculations. In conclusion, cattle can be infected by injection with naked FMDV RNA, but not by non-invasive exposure to the RNA. Inactivated FMDV samples that contain full-length viral RNA carry only a negligible risk of infecting animals.
REVIEW | doi:10.20944/preprints202201.0474.v1
Subject: Life Sciences, Biotechnology Keywords: Biosensors; DNA; RNA; Cancer; Biomarkers; Proteomics
Online: 31 January 2022 (21:21:33 CET)
Cancer is one of the deadliest diseases in the world and kills many people every year. Early detection is the only hope for the survival of patients with malignant cancer. As a result, diagnosing cancer biomarkers at the cellular level in the preliminary stages is critical for improving cancer patient survival rates. For decades, scientists have focused their efforts on the invention of biosensors. Biosensors, in addition to being employed in other practical scenarios, can function as cost-effective and highly efficient devices for this purpose. Traditional cancer screening procedures are expensive, time-consuming, and inconvenient for repeat screenings. Biomarker-based cancer diagnosis, on the other hand, is emerging as one of the most promising tools for early detection, disease progression monitoring, and eventual cancer treatment. A biosensor is an analytical device that allows the selected analyte to bind to the biomolecules being studied (for example RNA, DNA, tissue, proteins, cells). Biosensors can be classified by the kind of biorecognition or transducer element on the sensor. Most biosensor analyses require the analyte to be labeled with a specific marker. In this review article, the application of distinct variants of biosensors against cancer is described.
REVIEW | doi:10.20944/preprints202107.0044.v3
Online: 19 October 2021 (13:23:01 CEST)
Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (Mtb), with 10.4 million new cases per year reported in the human population. Recent studies on the Mtb transcriptome have revealed the abundance of noncoding RNAs expressed at various phases of mycobacterial growth, in culture, in infected mammalian cells and in patients. Among these noncoding RNAs are both small RNAs (sRNAs) 50-350 nt in length and smaller RNAs (sncRNAs) of <50 nt. In this review, we provide an up-to-date synopsis of the identification, designation, and function of these Mtb-encoded sRNAs and sncRNAs. The methodological advances driving the study of these small RNAs, including RNA sequencing strategies, small RNA antagonists and locked nucleic acid sequence-specific RNA probes, are described. Initial insights into the regulation of small RNA expression and the putative processing enzymes required for their synthesis and function are discussed. Many open questions remain about the biological and pathogenic roles of these small non-coding RNAs, and potential research directions needed to define the role of these mycobacterial noncoding RNAs are summarized.
ARTICLE | doi:10.20944/preprints202107.0552.v1
Online: 23 July 2021 (22:17:49 CEST)
Despite several decades of research, the physics underlying translation – protein synthesis at the ribosome – remains poorly studied. For instance, the mechanism coordinating various events occurring in distant parts of the ribosome is unknown. Very recently, we suggested that this allosteric mechanism could be based on the transport of electric charges (electron holes) along RNA molecules and the localization of these charges in functionally important areas; this assumption was justified using tRNA as an example. In this study, we turn to the ribosome and show computationally that holes can also efficiently migrate within the whole ribosomal small subunit (SSU). The potential sites of charge localization in the SSU are revealed, and it is shown that most of them are located in functionally important areas of the ribosome: intersubunit bridges, the Fe4S4 cluster and the pivot linking the SSU head to the body. As a result, we suppose that hole localization within the SSU can affect intersubunit rotation (ratcheting) and SSU head swiveling, in agreement with the scenario of electronic coordination of ribosome operation. We anticipate that our findings will improve the understanding of the translation process and advance molecular biology and medicine.
REVIEW | doi:10.20944/preprints202105.0362.v1
Online: 16 May 2021 (22:27:20 CEST)
Alphaviruses are positive-sense RNA arboviruses that are capable of causing severe disease in otherwise healthy individuals. There are many aspects of viral infection that determine pathogenesis and major efforts regarding the identification and characterization of virulence determinants have largely focused on the roles of the nonstructural and structural proteins. Nonetheless, the viral RNAs of the alphaviruses themselves play important roles in regard to virulence and pathogenesis. In particular, many sequences and secondary structures within the viral RNAs play an important part in the development of disease and may be considered important determinants of virulence. In this review article, we summarize the known RNA-based virulence traits and host:RNA interactions that influence alphaviral pathogenesis for each of the viral RNA species produced during infection. Overall, the viral RNAs produced during infection are important contributors to alphaviral pathogenesis and more research is needed to fully understand how each RNA species impacts the host response to infection as well as the development of disease.
REVIEW | doi:10.20944/preprints202104.0676.v1
Subject: Medicine & Pharmacology, Allergology Keywords: RNA; Protamine; Transfection; Cancer Therapy; Vaccines
Online: 26 April 2021 (13:37:51 CEST)
Protamine is a natural cationic peptide mixture mostly known as a drug for the neutralization of heparin and as a component in formulations of slow-release insulin. Protamine is also used for cellular delivery of nucleic acids due to opposite charge-driven coupling. This year marks 60 years since the first use of protamine as a transfection enhancement agent. Since then, protamine has been broadly used as a stabilization agent for RNA delivery. It has also been included in several compositions for RNA-based vaccinations in clinical development. Protamine stabilization of RNA shows double functionality: it not only protects RNA from degradation within biological systems, but also enhances penetration into cells. A protamine-based RNA delivery system is a flexible and versatile platform that can be adjusted according to therapeutic goals: fused with targeting antibodies for precise delivery, digested into a cell-penetrating peptide for better transfection efficiency, or non-covalently mixed with functional polymers. This manuscript gives an overview of the strategies employed in protamine-based RNA delivery, including the optimization of the nucleic acid’s stability and translational efficiency, as well as the regulation of its immunostimulatory properties, from early studies to recent developments.
ARTICLE | doi:10.20944/preprints202007.0395.v1
Subject: Biology, Horticulture Keywords: Actinidia; waterlogging; RNA-sequencing; transcriptional adjustment
Online: 17 July 2020 (15:40:11 CEST)
Kiwifruit vines are generally sensitive to waterlogging stress. So far, the molecular responses of different kiwifruit genotypes to waterlogging stress remain poorly explored. In this study, using RNA-sequencing, we examined transcriptional regulation in the roots of a waterlogging-tolerant genotype, KR5 (Actinidia valvata), and a sensitive genotype, ‘Hayward’ (Actinidia deliciosa), subjected to 0, 12, 24, and 72 h of waterlogging. Compared with 0 h, transcriptional adjustments in these two genotypes occurred as early as 12 h and became notably pronounced 72 h after waterlogging. Waterlogging stress for 72 h promoted the expression of genes involved in ethylene biosynthesis, sucrose and hexose transport, anaerobic fermentation, nitrate reduction, alanine accumulation, and reactive oxygen scavenging in both genotypes. The differential regulation of genes encoding 9-cis-epoxycarotenoid dioxygenase, phosphoglucomutase, alanine-glyoxylate transaminase, and other enzymes pointed to diverse strategies upon waterlogging in these two genotypes. In addition, higher sucrose and trehalose contents, as well as higher activities of alcohol dehydrogenase and manganese superoxide dismutase, were induced in KR5 roots after 72 h of waterlogging than in ‘Hayward’. Overall, our results provide more insights into the molecular basis of the waterlogging response in kiwifruit.
Online: 28 November 2019 (09:38:55 CET)
The current framework of evolutionary theory postulates that evolution relies on random mutations generating a diversity of phenotypes on which natural selection acts. This framework was established using a top-down approach as it originated from Darwinism, which is based on observations made on complex multicellular organisms, and then modified to fit a DNA-centric view. In this article, I argue that, based on a bottom-up approach starting from the physicochemical properties of nucleic and amino acid polymers, we should reject the assumptions that: i) natural selection plays a dominant role in evolution, and ii) the probability of mutations is independent of the generated phenotype. I will show that the adaptation of a phenotype to an environment does not correspond to organism fitness but rather corresponds to maintaining the genome stability and integrity. In a stable environment, the phenotype maintains the stability of its originating genome, and both (genome and phenotype) are reproduced identically. In an unstable environment (i.e., corresponding to variations in physicochemical parameters above a physiological range), the phenotype no longer maintains the stability of its originating genome but instead influences its variations. Indeed, environment- and cellular-dependent physicochemical parameters define the probability of mutations in terms of frequency, nature and location in a genome. Evolution is non-deterministic because it relies on probabilistic physicochemical rules, and evolution is driven by a bidirectional interplay between genome and phenotype, the phenotype ensuring the stability of the genotype in a manner that depends on cellular and environmental physicochemical parameters.
ARTICLE | doi:10.20944/preprints201904.0250.v1
Subject: Life Sciences, Molecular Biology Keywords: prebiotic chemistry; protein synthesis; hairpin RNA
Online: 22 April 2019 (12:11:21 CEST)
A model of the early RNA world is proposed. Nearly self-complementary sequences that could adopt double-stranded, small hairpin-like (shRNA), structures would be selected for due to their greater hydrolytic stability. These would be phosphorylated at their 5' ends. We suppose that dehydrating conditions arise (perhaps intermittently) in the early environment allowing amino acids to condense with these RNA molecules. The resulting phosphate-amino acid anhydrides would play the role of early, charged, tRNAs. A crude genetic code could emerge owing to the greater resistance of some amino acid-shRNA pairings to hydrolysis relative to others. Early on there is no division of labor between mRNAs and tRNAs; the same molecules perform both functions. But the first systems would have encoded little in the way of protein sequence information. Rather they would have served as catalysts for the random polymerization of amino acids. It is speculated that the selective advantage inhering in such systems lay in their ability to supply raw materials for the formation of coacervates within which the various molecules essential to proto-life could be concentrated. This would greatly facilitate the necessary chemistries. The evolution of homochiral protein and RNA populations is discussed. An appealing feature of this model is its ability to explain the transition from phosphorylated amino acids to the 3' ester-linked aminoacyl-tRNAs employed by modern life.
ARTICLE | doi:10.20944/preprints201903.0041.v1
Online: 4 March 2019 (10:37:46 CET)
Untranslated regions (UTRs) of flaviviruses contain a large number of RNA structural elements involved in mediating the viral life cycle, including cyclisation, replication, and encapsidation. Here we report on a comparative genomics approach to characterize evolutionarily conserved RNAs in the 3'UTR of tick-borne, insect-specific and no-known-vector flaviviruses in silico. Our data support the wide distribution of previously experimentally characterized exoribonuclease-resistant RNAs (xrRNAs) within tick-borne and no-known-vector flaviviruses and provide evidence for the existence of a cascade of duplicated RNA structures within insect-specific flaviviruses. On a broader scale, our findings indicate that viral 3'UTRs represent a flexible scaffold for evolution to come up with novel xrRNAs.
ARTICLE | doi:10.20944/preprints201809.0406.v1
Subject: Medicine & Pharmacology, Pharmacology & Toxicology Keywords: Sutherlandia frutescens, RNA sequencing, inflammation, TNF
Online: 20 September 2018 (10:16:04 CEST)
Sutherlandia frutescens (S. frutescens) has been traditionally used as an herbal medicine to ameliorate symptoms associated with cancer, infectious diseases, as well as inflammation. The objective of this investigation was to explore the impact of S. frutescens on the expression of genes in a murine macrophage cell line (i.e., RAW 264.7). We found that treatment with an ethanolic-extract of S. frutescens (SFE) 1 h prior to the stimulation with LPS and IFNγ for 24 h significantly affected the expression of 715 genes in RAW 264.7 cells. When the post-stimulation period was shortened to 8 h, the number of genes that were significantly impacted by SFE diminished to 50. Pathway analysis revealed that inflammatory signaling pathways, such as NF-κB, MAPK, and TNF, as well as signaling pathways associated with immune-related responses, were inhibited by SFE treatment. These findings are consistent with previously reported anti-inflammatory activity of SFE and enable better understanding of the immune-modulating properties of this botanical. To our knowledge, this represents the first report on the impact of S. frutescens on global gene expression in an immune cell population.
ARTICLE | doi:10.20944/preprints201803.0244.v1
Subject: Life Sciences, Virology Keywords: RNA silencing; gemycircularvirus; mycovirus; antiviral; dicer
Online: 29 March 2018 (05:44:40 CEST)
This study aimed to demonstrate the existence of antiviral RNA silencing mechanisms in Sclerotinia sclerotiorum by probing wild-type and RNA-silencing-deficient strains of the fungus with an RNA virus and a circular DNA virus. Key silencing-related genes, specifically dicers, were disrupted in order to dissect the RNA silencing pathway and provide useful information on fungal control. The dicer genes Dcl-1 and Dcl-2, individually and in combination, were replaced by selective marker(s). Disruption mutants were then compared with the wild-type strain for changes in phenotype, virulence, susceptibility to viral infection, and small RNA accumulation. Disruption of Dcl-1 or Dcl-2 resulted in no changes in phenotype compared to wild-type S. sclerotiorum; however, the double dicer mutant strain exhibited slower growth. To examine the effect of viral infection on strains containing null-mutations of Dcl-1, Dcl-2 or both genes, mutants were transfected with full-length RNA transcripts of a hypovirus, SsHV2L, and copies of a single-stranded DNA mycovirus, SsHADV-1, as a synthetic virus. Results indicate that the ΔDcl-1/Dcl-2 double mutant, which was slow growing without virus infection, exhibited much more severe debilitation following virus infection, with altered colony morphology including reduced pigmentation, significantly slower growth, and delayed sclerotial formation. Additionally, virus-derived small RNAs were absent in the virus-infected ΔDcl-1/Dcl-2 mutant, whereas the virus-infected wild-type strain displayed a high percentage of virus-derived small RNA. The findings of these studies suggest that if both dicers are silenced, invasive nucleic acids, which include mycoviruses ubiquitous in nature, can greatly debilitate the virulence of fungal plant pathogens.
ARTICLE | doi:10.20944/preprints202204.0008.v1
Subject: Medicine & Pharmacology, General Medical Research Keywords: COVID-19 pandemic; KERRA; SARS-CoV-2 main protease; RNA-dependent RNA polymerase; anti-FIPV activity
Online: 1 April 2022 (14:53:44 CEST)
The COVID-19 pandemic affects all parameters, especially health care professionals, drugs and medical supplies. KERRA is a mixed medicinal plant capsule, approved by FDA Thailand, that is used for the treatment of patients with high fever. Recently, COVID-19 patients treated with KERRA showed quicker recovery. Therefore, it is possible that some ingredients in KERRA could inhibit SARS-CoV-2. In this study, two important replication-related enzymes in SARS-CoV-2, a main protease and an RNA-dependent RNA polymerase (RdRp), were used to study the effect of KERRA. The results showed that KERRA inhibited the SARS-CoV-2 main protease and SARS-CoV-2 RdRp with IC50 values of 49.91 ± 1.75 ng/mL and 36.23 ± 5.23 µg/mL, respectively. KERRA displayed no cytotoxic activity on macrophage cells at concentrations lower than 1 mg/mL and exhibited anti-inflammatory activity. Additionally, KERRA was active against a feline coronavirus (feline infectious peritonitis [FIP]) infection with an EC50 value of 134.3 g/mL. This study supports the potential use of KERRA as a candidate drug for COVID-19.
ARTICLE | doi:10.20944/preprints202202.0170.v1
Subject: Life Sciences, Virology Keywords: influenza virus; RNA-polymerase; RNA-polymerase II; protein-protein interaction; PPI; cap snatching; transcription; binary complementation assay
Online: 14 February 2022 (09:51:21 CET)
Influenza virus transcription is catalyzed by the viral RNA-polymerase (FluPol) through a cap-snatching activity. The snatching of the cap of cellular mRNA by FluPol is preceded by its binding to the flexible C-terminal domain (CTD) of the RPB1 subunit of RNA-polymerase II (Pol II). To better understand how FluPol brings the 3’-end of the genomic RNAs in close proximity to the host-derived primer, we hypothesized that FluPol may recognize additional Pol II subunits/domains to ensure cap-snatching. Using binary complementation assays between the Pol II and FluPol subunits and their structural domains, we revealed an interaction between the N-third domain of PB2 and RPB4. This interaction was confirmed by a co-immunoprecipitation assay and found to occur with the homologous domains of influenza B and C FluPols. Residues [1-72] of RPB4 were found to be critical for this interaction. Numerous point mutants generated at positions conserved among influenza A, B and C FluPols in the N-third domain of PB2 exhibited a strong defect in transcriptional activity. These results suggest that FluPol interacts with several domains/subunits of Pol II: the CTD, to bind Pol II initiating host transcription, and a second site on RPB4, to position FluPol in proximity to the 5’-end of nascent host mRNA.
REVIEW | doi:10.20944/preprints201810.0596.v1
Subject: Life Sciences, Molecular Biology Keywords: biogenesis; microRNAs; ribosomal RNA-derived fragment (rRF); ribosomes; small ribosomal RNA (srRNA); ribosomal DNA (rDNA); small RNAs
Online: 25 October 2018 (05:59:58 CEST)
The advent of RNA-sequencing (RNA-Seq) technologies has markedly improved our knowledge and expanded the compendium of small non-coding RNAs, most of which derive from the processing of longer RNA precursors. In this review article, we discuss the biogenesis and function of small non-coding RNAs derived from eukaryotic ribosomal RNA (rRNA), called rRNA fragments (rRFs), and their potential role(s) as regulators of gene expression. This relatively new class of ncRNAs remained poorly investigated and underappreciated until recently, due mainly to the a priori exclusion of rRNA sequences (because of their overabundance) from RNA-Seq datasets. The situation surrounding rRFs resembles that of microRNAs (miRNAs), which used to be readily discarded from further analyses, for more than five decades, because we could not believe that RNA of such a short length could bear biological significance. It is as if we had not yet learned our lesson not to restrain our investigative, scientific mind from challenging widely accepted beliefs or dogmas, and from looking for hidden treasures in the most unexpected places.
REVIEW | doi:10.20944/preprints202205.0342.v1
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: noncoding RNA; biomarkers; breast cancer; prognostic; diagnostic
Online: 25 May 2022 (05:09:11 CEST)
For decades since the central dogma, cancer biology research has focused on the involvement of genes encoding proteins. Only more recently has a new molecular class been discovered, named non-coding RNA (ncRNA), which has been shown to play crucial roles in shaping the activity of cells. An extraordinary number of studies has shown that ncRNAs represent an extensive and prevalent group of RNAs, including both oncogenic and tumor-suppressive molecules. Consequently, various clinical trials involving ncRNAs as extraordinary biomarkers or therapies have started to emerge. In this review, we will focus on the prognostic and diagnostic role of ncRNAs for breast cancer.
REVIEW | doi:10.20944/preprints202107.0030.v1
Subject: Life Sciences, Biochemistry Keywords: Crop, CRISPR/Cas9; Resistance; RNA interference; Stress
Online: 1 July 2021 (14:13:20 CEST)
With rapid population growth, there is an urgent need for innovative crop improvement approaches to meet the increasing demand for food. Classical crop improvement approaches, however, involve a laborious process that cannot keep pace with increasing crop demand. RNA-based approaches, i.e., RNAi-mediated gene regulation and the site-specific nuclease-based CRISPR/Cas9 system for gene editing, have enabled efficient targeted modification in many crops for higher yield and resistance to diseases and different stresses. In functional genomics, RNA interference (RNAi) is a promising gene regulatory approach that plays a significant role in crop improvement by permitting down-regulation of gene expression by small molecules of interfering RNA without affecting the expression of other genes. Gene editing technologies, viz. clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (CRISPR/Cas), have emerged as a powerful tool for precise targeted modification of the genome sequences of nearly all crops to generate variation and accelerate breeding efforts. In this regard, the review highlights the diverse roles and applications of RNAi and the CRISPR/Cas9 system as powerful technologies to improve agronomically important plants, enhance crop yields and increase tolerance to environmental stress (biotic or abiotic). Ultimately, these technologies can prove to be important in view of global food security and sustainable agriculture.
Subject: Life Sciences, Biochemistry Keywords: evolution; darwinism; genetic code; RNA; homoestasis; physics
Online: 6 January 2021 (15:06:41 CET)
The physics–biology continuum relies on the fact that life emerged from prebiotic molecules. Here, I argue that life emerged from the physical coupling between the synthesis of nucleic acids and the synthesis of amino acid polymers. Owing to this physical coupling, amino acid polymers (or proto-phenotypes) maintained the physicochemical parameter equilibria (proto-homeostasis) in the immediate environment of their encoding nucleic acids (or proto-genomes). This protected the proto-genome physicochemical integrity (i.e., atomic composition) from environmental physicochemical stresses, and therefore increased the probability of reproducing the proto-genome without variation. From there, genomes evolved depending on the biological activities they generated in response to environmental fluctuations. Thus, a genome generating an internal environment whose physicochemical parameters guarantee homeostasis and genome integrity has a higher probability to be reproduced without variation and therefore to reproduce the same phenotype in offspring. Otherwise, the genome is modified by the imbalances of the internal physicochemical parameters it generates, until new emerging biological activities maintain homeostasis. In sum, evolution depends on feedforward and feedback loops between genome and phenotype, since the internal physicochemical conditions that a genome generates in response to environmental fluctuations in turn either guarantee the stability or direct the variation of the genome.
ARTICLE | doi:10.20944/preprints202004.0522.v1
Subject: Chemistry, Physical Chemistry Keywords: RNA Nucleotides; Uracil; Intermolecular Binding; Cyclic Compounds
Online: 30 April 2020 (08:58:21 CEST)
Exogenous RNA comprises the genetic material associated with several diseases which require immediate treatment, and thus mechanisms to hinder intracellular translation and reproduction of RNA viral agents are of great importance. Applying recent developments from this lab in methods relating to the interaction of DNA with steroid hormones, cyclic compounds are presented for intermolecular binding to nucleic acids. The requirements to achieve binding with RNA nucleotide pairs are described, which involve at a minimum functional elements positioned to interact with the lateral phosphate groups for each of the RNA strands through coupling with a positively charged ion, such as Mg2+, Ca2+, or Zn2+; and an intermolecular hydrogen bond with the oxygen atom of uracil at the carbon two location. Additional features of the binding molecules are examined for enhancements and differentiation in binding capability and include aromatic groups that have both a structural role of steric hindrance and a functional role to stabilize the binding mechanisms. Several categories of cyclic compounds are identified as having specific binding capabilities, and the interactions of these structures with potential receptor molecules are evaluated to assess delivery and binding of the compound to nucleic acids.
ARTICLE | doi:10.20944/preprints201907.0161.v1
Subject: Life Sciences, Virology Keywords: RNA virus; evolution; epidemics; phylogeography; secondary structure
Online: 11 July 2019 (15:31:59 CEST)
Chikungunya virus (CHIKV), a mosquito-borne alphavirus of the family Togaviridae, has recently emerged in the Americas from lineages from two continents, Asia and Africa. Historically, CHIKV circulated as at least four lineages worldwide with both enzootic and epidemic transmission cycles. To understand the recent patterns of emergence and the current status of the CHIKV spread, updated analyses of the viral genetic data and metadata are needed. Here, we performed phylogenetic and comparative genomics screens of CHIKV genomes, taking advantage of the public availability of many recently sequenced isolates. Based on these new data and analyses, we derive a revised phylogeny from nucleotide sequences in coding regions. Using this phylogeny, we uncover the presence of several distinct lineages in Africa that were previously considered a single one. In parallel, we performed thermodynamic modeling of CHIKV untranslated regions (UTRs), which revealed evolutionarily conserved structured and unstructured RNA elements in the 3'UTR. We provide evidence for duplication events in recently emerged American isolates of the Asian CHIKV lineage and propose the existence of a flexible 3'UTR architecture among different CHIKV lineages.
HYPOTHESIS | doi:10.20944/preprints201811.0564.v2
Online: 27 February 2019 (11:32:02 CET)
Current cellular facts allow us to follow the link from chemical to biochemical metabolites, from the ancient to the modern world. In this context, the "RNA world" hypothesis proposes that early in the evolution of life, the ribozyme was responsible for the storage and transfer of genetic information and for the catalysis of biochemical reactions. Accordingly, the hammerhead ribozyme (HHR) and the hairpin ribozyme belong to a family of endonucleolytic RNAs performing self-cleavage that might occur during replication. Furthermore, given the ultraconserved occurrence of HHR in several genomes of modern organisms (from mammals to small parasites and elsewhere), these small ribozymes have been regarded as living fossils of a primitive RNA world. They fold into 3D structures that generally require long-range intramolecular interactions to adopt the catalytically active conformation under specific physicochemical conditions. By studying viroids as plausible remains of ancient RNA, we recently demonstrated that they replicate in non-specific hosts, emphasizing their adaptability to different environments, which enhanced their survival probability over the ages. All these results exemplify ubiquitous features of life: the versatility and efficiency of small RNAs, viroids and ribozymes, as well as their diversity and adaptability to various extreme conditions. All these traits must have originated in early life to generate novel RNA populations.
REVIEW | doi:10.20944/preprints201811.0384.v1
Subject: Life Sciences, Molecular Biology Keywords: RNA modification; tRNA methyltransferase; tRNA modification; methylase
Online: 16 November 2018 (07:31:19 CET)
More than 90 different modified nucleosides have been identified in tRNA. Among the tRNA modifications, the 7-methylguanosine (m7G) modification is found widely in eubacteria, eukaryotes, and a few archaea. In most cases, the m7G modification occurs at position 46 in the variable region and is a product of tRNA (m7G46) methyltransferase. The m7G46 modification forms a tertiary base pair with C13-G22, and stabilizes the tRNA structure. Recently, we have proposed a reaction mechanism for eubacterial tRNA m7G methyltransferase (TrmB) based on the results of biochemical studies and previous biochemical, bioinformatic, and structural studies by others. However, an experimentally determined mechanism of methyl-transfer remains to be ascertained. The physiological functions of m7G46 in tRNA have started to be determined over the past decade. To be able to better respond to diseases and infections in which the m7G modification is considered to be involved, it is still necessary to further understand the catalytic mechanism of AdoMet and/or the tRNA-bound form of m7G methyltransferases. In this review, information on tRNA m7G modifications and tRNA m7G methyltransferases is summarized, and the differences in reaction mechanism between tRNA m7G methyltransferase and rRNA or mRNA m7G methylation enzymes are discussed.
ARTICLE | doi:10.20944/preprints201702.0085.v1
Subject: Life Sciences, Molecular Biology Keywords: alfalfa; drought; microRNA; small RNA; differential expression
Online: 23 February 2017 (09:50:07 CET)
Alfalfa, an important legume forage, is an ideal crop for sustainable agriculture and a potential bioenergy plant. Drought, one of the most common environmental stresses, substantially affects plants’ growth, development and productivity. MicroRNAs (miRNAs) are newly discovered gene expression regulators that have been linked to several plant stress responses. To elucidate the role of miRNAs in drought stress regulation of alfalfa, a high-throughput sequencing approach was used to analyze 12 small RNA libraries comprising 4 samples, each with 3 biological replicates. We identified 348 known miRNAs, belonging to 80 miRNA families, from the 12 libraries and 281 novel miRNAs using Mireap software. Eighteen known miRNAs in roots and 12 known miRNAs in leaves were identified as drought-responsive miRNAs. Except for miR319d and miR157a, which were upregulated under drought stress, the expression patterns of drought-responsive miRNAs differed between roots and leaves in alfalfa. This is the first study to report that miR157a, miR1507, miR3512, miR3630, miR5213, miR5294, miR5368 and miR6173 are drought-responsive miRNAs. Target transcripts of drought-responsive miRNAs were computationally predicted. All 447 target genes for the known miRNAs were predicted using an online tool. This study provides significant insight into the drought-responsive mechanisms of alfalfa.