Preprint Review Version 1 Preserved in Portico This version is not peer-reviewed

Computational Approaches to Functionally Annotate Long Noncoding RNA (lncRNA)

Version 1 : Received: 8 June 2020 / Approved: 9 June 2020 / Online: 9 June 2020 (04:45:12 CEST)

A peer-reviewed article of this Preprint also exists.

Ramakrishnaiah, Y., Kuhlmann, L., & Tyagi, S. (2020). Towards a comprehensive pipeline to identify and functionally annotate long noncoding RNA (lncRNA). Computers in Biology and Medicine, 127, [104028]. https://doi.org/10.1016/j.compbiomed.2020.104028

Abstract

Long noncoding RNAs (lncRNAs) are implicated in various genetic diseases and cancer, owing to their critical role in gene regulation. RNA sequencing is used to capture their transcripts from specific cell types or conditions. In some studies, lncRNA interactions with other biomolecules have also been captured, which can give clues to their mechanisms of action. Complementary in silico methods have been proposed to predict the noncoding nature of transcripts and to analyze available RNA interaction data. Here we provide a critical review of such methods and identify the associated challenges. Broadly, the methods can be categorized as reference-based or reference-free (ab initio), with the former requiring a comprehensive annotated reference. The ab initio methods can use machine learning classifiers trained on features extracted from sequences, which makes them suitable for predicting novel transcripts, especially in non-model species. Machine learning approaches such as Logistic Regression, Support Vector Machines, Random Forest, and Deep Learning are commonly used. Initial approaches relied on basic sequence-based features to train the model, whereas the use of secondary structure features appears to be a promising approach for functional annotation. However, adding secondary structure features increases model complexity, which demands an algorithm able to handle it and considerably increases the use of computational resources. Computational strategies that combine identification and functional annotation and that can be easily customized are currently lacking. Such strategies would be of immense value in accelerating research on this class of RNAs.
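
To make the notion of basic sequence-derived features concrete, the following is a minimal Python sketch of the kind of feature extraction an ab initio classifier might be trained on. It is an illustration only, not the method of any tool covered by the review: the function names (kmer_frequencies, longest_orf_length, sequence_features), the choice of k = 3, and the particular feature set (k-mer frequencies, GC content, forward-strand ORF coverage) are all assumptions made for this example.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies over the A/C/G/T alphabet."""
    seq = seq.upper().replace("U", "T")
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    return {kmer: (c / total if total else 0.0) for kmer, c in counts.items()}


def longest_orf_length(seq):
    """Length (nt, stop codon included) of the longest ATG..stop ORF.

    Only the three forward-strand reading frames are scanned (assumption).
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def sequence_features(seq):
    """Hypothetical feature vector: 64 trinucleotide frequencies plus
    transcript length, GC content, and longest-ORF coverage."""
    n = len(seq)
    upper = seq.upper()
    features = kmer_frequencies(seq, k=3)
    features["length"] = n
    features["gc_content"] = (upper.count("G") + upper.count("C")) / n if n else 0.0
    features["orf_coverage"] = longest_orf_length(seq) / n if n else 0.0
    return features


if __name__ == "__main__":
    toy = "ATGAAATAGCC"  # toy sequence, not real data
    feats = sequence_features(toy)
    print(feats["length"], feats["gc_content"], feats["orf_coverage"])
```

A feature dictionary of this kind could then be converted to a numeric vector and passed to any of the classifier families named in the abstract. Secondary structure features would require an additional folding step that is not sketched here.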

Keywords

noncoding RNA; lncRNA; epigenomics; gene regulation; machine learning; bioinformatics

Subject

Biology and Life Sciences, Biochemistry and Molecular Biology
