Preprint Review Version 1 This version is not peer-reviewed

Computational Approaches to Functionally Annotate Long Noncoding RNA (lncRNA)

Version 1 : Received: 8 June 2020 / Approved: 9 June 2020 / Online: 9 June 2020 (04:45:12 CEST)

How to cite: Ramakrishnaiah, Y.; Kuhlmann, L.; Tyagi, S. Computational Approaches to Functionally Annotate Long Noncoding RNA (lncRNA). Preprints 2020, 2020060116 (doi: 10.20944/preprints202006.0116.v1).

Abstract

Long noncoding RNAs (lncRNAs) are implicated in various genetic diseases and cancer, which is attributed to their critical role in gene regulation. RNA sequencing is used to capture their transcripts from specific cell types or conditions. In some studies, lncRNA interactions with other biomolecules have also been captured, which can give clues to their mechanisms of action. Complementary in silico methods have been proposed to predict the non-coding nature of transcripts and to analyze available RNA interaction data. Here we provide a critical review of such methods and identify associated challenges. Broadly, these methods can be categorized as reference-based or reference-free (ab initio), with the former requiring a comprehensive annotated reference. The ab initio methods can make use of machine learning classifiers trained on features extracted from sequences, making them suitable for predicting novel transcripts, especially in non-model species. Machine learning approaches such as Logistic Regression, Support Vector Machines, Random Forest, and Deep Learning are commonly used. Initial approaches relied on basic sequence features to train the model, whereas the use of secondary structural features appears to be a promising approach for functional annotation. However, adding secondary structural features increases model complexity, which demands an algorithm that can handle it and considerably increases the use of computational resources. Computational strategies that combine identification and functional annotation and can be easily customized are currently lacking. Such strategies could be of immense value in accelerating research on this class of RNAs.
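
To make the idea of "features extracted from sequences" concrete, the following is a minimal sketch of the kind of basic sequence features an ab initio classifier might be trained on. It is not taken from any tool covered by the review: the function names (`gc_content`, `kmer_frequencies`, `longest_orf_length`, `extract_features`), the choice of GC content, longest open reading frame, and k-mer frequencies, and the forward-strand-only ORF scan are all illustrative assumptions.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def _normalize(seq):
    """Upper-case the sequence and map RNA 'U' to DNA 'T'."""
    return seq.upper().replace("U", "T")


def gc_content(seq):
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    seq = _normalize(seq)
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequencies(seq, k=3):
    """Relative frequency of every A/C/G/T k-mer, in lexicographic order.

    Windows containing other symbols (for example 'N') are skipped.
    """
    seq = _normalize(seq)
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG...stop ORF.

    Assumption: only the three forward-strand frames are scanned, and an
    ORF must end in a stop codon to be counted.
    """
    seq = _normalize(seq)
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def extract_features(seq, k=3):
    """Feature vector: [GC content, ORF coverage, k-mer frequencies...]."""
    orf_coverage = longest_orf_length(seq) / len(seq) if seq else 0.0
    return [gc_content(seq), orf_coverage] + kmer_frequencies(seq, k)
```

A vector produced by `extract_features` could then be passed to any of the classifier families named above (for example a logistic regression or a random forest), with labels taken from annotated coding and noncoding transcripts. Secondary structural features would extend this vector at a higher computational cost.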

Subject Areas

noncoding RNA; lncRNA; epigenomics; gene regulation; machine learning; bioinformatics
