Preprint Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

iPReditor-CMG: Improving Predictive RNA Editor for Crop Mitochondrial Genomes Using Genomic Sequence Features and Optimal Support Vector Machine

Version 1 : Received: 20 October 2021 / Approved: 22 October 2021 / Online: 22 October 2021 (15:11:40 CEST)

How to cite: Qin, S.; Fan, Y.; Hu, S.; Wang, Y.; Wang, Z.; Cao, Y.; Liu, Q.; Tan, S.; Dai, Z.; Zhou, W. iPReditor-CMG: Improving Predictive RNA Editor for Crop Mitochondrial Genomes Using Genomic Sequence Features and Optimal Support Vector Machine. Preprints 2021, 2021100332 (doi: 10.20944/preprints202110.0332.v1).

Abstract

Cytosine (C) to uracil (U) RNA editing is one of the most important post-transcriptional processes, but efficiently identifying C-to-U editing events within crop mitochondrial genomes remains a challenge. An improved predictive RNA editor for crop mitochondrial genomes, iPReditor-CMG, is proposed, based on a support vector machine (SVM), three common crop mitochondrial genomes and self-sequenced tobacco mitochondrial ATPase. After multi-combination feature extraction, high-dimensional feature screening and multi-test independent prediction, the average accuracy of intraspecific prediction was 0.85, with the highest value reaching 0.91, which outperformed previous reference models. By contrast, prediction accuracy was 0.78 between dicotyledons and no more than 0.56 between dicotyledons and monocotyledons, implying a possible similarity in C-to-U editing mechanisms among close relatives. The best model achieved an independent test accuracy of 0.91 and an area under the curve of 0.88, and further suggested that five previously unreported feature sequences, TGACA, ACAAC, GTAGA, CCGTT and TAACA, are closely associated with editing. Multiple evaluations support that iPReditor-CMG can be effectively applied to predict crop mitochondrial editing sites, which may provide insight into their recognition mechanisms and into other post-transcriptional events in crop mitochondria.

Keywords

iPReditor-CMG; RNA editing site; mitochondrial genomes; genomic sequence feature; support vector machine

Subject

BIOLOGY, Plant Sciences
