Introduction
Circular RNAs contribute considerably to transcriptome diversity.
Thirty-three years ago, RT-PCR analysis of the tumor suppressor gene DCC (Deleted in Colorectal Carcinoma) revealed that the order of exons was reversed in a small portion of the mRNA. For example, for an RNA corresponding to a genomic clone with the order exon B - exon C - exon D, the orientation exon D to exon B was observed [1]. It was later found that eukaryotic genes form circular RNAs (circRNAs) using canonical splice sites [2,3], which explained the scrambled arrangement. Early reports indicated that a circular RNA made from an archaeal pre-23S rRNA is translated into an endonuclease of 194 amino acids [4].
Technical progress in RNA sequencing then revealed widespread expression of circular RNAs, which are present in all eukaryotes tested [5,6].
Circular RNAs are covalently closed RNAs generated from pre-mRNA. They are less abundant than linear mRNAs, making it necessary to enrich them by rRNA removal and digestion of linear mRNA with RNase R, followed by next-generation sequencing. This approach identifies the backsplice junction, which allows prediction of the RNA structure based on the annotated exons. However, the sequence outside the reads covering the backsplice site remains unknown. Recently, rolling circle amplification of circular RNAs, followed by adaptor addition and long-read nanopore sequencing, allowed identification of full-length circular RNAs [7,8,9]. These circRNAs were collected in databases (Table 1). Together, sequencing efforts identified more than 880,000 backsplicing events in humans that potentially generate more than 1.8 million human circular RNA isoforms [10]. This compares to about 320,000 human mRNA isoforms from annotated genes and over 30 million transcripts from other human genomic loci [11], suggesting that circRNAs play a so far overlooked role in the eukaryotic transcriptome.
Circular RNAs form through backsplicing that depends on intronic regions.
Circular RNAs are generated through backsplicing, where a downstream 5′ splice site is connected to an upstream 3′ splice site (Figure 1). These splice sites can be thousands of nucleotides apart in the pre-mRNA, indicating that additional mechanisms are necessary to bring the backsplice sites into close vicinity. Pre-mRNA secondary structures formed by complementary intronic sequences are likely a major contributor to circRNA formation [12]. The complementary sequences are often provided by repetitive elements, for example Alu elements in primates or B1 elements in rodents. As these genomic elements are species-specific, they could contribute to the species-specificity found in some circRNAs.
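The geometry of a backsplice junction can be illustrated with a minimal Python sketch. It assumes a toy gene given as a list of exon sequences in genomic order; the function name `backsplice`, the exon strings and the flank length are hypothetical and only serve to show that the end of the last circularized exon becomes joined to the start of the first circularized exon, which produces the scrambled exon order seen in sequencing reads.

```python
def backsplice(exons, first, last, flank=3):
    """Return (circRNA sequence, backsplice junction) for a toy gene.

    exons : list of exon sequences in genomic (5' to 3') order
    first : index of the most upstream exon included in the circle
    last  : index of the most downstream exon included in the circle
    flank : number of nucleotides reported on each side of the junction
    """
    if not 0 <= first <= last < len(exons):
        raise ValueError("need 0 <= first <= last < number of exons")
    # The circle contains the exons between the two backsplice sites.
    circ = "".join(exons[first:last + 1])
    # Junction: 3' end of the last exon joined to the 5' start of the first exon.
    junction = exons[last][-flank:] + exons[first][:flank]
    return circ, junction


# Hypothetical four-exon gene; exons 1 and 2 (0-based) are circularized.
toy_exons = ["AAAA", "CCCC", "GGGG", "TTTT"]
circ, junction = backsplice(toy_exons, first=1, last=2)
print(circ, junction)
```

In this toy case the junction reads from the G-exon into the C-exon, the reverse of the genomic order, which is the signature used to detect backsplice junctions in short reads.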
Alu elements are repetitive elements of about 300 nt that comprise 11% of the human genome [13]. RNA in situ conformation sequencing (RIC-seq) showed that Alu elements promoted more than 37% of all RNA-RNA interactions across enhancers [14]. It is likely that Alu elements play a similarly prominent role in forming pre-mRNA structures, which is supported by the findings that in humans circRNA backsplice sites are often surrounded by Alu elements and that numerous circRNAs are human-specific [10]. Alu elements are similar to the 5′ and 3′ ends of the 7SL RNA from which they originated but lack a 155 nt long 7SL-specific part. Alu elements consist of a head-to-tail dimer of two similar sequences, the left and right arms, each of which is about 130 nt long. The arms are separated by a stretch of adenosines, and the right arm contains an additional insert [15]. Reflecting the highly structured 7SL RNA, Alu elements exhibit complementarity through direct RNA interactions, especially when present in a sense to antisense orientation in the pre-mRNA. Double stranded RNAs are the substrate of ADAR enzymes (adenosine deaminase acting on RNA) that change adenosines to inosines. In humans, more than 99% of all edited RNAs are derived from Alu elements [16,17], demonstrating that intramolecular dsRNA regions formed by Alu elements are common in pre-mRNAs.
Mouse B1 elements are related to human Alu elements and are also derived from 7SL-RNA. However, B1 elements are more diverse and shorter (about 140 nt) than Alu elements, consisting of only one monomer, which likely decreases the preponderance of self-complementarity and overall RNA editing [
18].
Backsplicing can also be promoted by proteins. For example, the interaction between Alu-element RNAs can be increased by the splicing factor SLU7 binding to the Alu sequences flanking circCAPG, which promotes circRNA generation [19]. Protein-dependent promotion of circRNA formation can occur without Alu elements, and more than 15 RNA-binding proteins were shown to promote circRNA formation, among them QKI [20], RBM20 [21], hnRNP L and FUS (reviewed in [24]).
Circular RNAs are cytosolic, more stable than mRNAs and use a nuclear-cytoplasmic export pathway different from mRNAs.
Due to their generation through backsplicing, circRNAs lack a 7-methyl guanosine cap at their 5′ end (5′ cap) and also do not have a poly(A) tail at their 3′ end. CircRNAs collected in recent databases have an average (median) length of around 750 nt and about 20% of circRNAs are longer than 500 nt, with up to 81,000 nt in length. Circular RNAs almost always use exons that are already used by linear RNAs, i.e. they are composed of exons annotated in the database [10].
The generation of circRNAs through backsplicing occurs in the nucleus. Once generated, circRNAs form a complex with insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1). This complex is exported into the cytosol through interaction with exportin 2 and Ran-GTP [25]. Thus, the circRNA export mechanism is distinct from that of mRNAs, which rely on recognition of the exon junction by the TREX (transcription export) complex for cytosolic export [26,27]. The nuclear-cytosolic distribution of circRNAs can be influenced by external stimuli. For example, hypoxia leads to a translocation of circLRP6 from the nucleus to the cytosol, possibly mediated by binding to hnRNP M, which accumulates in the cytosol under hypoxic conditions [28].
The high stability of circRNAs is another striking difference between circRNAs and mRNAs. Half-lives have been measured for several circRNAs, for example for circYAP1 [29], circTRCP [30], circARHGAP35 and circHER2 [32]. In cell-based assays, circRNA half-lives were found to be longer than 24 hrs, which is the upper limit of detection, indicating that the true half-lives may be longer.
This compares to a global mRNA half-life of 4.3 hrs, making circRNAs exceptionally stable.
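To put these half-lives into perspective, the following minimal sketch compares how much of an RNA pool would remain after 24 hrs. It assumes simple first-order (exponential) decay, which is an idealization not stated in the text, and uses 24 hrs as a lower bound for the circRNA half-life; the function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of an RNA pool left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)


# Global mRNA half-life from the text (4.3 hrs) versus the 24 hr detection limit
# that serves here as a lower bound for circRNA half-lives.
mrna_left = fraction_remaining(24, 4.3)
circ_left = fraction_remaining(24, 24)
print(f"mRNA (half-life 4.3 hrs): {mrna_left:.1%} left after 24 hrs")
print(f"circRNA (half-life at least 24 hrs): at least {circ_left:.0%} left after 24 hrs")
```

Under this assumption only about 2% of an average mRNA pool would persist after one day, whereas at least half of a circRNA pool would remain.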
Circular RNAs can sequester nucleic acids and proteins.
A) Interaction with nucleic acids
Despite the widespread expression of circRNAs, their biological functions are poorly understood. CircRNAs can sequester nucleic acids and proteins. Binding to miRNAs was the first function identified for circRNAs: the highly abundant circRNA ciRS-7/CDR1as binds to miR-7 and exerts biological effects through miR-7 sequestration (‘sponging’) [34], a mechanism that has since been found for many other circRNAs (reviewed in [35,36]). Binding to DNA via base-pair interaction is also possible. For example, in Arabidopsis thaliana the circular RNA made from exon 6 of the SEPALLATA3 gene binds to its own DNA locus, forming an R-loop of circRNA and DNA [37].
B) Binding to proteins
CircRNAs were shown to bind to numerous proteins, mainly RNA-binding proteins, and computationally predicted binding sites were collected in databases [38]. Sequestration of proteins by circRNAs can have numerous effects on transcription, cell cycle progression, apoptosis and cell migration (reviewed in [39]).
Evidence for translation of circular RNAs
Most circular RNAs contain open reading frames. Initially, over 4,000 circRNAs were predicted to encode proteins [40]; that number has increased to more than 300,000 in the recent transCirc database [41]. There is now an increasing number of studies showing translation of circular RNAs.
Early evidence that circRNAs can be translated came from studies in 1988 using the archaebacterium Desulfurococcus mobilis, where an intron from the pre-23S rRNA is excised as a stable circular RNA that encodes an endonuclease of 194 amino acids that can spread through other strains [4]. In mammalian systems, a circRNA containing exon 2 of SLC8A1 (circSLC8A1, previously named NCX1) was shown in 1999 to encode a protein that transports calcium across a membrane [42].
Proof of principle studies in 1995 showed that circular RNAs can be translated in vitro if they contain internal ribosomal entry sites (IRES) [43], which has been confirmed for different IRESes [44]. Rolling circle translation has been observed even in chemically modified circRNAs containing phosphoramidate linkages [45]. However, the translational potential of cellular circular RNAs was controversial, as initial screens of circular RNAs with open reading frames did not detect translation [46]. Further studies validated the translation of a growing number of circular RNAs in physiological systems (Table 2). Evidence that circRNAs are translated comes from their association with polysomes, indicating active translation; from validation of backsplice site-encoded peptides using mass-spectrometry; and from detection of circRNA-encoded proteins using specific antisera.
Association with polysomes
Over 3,000 publicly available Ribo-seq datasets were analyzed for the presence of circRNAs. This analysis identified 1,969 polysome-associated circRNAs in six different species, which were collected in the riboCirc database that expanded the previous transCirc database [41]. The polysome association was confirmed for numerous transcripts using RT-PCR of independent polysome preparations [48]. Similarly, numerous studies focusing on a defined circRNA showed polysome association, as indicated in Table 2.
Ribosome footprinting experiments showed a significant increase in circular RNAs when the polysomes were prepared in the presence of detergent (0.1% Triton), indicating that circRNA-translating ribosomes are associated with membranes [49]. Since detergent is omitted in most polysome preparation protocols, the actual number of circRNAs associated with polysomes could be underestimated. Translation and polysome association of circular RNAs could be cell-type specific. For example, erythrocytes and platelets contain high levels of circRNAs, but only 32 circRNAs were polysome associated in erythroblasts [50].
Validation by Mass-spectrometry
The translation of several circRNAs was also confirmed by mass-spectrometry, which detects a peptide corresponding to the backsplice junction. One of the first of these proteins was Drosophila muscleblind [49], and subsequently backsplice site peptides of numerous human proteins were identified (Table 2). During sperm development, mRNAs are degraded in pachytene spermatocytes, leading to an enrichment of circular RNAs. Mass-spectrometry analysis using mouse testes identified over 1,600 peptides corresponding to backsplice junctions, indicating widespread circular RNA translation [51]. Analysis of human hearts showed 8,878 circular RNAs, of which 40 were associated with polysomes, and mass-spectrometry evidence for translation was found for six of these proteins [52], showing translation of human circular RNAs in a differentiated tissue.
Detection of circRNA encoded proteins using specific antisera.
Proteins encoded by circRNAs often have specific amino and carboxy termini (Figure 2A), which makes it possible to generate circRNA-encoded protein specific antisera (Table 2). A common problem is that these circRNA-specific sequences, shown in Supplemental Figure 1, are usually short, which can make it difficult to generate specific antisera.
Detection of circRNA encoded proteins using reporter genes.
Transfection studies with reporter genes were first used to prove the possibility of circRNA translation. Usually, these reporter genes contain a protein tag, like a 3x FLAG tag, that is located upstream of the start codon and thus depends on circRNA formation for translation. Specific vectors for expression of proteins from circRNAs have been developed [53]. The circRNA translational reporters often rely on heterologous flanking intronic sequences, provided for example by the highly expressed circZKSCAN1 gene. However, heterologous and authentic flanking introns can give different amounts of circRNAs and circRNA-encoded protein expression [54]. The reporter gene assays can produce artefacts, such as trans-splicing [55] and expression of linear byproducts [56]. These problems can be addressed by controls, such as comparison with mutated splice sites (GT->TT), introduction of in-frame stop codons or splitting the epitope tag across the backsplice site.
General features of proteins made from circular RNAs
In their capacity to encode proteins, circular RNAs are distinct from linear RNAs as they can undergo rolling circle translation, where the ribosome moves around the circular RNA. Due to the backsplicing mechanism, circRNAs lack the first exon of mRNAs, which cannot provide a 3′ splice site.
Circular RNAs can increase proteome diversity through frameshifts during rolling circle translation.
The potential of circRNA rolling circle translation is illustrated by a 220 nt long circular RNA that represents the viroid-like satellite RNA of the rice yellow mottle virus. Viroids are single stranded circular RNAs that infect flowering plants and that potentially are also present in prokaryotes [58].
The 220 nt long viroid-like satellite RNA can undergo three rounds of rolling circle translation where the reading frame shifts during each round of translation. The three frameshifts generate three distinct proteins of 16, 18 and 23 kDa from only 220 nt of genetic information [59].
Circular RNA translation allows the generation of novel proteins that contain parts of the full-length linear proteins. If the number of nucleotides of a circRNA cannot be divided by three (is not an integer multiple of three), frameshifts will occur during translation, which generates proteins with a specific amino terminus, a specific carboxy terminus, or both. If the circular RNA lacks stop codons and the number of nucleotides can be divided by three, multimers of proteins can be generated using rolling circle translation. The number of rounds of translation can range from 1.5 to more than five and depends on the circRNA and cell type. In some cases, the circRNA encodes a completely new protein using a circRNA-specific reading frame [61,62] (Figure 2A).
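The dependence of the reading frame on circRNA length can be made concrete with a small simulation. The sketch below is illustrative only: it uses the standard genetic code, DNA-alphabet toy sequences that are invented for this example, and a hypothetical function name `rolling_circle_translate`. It walks around a circular sequence codon by codon until it meets a stop codon or exceeds a chosen number of rounds.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rolling_circle_translate(circ, start, max_rounds=5):
    """Translate a circular sequence from index `start`, wrapping around.

    Returns (protein, stopped), where `stopped` is True if a stop codon was
    reached before `max_rounds` trips around the circle were completed.
    """
    n = len(circ)
    protein = []
    pos = start
    for _ in range((max_rounds * n) // 3):
        # Read three nucleotides, wrapping across the backsplice junction.
        codon = "".join(circ[(pos + k) % n] for k in range(3))
        aa = CODON_TABLE[codon]
        if aa == "*":
            return "".join(protein), True
        protein.append(aa)
        pos += 3
    return "".join(protein), False


# Length divisible by three, no in-frame stop: the same unit repeats (multimer).
print(rolling_circle_translate("ATGGCC", start=0, max_rounds=2))
# Length not divisible by three: the frame shifts after the first round and
# a stop codon (TGA) is formed across the junction, overlapping the ATG.
print(rolling_circle_translate("ATGGCCTG", start=0))
```

In the second toy sequence the stop codon shares its last nucleotide with the start codon, so translation ends after slightly more than one trip around the circle, whereas the first sequence keeps producing repeats of the same peptide until the round limit is reached.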
Backsplicing frequently removes membrane-translocation signals.
A common feature of circRNA-encoded proteins is that they change membrane association, which affects the intracellular localization. Proteins can be inserted into the membrane or enter the endoplasmic reticulum when they express a signal peptide at their amino terminus. This 16-30 amino acid long hydrophobic sequence binds to the signal recognition particle (SRP) and enters the endoplasmic reticulum through the SRP receptor. The signal peptide is usually located in the first exon, which lacks a 3′ splice site and can thus not participate in backsplicing. Thus, genes encoding proteins localized in the endoplasmic reticulum, Golgi or endosomes, as well as membrane proteins, can generate cytosolic variants using backsplicing (Figure 2B), which has been confirmed for circCadherin and circINSIG1 [65].
Genes with non-coding exons in the 5′ UTR are exceptions to this mechanism. For example, the first exon of presenilin 1 pre-mRNA is non-coding and the circPSEN1 RNA formed by backsplicing to exon 2 contains the start codon with the signal peptide. CircPSEN1 RNA is present in polysomes and thus likely translated.
Currently known translated circular RNAs
Experimentally validated circRNA-encoded proteins are summarized in Table 2. Most of the circRNA-encoded proteins play a role in cancer, which reflects the current research focus and is possibly also due to hypoxic conditions in tumors that promote circRNA translation. The structures of the proteins are given in Supplemental Table 1, where circRNA-specific protein parts are indicated. Reflecting the backsplicing mechanism, the circRNA-encoded proteins are smaller than mRNA-encoded proteins, with an average size of 268 aa (human mean 456 aa, [66]). 36% (12/33) of the experimentally studied proteins show a high propensity (>0.8) for phase separation calculated by MolPhase [67], indicating that they could form aggregates. The mean content of molecular recognition features (MoRFs) in circRNA-encoded proteins is 12.75%, compared to 1% for eukaryotic proteins, indicating that circRNA-encoded proteins have a high likelihood to interact with other proteins and adopt a structure upon binding (Supplemental Table 2). Thus, circRNA-encoded proteins likely have different biophysical characteristics than mRNA-encoded proteins and could change cellular properties once translated.
Mechanisms of circular RNA translation
Cap dependent translation initiation of mRNAs
Cellular circRNAs were initially considered non-protein-coding RNAs because they lack a 5′ 7-methyl guanosine cap or known ribosomal entry sites necessary for translational initiation. Thus, unlike the vast majority of mRNAs, circRNAs cannot be translated in a cap-dependent manner. In cap-dependent initiation, a ternary complex of the eukaryotic initiation factor eIF2, Met-tRNA(i) and GTP binds to the 40S ribosomal subunit with the help of eIF1, eIF1A, eIF3 and eIF5, generating the 43S ribosomal pre-initiation complex. This 43S complex binds to the mRNA cap via the initiation factor eIF4E, together with eIF4G and eIF4A as well as PABPC2 (polyA binding protein C2). The 43S pre-initiation complex then scans the mRNA for the AUG start codon. Thus, the translational initiation complex ‘sees’ a circular RNA structure, mediated by protein interactions between the 7-methyl guanosine cap (7mG) and the poly adenosine tail [69,70] (Figure 3A).
Cap independent translation initiation of mRNAs
Under cellular stress conditions caused by hypoxia, nutrient starvation, oxidative or endoplasmic reticulum stress, mRNAs can be translated without a 7mG cap [70]. Cap-independent translation was first discovered in viruses, where an internal ribosomal entry site (IRES) is present [71]. The IRES forms tertiary RNA structures that interact with the 40S ribosomal subunit, allowing recruitment of the 43S pre-initiation complex (reviewed in [72]). IRESes act through various mechanisms, including direct rRNA-IRES contacts and contacts mediated by trans-acting factors.
After their discovery in viruses, IRESes have been identified in many cellular mRNAs, and cap-independent translation using IRES sequences is potentially used by up to 15% of human mRNAs [73,74].
A second mechanism of cap-independent translation is methylation of adenosine at the N6 position, called m6A. m6A modifications are among the most common RNA modifications and are likely present in all classes of RNAs (reviewed in [75]). The m6A methylation is catalyzed by METTL3 (methyltransferase like 3), which needs cofactors for positioning; for circRNAs, this is provided by METTL14. m6A modifications are reversible and can be removed by fat mass and obesity associated protein (FTO) and AlkB homolog 5 (ALKBH5). m6A modifications are recognized by the YTH domain (YT521-B homology, named after YT521-B, now YTHDC1 [76]). m6A modifications influence RNA stability, tertiary structure, subcellular localization, and pre-mRNA processing. m6A modification promotes circRNA formation, likely mediated by its nuclear reader YTHDC1 [77]. YTHDF1 promotes translation of m6A-modified mRNAs by interacting with eIF3 [78], which can make mRNA translation independent of the 7mG cap under cellular stress conditions (Figure 2B).
Thus, numerous mRNAs undergo cap-independent translation, which usually occurs under cellular stress conditions such as hypoxia and is frequently observed in cancer [70].
Bacterial RNA lacks a 5′ cap, and translational initiation is mediated by mRNA-ribosome interaction. This explains why circular RNAs can also be translated in bacteria, which has been observed in the first translated circRNAs from archaea and has been suggested as a protein expression platform. In this approach, circRNAs are generated as a group I intron and are translated by E. coli if the Shine-Dalgarno sequence and the downstream sequence element surrounding the AUG start codon are present [80]. This mechanism could be physiologically important, as viroid-like circular RNAs were discovered in the microbiome of the human gut [81].
Given the widespread use of cap-independent translation in cellular mRNA, it is not surprising that circular RNAs can be translated as well, especially under stress conditions in cancer. Several models have been proposed for circRNA translation [82]: internal ribosomal entry sites (IRES), m6A methylation, adenosine to inosine RNA editing, and 40S recruitment via eIF4A3 that is deposited on exon junctions.
Internal ribosomal entry sites of circRNAs
The introduction of viral IRESes into circRNAs strongly increased translation through eIF4G recruitment [43]. A systematic screen for ribosomal entry sites using GFP reporters identified 97 purine-rich hexamer sequences in circRNAs that promote translation [83]. The identified hexamers partially overlap with m6A sites, but given the presence of adenosines these sequences could also be substrates for A>I editing. The potential m6A sites are annotated in databases, and computational analysis using known translated circRNAs identified short, A-U rich putative ribosomal entry sites [84]. A screen that looked at longer RNA regions, taking circRNA structure and complementarity to 18S rRNA into account, identified over 17,000 possible circRNA IRES sequences [85], possibly indicating that circRNAs use several molecular mechanisms for translational initiation.
m6A methylation of circRNAs promotes their translation.
Using an siRNA screen, it was found that the m6A reader YTHDF3 is responsible for the translation of circRNAs, whereas YTHDF3 knockout did not affect mRNA translation. YTHDF3 binds directly to eIF4G2, leading to cap-independent circRNA translation [77,86]. At least 13% of circRNAs show m6A modification, indicating that m6A could contribute to the translation of numerous circRNAs. Analysis of pachytene spermatocytes showed that backsplicing occurred mostly at m6A-enriched sites, resulting in the validated translation of 1,600 circular RNAs [51], demonstrating a general role of m6A modification in circRNA translation.
Splicing-promoted translation of circular RNAs through the exon junction complex.
During the splicing reaction, an exon junction protein complex is deposited on the nascent mRNA. The exon junction complex is necessary for RNA export into the cytosol and is composed of eIF4A3, MAGOH, Y14/RBM8A and CASC3 [87,88]. Every circRNA generated by backsplicing contains an exon junction and thus possibly has eIF4A3 bound. eIF4A3 likely plays a role in circRNA biogenesis, as it promotes the formation of some circRNAs [89,90,91]. eIF4A3 initiates translation by recruiting eIF3 through direct interaction with its subunit eIF3G. eIF3 recruits the small 40S ribosomal subunit that scans for a start codon, which initiates translation. This mechanism has been found for luciferase reporter genes and for translation of the circCTNNB1 (beta catenin) RNA [92], circINSIG1 and succinate dehydrogenase assembly factor 2 (circSDHAF2) [93]. CircINSIG1 translation is likely promoted by hypoxia acting on eIF4A3. Notably, circSDHAF2 contains no known m6A-enhancers or m6A modifications. Its translation is further enhanced by the presence of introns, suggesting that translation of this circRNA mainly depends on the exon junction complex [93].
Inosine-dependent translational initiation
CircRNAs form intermolecular double stranded regions [54,57], making them substrates for ADAR enzymes (adenosine deaminase acting on RNA) that recognize double stranded RNA structures and convert adenosines into inosines [94]. A screen of trans-acting factors found that ADAR1 and ADAR2 strongly increase translation of the circTau RNAs [54]. In humans, there are three different ADAR genes: ADAR1-ADAR3. ADAR1 is expressed in all tissues with an interferon-induced cytosolic (p150) and a constitutive nuclear (p110) isoform; ADAR2 is weakly expressed in brain, and the catalytically inactive ADAR3 is highly expressed in brain. ADAR1-p150 has the strongest effect on circRNA translation [95], possibly due to its cytosolic localization. ADAR activity results in recoding of RNAs, as an inosine is read as a guanosine, and thus an AUA changed into an AUI could serve as a start codon. This recoding was shown for circTau 12->10 RNA, generated by backsplicing of exon 12 to exon 10. This 288 nt long circRNA contains an indefinite open reading frame but lacks a start or stop codon. ADAR activity changes an AUA to AUI, which initiates translation in reporter gene assays, showing that in principle ADAR activity can change the amino acid usage of circRNAs [54] (Figure 2). RNAseq of reporter genes in the presence of co-transfected ADAR enzymes showed a widespread change of adenosines to inosines. In most positions, less than 10% of a given adenosine site is changed. The editing profile depends on the flanking introns, suggesting that at least some of the editing occurs in the nucleus. The major substrates for ADAR enzymes in humans are RNAs derived from Alu elements [16], suggesting that human circular RNAs surrounded by Alu elements might show differences in translation. The circTau reporter genes did not show a strong increase in translation when m6A-dependent circRNA translation was initiated. However, when both pathways were activated, an additive effect was seen [96].
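The recoding idea, where an AUA becomes a start codon because the edited third position (inosine) is read as guanosine, can be sketched as a simple search. The code below is a hypothetical illustration, not the analysis used in the cited studies: the function names and the toy RNA sequences are invented, and editing efficiency, RNA structure and sequence context are ignored.

```python
def candidate_edited_starts(circ_rna):
    """Indices where an AUA triplet begins in a circular RNA sequence.

    Editing the last A of such a triplet to inosine yields AUI, which is
    read like AUG. The search wraps across the backsplice junction.
    """
    n = len(circ_rna)
    hits = []
    for i in range(n):
        triplet = "".join(circ_rna[(i + k) % n] for k in range(3))
        if triplet == "AUA":
            hits.append(i)
    return hits


def read_after_editing(circ_rna, edited_positions):
    """Sequence as decoded by the ribosome: edited adenosines are read as G."""
    seq = list(circ_rna)
    for p in edited_positions:
        if seq[p] != "A":
            raise ValueError(f"position {p} is not an adenosine")
        seq[p] = "G"
    return "".join(seq)


toy = "GCAUACC"                      # hypothetical circRNA without an AUG
starts = candidate_edited_starts(toy)
third_positions = [(i + 2) % len(toy) for i in starts]
print(starts, read_after_editing(toy, third_positions))
```

Because the search wraps around the circle, an AUA that spans the backsplice junction is also reported, which has no counterpart in the linear transcript.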
The trans-acting factors connecting circRNA translation to their A>I editing are unknown. Several proteins, NOGO, p54nrb and vigilin/HDLBP, were shown to bind predominantly to inosines, and vigilin/HDLBP directly interacts with the 40S ribosomal subunit [98], indicating that vigilin/HDLBP might promote translational initiation of edited circRNAs, i.e. it could act as a reader for inosines.
Circular RNAs as inhibitors of mRNA translation
Several reports indicate that circular RNAs interact with ribosomes, which can inhibit translation. For example, circHIPK2 binds directly to the pre-ribosomal initiation factor RPL7, which inhibits myogenesis [99]. Similarly, circTRPS1 binds to several ribosomal proteins, which reduces translation [100]. Testing translation of specific mRNAs, it was found that circMALAT1 inhibits PAX5 mRNA translation [101] and that circSMAD2 sequesters eIF4A3, which inhibits translation of linear SMAD2 [102].
Outlook
CircRNAs are translated under conditions that favor cap-independent translation, which in most cases studied is cellular stress seen in cancer or in neurodegeneration. Thus, most circRNA-encoded proteins are associated with cancer, which likely also reflects a research focus. In addition, circular RNA translation becomes apparent when mRNAs are removed, as in stages during sperm cell development. There are at least three different translational mechanisms for circRNAs: internal ribosomal entry sites, m6A modification, and RNA editing.
The number of known circRNA isoforms far exceeds the number of mRNA isoforms, suggesting that proteins encoded by circRNAs can provide special functions to an organism, especially the nervous system, which shows the highest expression of circRNAs. Some circRNAs like LINC-PINT are derived from non-coding RNAs and could thus lead to proteins made from non-coding RNAs and potentially other uncapped RNAs such as DWORF [103].
CircRNAs are highly stable, allowing the accumulation of RNA modifications, which is the basis of at least two circRNA translational mechanisms. As more than 100 RNA modifications are known, it is possible that other RNA modifications could lead to circRNA translation, and there is some evidence that N1-methyladenosine could contribute to circRNA translation [104].
The stability of circRNAs makes them attractive as vectors for protein delivery and circRNAs have been suggested as vaccine platforms [
105]. The expression of some circRNA-encoded proteins is tumor-specific, making them targets for possible anti-tumor vaccines [
106,
107,
108]. Finally, circRNAs could be therapeutic targets using siRNAs that target the unique backsplice junction [
109,
110].
Table 1.
Resources and databases for circular RNAs. List of web-based resources useful to study circRNAs.
Resource | Description | Reference
FL-circAS | Full-length circular RNA sequences, shows internal sequences and alternative splicing | [10]
circVis | Visualization of circRNA | [111]
circNET 2.0 | Regulatory network in cancer | [112]
circMine | Disease related circRNAs | [113]
riboCirc | Translatable circRNA | [47]
transCirc | Translatable circRNA | [41]
Circ2Disease | circRNAs in human disease | [114]
CircInteractome | Database of circRNA interaction with miRNAs and proteins | [38]
Table 2.
Validated proteins made from circular RNAs. Summary of validated proteins encoded by circRNAs. Alu elements: presence of Alu elements in introns directly flanking the first and last exon of the circRNA, L: left intron, R: right intron, B: both introns. For iORFs generated by rolling circle, the length of one round is indicated. RC: rolling circle. aa_nb: number of amino acids.
# |
Name |
aa_nb |
validation |
function |
Summary function |
circRNA length [n] |
Alu elements |
Number of exons |
References |
1 |
LINC-PINT |
87 |
Mass-spectrometry, antiserum |
Suppresses glioblastoma cell proliferation |
LINC-PINT Long Intergenic Non-Protein Coding RNA, (also p53 Induced Transcript) forms a circular RNA from exon 2, There is no protein translation from the LINC-PINT transcript, however, there is translation from the circular RNA. The 87 amino acid long protein binds to PAF1 (polymerase associated factor (PAF1) complex. The protein could arrest the PAF1 complex on promoters of oncogenes, leading to a suppression of cell proliferations in glioblastoma. |
1084 |
no |
1 |
[61] |
2 |
circSHPRH |
146
|
Mass-spectrometry, antiserum |
Inhibits tumor cell proliferation |
SHPRH (Histone Linker PHD RING Helicase [SNF: sucrose non fermenting]), forms a 440 nt long circRNA through backsplicing from exon 29 to 26. The 440 nt long circRNA is translated into circSHPRH-146aa. The protein could act by sequestering the ubiquitin E3 ligase DTL that then no longer acts and destabilizes the linear SHPRH protein. The full length SHPRH causes PCNA degradation, and thus circSHPRH-146aa indirectly stops proliferation of glioblastoma cells. circSHPRH-146 is downregulated in glioblastoma. The protein starts and stops within the same 4 nucleotides: TGATG it starts at ATG, and due to frameshift uses TGA as stop. |
440 |
L |
4 |
[115,116] |
3 |
circFBXW7 |
185 |
Mass-spectrometry, antiserum |
represses glioma tumorigenesis |
FBXW7 (F-Box And WD Repeat Domain Containing 7, E3 ) is part of an ubiquitin ligase complex acting as a E3 ligase. The circFBXW7 protein corresponds to the N-terminus and lacks the WXD40 domain necessary for substrate recognition. circFBXW7 competes with the linear FBPXW7 for binding to the deubiquitinase USP28. Thus, an increase in amount of circFBXW7 frees linear FBPXW7 protein for substrate degradation, which was shown for c-Myc. circFBXW7 interacts with catenin, influences Wnt signal, leading to cancer cell resistance |
620 |
no |
2 |
[117,118] |
4 |
circCDH1 |
254 |
Mass-spectrometry, antiserum |
Activates EGF receptor |
circCDH1 (Cadherin 1) is generated through backsplicing from exon 10 to 7 in the E-cadherin gene. Due to a frameshift after one round of translation, a unique C-terminus is created. The circRNA was detected in 84% of gliomas, but not in controls. circCDH1 activates STAT3, PI3K-AKT and MAPK-ERK signaling in glioblastoma. In contrast to linear E-cadherin, which is localized in the plasma membrane, circCDH1 is secreted and binds and activates EGFR using its circRNA-specific region. The circRNA-encoded protein contains cadherin domains but lacks a transmembrane domain and the signal peptide. The activation of EGFR promotes tumor formation and prevents the therapeutic effect of an anti-EGFR antibody (nimotuzumab). |
733 |
B |
4 |
[64] |
5 |
circINSIG1 |
131 |
Mass-spectrometry, antiserum |
Induces cholesterol biosynthesis and colorectal cancer progression. |
INSIG1 (Insulin Induced Gene 1) is an endoplasmic reticulum membrane domain protein with six transmembrane regions. Backsplicing from exon 4 to 3 generates circINSIG1 that has a specific C-terminus. circINSIG1 recruits a ubiquitination adaptor complex made from the proteins CUL5 and ASB6. This complex promotes ubiquitination and degradation of the linear INSIG1 protein, which promotes cholesterol biosynthesis and colorectal cancer proliferation and metastasis. |
292 |
no |
2 |
[65] |
6 |
circNFIB |
RC |
Polysome |
Inhibits breast tumor growth, decreases arachidonic acid |
NFIB (Nuclear Factor I B) is a transcriptional activator. Backsplicing of exon 6 to 3 generates a 361 nt circular RNA that undergoes rolling circle translation. circNFIB is downregulated in breast cancer. Its overexpression prevents cancer cell proliferation, whereas its knockdown has the opposite effect. |
361 |
no |
4 |
[119] |
7 |
circZNF609 |
250 |
Polysome, reporter gene |
Controls myoblast proliferation |
ZNF609 (zinc finger protein 609) forms a circRNA through backsplicing of exon 2. |
874 |
B |
1 |
[120] |
8 |
circYAP |
220 |
Mass-spectrometry, antiserum, polysome |
Binds to LATS1 |
circYAP is generated through backsplicing of exon 7 to 2 of YAP1. circYAP competes with linear YAP for binding to the kinase LATS1, causing loss of phosphorylation of linear YAP, which then translocates into the nucleus and turns on an oncogenic transcription program. The translation occurs via METTL3/14, YTHDF3 and eIF4G2. |
842 |
B |
5 |
[29] |
9 |
circFAM53B |
219 |
Mass-spectrometry, antiserum |
Inhibits tumor growth |
circFAM53B (Family with Sequence Similarity 53 Member B) regulates the Wnt signaling pathway by controlling beta-catenin (CTNNB1) nuclear localization. |
659 |
no |
1 |
[106] |
10 |
circ Beta TRCP (HUGO name: BTRC) |
343 |
Mass-spectrometry, antiserum |
Protein mediates trastuzumab resistance by binding to the NRF2 transcription factor |
Beta-Transducin Repeat Containing E3 Ubiquitin Protein Ligase (BTRC) generates circ Beta TRCP through backsplicing of exon 13 to exon 7. circ Beta TRCP contains WD40 repeats that bind to NRF2 and promote trastuzumab resistance in breast cancer. In contrast to the linear protein, circ Beta TRCP lacks an F box needed to bind to the ubiquitination complex SKP1-Cul1-Rbx1. Thus, circ Beta TRCP prevents NRF2 from being ubiquitinated, leading to an increase in NRF2, which promotes HER2-positive breast cancer. |
913 |
B |
7 |
[30] |
11 |
circMET |
404 |
polysomes, antiserum, mass-spectrometry |
Promotes glioblastoma |
MET (MET Proto-Oncogene, Receptor Tyrosine Kinase) generates circMET through backsplicing of its exon 2, which is the first coding exon in the pre-mRNA. The circRNA is translated after undergoing m6A modification, mediated by YTHDF2. The circMET protein contains the signal peptide, is secreted, and binds to the extracellular domain of the linear MET protein, which promotes dimerization without the physiological HGF (hepatocyte growth factor) ligand. circMET promotes glioblastoma tumorigenicity through MET activation. |
1214 |
no |
1 |
[121] |
12 |
circCAPG |
171 |
Mass-spectrometry |
Promotes breast cancer by binding of serine/threonine kinase 38 (STK38) to SMAD-specific E3 ubiquitin protein ligase 1 (SMURF1) |
CAPG (Capping Actin Protein, Gelsolin Like) binds to the barbed ends of F-actin filaments regulating the filament’s length. Backsplicing of exon 8 to exon 6 generates circCAPG. The circCAPG levels are elevated in triple negative breast cancer and promote tumor growth. It sequesters serine threonine kinase 38, which ultimately prevents MEKK2 proteasomal degradation. The formation of the circRNA is repressed by the splicing factor SLU7, possibly acting on the flanking Alu elements. |
376 |
B |
3 |
[19] |
13 |
circRSRC1 |
161 |
polysomes, mass-spectrometry |
Regulates assembly of mitochondrial ribosomes. Loss of the protein decreases male fertility. |
RSRC1 (Arginine and Serine Rich Coiled-Coil 1) acts in alternative splicing. Backsplicing from exon 3 to 2 creates circRSRC1, which is highly conserved between mouse and human. Knockout in mice reduced spermatogenesis. circRSRC1 binds to C1qbp (Complement C1q Binding Protein), a multifunctional protein. Through interaction with C1qbp, circRSRC1-161aa could influence mitochondrial ribosome assembly. |
322 |
R |
2 |
[122] |
14 |
circTMEFF1 |
RC3x |
polysomes, reporter genes |
Promotes muscle atrophy through binding to TDP-43 |
TMEFF1 (Transmembrane Protein with EGF Like And Two Follistatin Like Domains 1) is involved in receptor signaling. It forms a circular RNA by backsplicing of exon 7 to 5 that is upregulated in muscular atrophy. circTMEFF1 promotes atrophy in cell and mouse models, which can be antagonized by siRNAs. |
339 |
R |
3 |
[123] |
16 |
circPPP1R12A |
72 |
Mass-spectrometry from reporter constructs |
Activates YAP, promotes metastasis |
PPP1R12A (Protein Phosphatase 1 Regulatory Subunit 12A), also known as Myosin phosphatase target subunit 1, regulates myosin-actin interaction. Backsplicing of exon 25 to 24 generates circPPP1R12A. The encoded protein is in a different reading frame than the linear protein and has no orthologs in the protein database. The protein promotes metastasis by indirectly affecting the phosphorylation of YAP, which activates oncogenes. |
1138 |
R |
2 |
[62,124] |
17 |
circGSPT1 |
238 |
mass-spectrometry |
Binds to vimentin; tumor suppressor |
GSPT1 (G1 To S Phase Transition 1), also known as eukaryotic release factor 3A (eRF3A), acts in translational termination. Backsplicing from exon 11 to 4 generates circGSPT1, which promotes autophagy and apoptosis in cancer cell models by binding to Vimentin/beclin/14-3-3. circGSPT1 is downregulated in gastric cancer and acts like a tumor suppressor. |
826 |
L |
6 |
[125] |
18 |
circMAPK14 |
175 |
mass-spectrometry |
Binds to MKK6 |
MAPK14 (Mitogen-Activated Protein Kinase 14) is a serine/threonine kinase activated by environmental stress or cytokines. It generates circMAPK14 through backsplicing of exon 10 to 4. circMAPK14 is downregulated in colorectal cancer and reduces proliferation. Both linear and circular MAPK14 bind to MAP2K6 (aka MKK6, Mitogen-Activated Protein Kinase Kinase 6), and circMAPK14 antagonizes the interaction between linear MAPK14 and MAP2K6, leading to a change in a transcriptional program. |
506 |
L |
6 |
[126] |
19 |
circPLCE1 |
411 |
mass spectrometry from a reporter gene |
Influences NF kappa B through ubiquitination |
PLCE1 (Phospholipase C Epsilon 1) hydrolyzes phosphatidylinositol-4,5-bisphosphate, generating inositol 1,4,5-triphosphate (IP3) and diacylglycerol (DAG). Backsplicing of its exon 2 generates circPLCE1, which is downregulated in colorectal carcinoma. circPLCE1 inhibits cancer cell proliferation and metastasis by sequestering HSP90alpha in the HSP90alpha/RPS3 complex, leading to RPS3 ubiquitination and degradation. This pathway is disturbed in cancer cells due to the reduced circPLCE1 expression, which ultimately leads to NF kappa B activation in the nucleus and a tumor program. |
1570 |
no |
1 |
[127] |
20 |
circCTNNB1 |
370 |
mass spectrometry from a reporter gene |
Promotes migration and proliferation of cancer cells |
CTNNB1 (Catenin Beta 1) is part of adherens junctions that regulate cell-cell interactions and forms circCTNNB1 by backsplicing of exon 7 to 2. circCTNNB1 is upregulated in liver cancer and non-small cell lung carcinoma tissue and competes with the full-length CTNNB1 for phosphorylation by GSK3. As the phosphorylation leads to degradation of the protein, circCTNNB1 ‘protects’ full-length CTNNB1 from degradation, allowing the full-length protein to activate the Wnt/beta-catenin pathway. |
1068 |
no |
5 |
[128,129] |
21 |
circARHGAP35 |
1289 |
Mass-spectrometry of reporter genes, polysomes |
Encoded protein promotes cancer cell progression |
ARHGAP35 (Rho GTPase Activating Protein 35) is a cytosolic GTPase activating protein. It creates circARHGAP35 by backsplicing exon 2 to exon 3. The circRNA is upregulated in hepatocellular carcinoma and promotes proliferation and metastasis. In contrast to the cytosolic linear protein, the circRNA-encoded protein is nuclear and still contains FF domains, necessary for binding to TFII-I, which likely starts a carcinogenic expression program. |
4014 |
L |
2 |
[31] |
22 |
circEGFR |
RC |
mass-spectrometry using reporter cells |
Prevents receptor endocytosis |
EGFR (Epidermal Growth Factor Receptor) is a transmembrane protein acting as protein kinase upon acidification. Backsplicing of exon 15 to 14 creates circEGFR, which encodes an infinite open reading frame (iORF). The circRNA-encoded protein remains associated with the cell membrane and prevents endocytosis and inactivation of the receptor. The circRNA is upregulated in glioblastoma and expression levels correlate with survival. |
249 |
B |
2 |
[60] |
23 |
circAPP |
175 |
mass-spectrometry from reporter genes and brain tissue |
circAPP was identified in sporadic AD |
APP (Amyloid Beta Precursor Protein) is a cell surface receptor that is cleaved by secretases into a number of peptides, some of which form protein aggregates involved in Alzheimer’s disease. The gene creates a circular RNA by backsplicing exon 17 to 14. |
524 |
B |
4 |
[130] |
24 |
circFNDC3B |
218 |
mass spectrometry |
Inhibits proliferation of cancer cells |
FNDC3B (Fibronectin Type III Domain Containing 3B) is a single pass membrane protein present in the endoplasmic reticulum and the plasma membrane. It generates circFNDC3B through backsplicing from exon 6 to 5. circFNDC3B expression is downregulated in colorectal cancer cells. circFNDC3B inhibits the proliferation of colorectal cancer cells, possibly by inhibiting the expression of SNAIL, a transcriptional regulator that inhibits FBP1 (Fructose-Bisphosphatase 1) expression. |
526 |
L |
2 |
[131] |
25 |
circRTN4 (mouse) |
RC |
mass-spectrometry using reporter constructs |
To be determined |
Mouse Reticulon 4 (RTN4) generates circRTN4 by backsplicing exon 3 to 2. |
2457 |
B |
2 |
[132] |
26 |
circNlgn (mouse) |
173 |
Antisera, mass-spectrometry |
Promoter activation |
Neuroligin 1 is a neuronal cell surface protein. It generates circNlgn through circularization of its first coding exon, which includes the signal peptide. The encoded protein has 9 circRNA-specific amino acids at its C-terminus that interact with Lamin B1, leading to the nuclear localization of the circRNA-encoded protein, where it activates promoters of the ING4 and C8orf44-SGK3 genes, which affects cardiac fibroblast proliferation. Transgenic mice with circNlgn-173 have a heart phenotype. |
813 |
B |
1 |
[133] |
27 |
circAXIN1 |
295 |
Antisera and mass-spectrometry |
Highly expressed and associated with lymph node metastasis in gastric cancer |
AXIN1 (Axis Inhibition Protein 1) encodes a protein phosphatase. Circularization of exon 2 generates circAXIN1, which contains the start codon of the linear protein but has a unique C-terminus consisting of the two amino acids TD. circAXIN1 binds competitively to APC, i.e., it displaces linear AXIN1. This releases beta-catenin from a larger cytosolic complex. Beta-catenin translocates into the nucleus and activates the Wnt pathway, promoting gastric cancer cell growth. |
959 |
R |
1 |
[134] |
28 |
circEIF6 |
224 |
Polysome, mass-spectrometry, antisera |
Promotes proliferation and invasion of triple negative breast cancer cells |
eIF6 (Eukaryotic Translation Initiation Factor 6) prevents the association of the 40S and 60S ribosomal subunits during translation. It is also known as Integrin beta 4 binding protein (ITGB4BP) and is part of hemidesmosomes that link the basal membrane to the cytoskeleton. Backsplicing from exon 7 to exon 3 generates circEIF6, which promotes proliferation and invasion of triple negative breast cancer cells. It binds to MYH9, which prevents MYH9 ubiquitination and degradation, leading to an activation of the Wnt/beta-catenin pathway. |
906 |
R |
5 |
[135] |
29 |
circHER2 |
103 |
Polysome, mass-spectrometry, antisera |
Promotes heterodimerization between EGFR and HER3 receptor |
HER2 (Erb-B2 Receptor Tyrosine Kinase 2) is part of the epidermal growth factor receptor family. Through heterodimerization it enhances ligand binding and activation of other receptor kinases. Through backsplicing from exon 7 to 3, it creates circHER2, which is 100% identical in sequence to the linear one. circHER2 expression correlates with poor survival in breast cancer. |
750 |
no |
5 |
[32] |
30 |
circMAPT 12 to 7 |
RC |
Reporter genes |
Tau aggregation |
MAPT (microtubule associated protein tau) generates circTau RNA through backsplicing of exon 12 to 7. The encoded protein promotes aggregation of linear tau protein in reporter cells. |
752 |
B |
5 |
[54] |
31 |
circMAPT 12 to 10 |
RC |
Reporter genes and mass spectrometry |
Tau aggregation |
MAPT (microtubule associated protein tau) generates circTau RNA through backsplicing of exon 12 to 10. This circTau RNA lacks a start codon, but is translated after undergoing A>I editing, likely creating an AUI start codon. |
288 |
B |
3 |
[54] |
32 |
circMAN2A1-400 |
186 |
Reporter genes |
circMAN2A1 expression correlated with AD |
circMAN2A1 RNA expression correlates with Alzheimer’s disease progression. The encoded protein lacks a catalytic domain. |
400 |
B |
2 |
[95] |
33 |
circSDHAF2 |
146 |
Reporter genes |
Tumor growth |
SDHAF2 (Succinate Dehydrogenase Complex Assembly Factor 2) is necessary for the flavination of succinate dehydrogenase complex subunit A. SDHAF2 creates circSDHAF2 by backsplicing exon 4 to 3. Overexpression of circSDHAF2 promotes tumor growth. |
334 |
B |
2 |
[93] |
34 |
circSLC8A1 |
605 |
Western blot using microsomes from rabbit heart |
Ca++ exchanger |
SLC8A1 (previously called NCX1) generates circSLC8A1 through backsplicing of exon 2. The protein lacks the hydrophobic domain of the linear protein but can still transport Ca++ across a membrane. |
1832 |
no |
1 |
[42] |
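Several entries in the table depend on reading-frame arithmetic around the circle: a 146 amino acid protein from a 440 nt circRNA (circSHPRH) requires the ribosome to pass the start codon a second time, and entries marked RC lack an in-frame stop codon altogether. The following Python sketch illustrates that arithmetic. The function names and the toy sequences are hypothetical and were chosen for illustration only; they are not real circRNA sequences and not a method taken from the cited studies.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_codons_on_circle(seq, start):
    """Count sense codons read from `start` around a circular sequence.

    Returns the number of codons before the first in-frame stop codon,
    or None if no stop is ever reached (a candidate rolling circle ORF).
    """
    n = len(seq)
    # After n codons (3 * n nt, i.e. three laps) the ribosome is back at
    # `start` in the original frame, so no new codon can appear beyond that.
    for k in range(n):
        pos = start + 3 * k
        codon = "".join(seq[(pos + i) % n] for i in range(3))
        if codon in STOP_CODONS:
            return k
    return None

def orf_summary(circ_len_nt, protein_len_aa):
    """Relate a reported protein length to the reported circle length."""
    orf_nt = 3 * protein_len_aa + 3  # sense codons plus one stop codon
    return {
        "orf_nt": orf_nt,
        "frame_shift_per_lap": circ_len_nt % 3,
        "passes_start_again": orf_nt > circ_len_nt,
    }

# Toy circle of 11 nt (11 mod 3 = 2): the last two nucleotides "TG" and the
# "A" of the start codon form the stop codon TGA across the junction.
print(orf_codons_on_circle("ATGGCCGCCTG", 0))  # 3
# Toy circle of 6 nt without any in-frame stop codon.
print(orf_codons_on_circle("ATGGCC", 0))       # None
# Values reported for circSHPRH in the table: 440 nt, 146 amino acids.
print(orf_summary(440, 146))
```

For the circSHPRH values, the open reading frame including the stop codon is 441 nt, one nucleotide longer than the circle, which is consistent with a stop codon that overlaps the start codon.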