Preserved in Portico. This version is not peer-reviewed.
Genesis of Non-coding RNA Genes: A Sequence Connection with Protein Genes Separated by Evolutionary Time
Received: 19 July 2020 / Approved: 20 July 2020 / Online: 20 July 2020 (04:39:41 CEST)
A peer-reviewed article of this Preprint also exists.
Journal reference: Non-coding RNA 2020
A small, phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time: the protein genes are the earliest, as they are present in zebrafish and/or mouse genomes, and the long non-coding RNA genes and pseudogenes are the most recent, as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at the evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically in the function of ubiquitin-specific protease (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and of the genesis of the genes that contain it is presented.
Keywords: gene evolution; gene formation; long non-coding RNA genes; pseudogenes; USP18; GGT5
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.