Preprint. Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Principal Component Analysis of RNA-seq Data Unveils a Novel Prostate Cancer-Associated Gene Expression Signature

Version 1 : Received: 5 February 2021 / Approved: 9 February 2021 / Online: 9 February 2021 (10:26:47 CET)

How to cite: Perera, Y.; Gonzalez, A.; Perez, R. Principal Component Analysis of RNA-seq Data Unveils a Novel Prostate Cancer-Associated Gene Expression Signature. Preprints 2021, 2021020234. https://doi.org/10.20944/preprints202102.0234.v1

Abstract

Prostate cancer (PCa) is a highly heterogeneous disease and the second most common tumor in males. Molecular and genetic profiles have been used to identify subtypes and guide therapeutic intervention. However, roughly 26% of primary PCa cases are driven by unknown molecular lesions. We use Principal Component Analysis (PCA) and a custom RNA-seq data normalization to identify a gene expression signature that segregates primary prostate adenocarcinoma (PRAD) from normal tissues. This Core-Expression Signature (PRAD-CES) includes 33 genes and accounts for 39% of data complexity along the PC1-cancer axis. The PRAD-CES is populated by protein-coding genes (AMACR, TP63, HPN) and RNA genes (PCA3, ARLN1) sparsely found in previous studies, validated/predicted biomarkers (HOXC6, TDRD1, DLX1), and/or cancer drivers (PCA3, ARLN1, PCAT-14). Of note, the PRAD-CES also comprises six over-expressed lncRNAs without previous PCa association, four of them potentially modulating the driver genes TMPRSS2, PRUNE2 and AMACR. Overall, our PCA captures 57% of data complexity within PC1-3. Gene Ontology (GO) enrichment and correlation analysis involving major clinical features (i.e., Gleason Score, AR Score, TMPRSS2-ERG fusion and Tumor Cellularity) suggest that the PC2 and PC3 gene signatures might describe more aggressive and inflammation-prone transitional forms of PRAD. The genes surfaced here may provide novel prognostic biomarkers and molecular alterations amenable to intervention. In particular, our work uncovered RNA genes with appealing implications for PCa biology and progression.
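As an illustration of the general approach summarized above, the following minimal sketch runs a PCA on an expression matrix, reports the fraction of variance captured by the first three components, and ranks genes by their PC1 loadings. It is not the authors' pipeline: the data are synthetic, a generic log2(x + 1) transform stands in for the custom normalization (which is not described here), and the function names `pca_explained_variance` and `top_genes` are hypothetical.

```python
import numpy as np


def pca_explained_variance(expr, n_components=3):
    """PCA via SVD on a samples x genes count matrix.

    Returns (scores, loadings, explained_variance_ratio) for the
    first n_components principal components.
    """
    # Assumption: generic log2(x + 1) transform, NOT the paper's custom normalization.
    logged = np.log2(expr + 1.0)
    # Center each gene so the SVD diagonalizes the covariance structure.
    centered = logged - logged.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components]
    return scores, loadings, ratio[:n_components]


def top_genes(loadings, gene_names, component=0, k=5):
    """Names of the k genes with the largest absolute loading on one component."""
    order = np.argsort(-np.abs(loadings[component]))[:k]
    return [gene_names[i] for i in order]


# Synthetic example: 20 "normal" and 20 "tumor" samples, 50 hypothetical genes.
rng = np.random.default_rng(0)
n_normal, n_tumor, n_genes = 20, 20, 50
gene_names = [f"gene_{i}" for i in range(n_genes)]
expr = rng.poisson(lam=100, size=(n_normal + n_tumor, n_genes)).astype(float)
# Over-express the first five genes in the tumor samples only.
expr[n_normal:, :5] *= 8.0

scores, loadings, ratio = pca_explained_variance(expr)
signature = top_genes(loadings, gene_names, component=0, k=5)
print("Explained variance ratio (PC1-3):", np.round(ratio, 3))
print("Top PC1 genes:", signature)
```

In this toy setting the over-expressed genes dominate PC1, so PC1 scores separate the two sample groups and the top-loading genes recover the planted signature. Real RNA-seq data would require a normalization choice and a loading threshold that this sketch does not attempt to reproduce.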

Keywords

Principal Component Analysis, RNA-seq, prostate cancer, biomarkers, RNA genes

Subject

Biology and Life Sciences, Anatomy and Physiology
