Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

ordinalbayes: Fitting Ordinal Bayesian Regression Models to High-Dimensional Data Using R

Version 1: Received: 11 March 2022 / Approved: 14 March 2022 / Online: 14 March 2022 (10:04:52 CET)

A peer-reviewed article of this Preprint also exists.

Archer, K.J.; Seffernick, A.E.; Sun, S.; Zhang, Y. ordinalbayes: Fitting Ordinal Bayesian Regression Models to High-Dimensional Data Using R. Stats 2022, 5, 371-384.

Abstract

Stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer, in order to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection in over-parameterized datasets such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which is capable of fitting cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size (P > N), as is the case for TCGA data. We demonstrate use of this package through application to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets and effectively performs variable selection.

Keywords

cumulative logit; penalized models; LASSO; variable inclusion indicators; spike-and-slab

Subject

Computer Science and Mathematics, Probability and Statistics
