Preprint Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Stable Variable Selection Method with Shrinkage Regression Applied to the Selection of Genetic Variants Associated with Alzheimer’s Disease

Version 1: Received: 7 March 2024 / Approved: 7 March 2024 / Online: 8 March 2024 (04:29:51 CET)

A peer-reviewed article of this Preprint also exists.

Afreixo, V.; Tavares, A.H.; Enes, V.; Pinheiro, M.; Rodrigues, L.; Moura, G. Stable Variable Selection Method with Shrinkage Regression Applied to the Selection of Genetic Variants Associated with Alzheimer’s Disease. Appl. Sci. 2024, 14, 2572.

Abstract

In this work, we seek a stable and accurate procedure for feature selection in datasets with far more predictors than individuals, as in Genome-Wide Association Studies. Because feature selection is unstable when many potential predictors are measured, we propose a variable selection procedure that combines several replications of shrinkage regression models. A weighted formulation is used to define the final predictors. The procedure is applied to investigate Single Nucleotide Polymorphism (SNP) predictors associated with Alzheimer’s disease (AD) in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. Two data scenarios are investigated: one considering only the set of SNPs and another with the covariates age, sex, educational level and APOE4 genotype. The SNP rs2075650 and the APOE4 genotype are identified as risk factors for AD, in line with the literature, and four other new SNPs are highlighted, opening new hypotheses for in vivo analyses. These experiments demonstrate the potential of the new method for stable feature selection.

Keywords

penalized regression; Akaike’s Information Criterion; high-dimensional data; stability; overall weighted coefficients; Alzheimer’s disease; SNP

Subject

Computer Science and Mathematics, Applied Mathematics
