ARTICLE | doi:10.20944/preprints202009.0699.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: SVM; MRMR; Bootstrap; Genes; Gene Expression; Biological Relevance; Subject Classification
Online: 29 September 2020 (09:09:52 CEST)
Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most available gene selection methods are based on either relevance or redundancy measures, and are usually judged by post-selection classification accuracy. These methods rank genes on a single high-dimensional expression dataset, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines Support Vector Machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes are selected through statistical significance values computed using a non-parametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e. subject classification, biologically relevant criteria based on quantitative trait loci, and gene ontology. Our analytical results showed that the proposed approach selects genes that are more biologically relevant than those selected by the existing methods. Moreover, the proposed approach was also found to perform better than the competing existing methods. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.
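The bootstrap-based subject sampling idea can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the test statistic or how the SVM weights enter the score, so this example substitutes a plain correlation-based relevance-minus-redundancy score and reports a bootstrap selection frequency in place of a significance value. The function names, the toy data, and the parameters `n_boot` and `top_k` are all assumptions made for illustration.

```python
import random
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)

def mrmr_scores(columns, labels):
    """Per-gene score: |corr(gene, class)| minus mean |corr(gene, other genes)|."""
    p = len(columns)
    scores = []
    for j in range(p):
        relevance = abs(pearson(columns[j], labels))
        redundancy = sum(abs(pearson(columns[j], columns[k]))
                         for k in range(p) if k != j) / (p - 1)
        scores.append(relevance - redundancy)
    return scores

def bootstrap_selection_frequency(rows, labels, n_boot=100, top_k=3, seed=0):
    """Resample subjects with replacement; count how often each gene ranks in the top_k."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    counts = [0] * p
    used = 0
    while used < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [labels[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # a resample needs both classes to define relevance
        cols = [[rows[i][j] for i in idx] for j in range(p)]
        scores = mrmr_scores(cols, yb)
        top = sorted(range(p), key=lambda j: scores[j], reverse=True)[:top_k]
        for j in top:
            counts[j] += 1
        used += 1
    return [c / n_boot for c in counts]

# Hypothetical toy data: 30 subjects, 12 genes, genes 0 and 1 differ between classes.
data_rng = random.Random(1)
labels = [0] * 15 + [1] * 15
rows = []
for y in labels:
    row = [data_rng.gauss(0, 1) for _ in range(12)]
    row[0] += 3 * y
    row[1] -= 3 * y
    rows.append(row)

freq = bootstrap_selection_frequency(rows, labels)
```

Genes whose selection frequency stays high across resamples are less likely to owe their rank to one particular sample of subjects, which is the motivation the abstract gives for not ranking on a single dataset.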
ARTICLE | doi:10.20944/preprints202111.0345.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: brain-computer interface (BCI); electroencephalography (EEG); stress state recognition; feature selection; particle swarm optimization (PSO); mRMR; SVM; DEAP; SEED
Online: 19 November 2021 (11:01:19 CET)
Mental stress state recognition using electroencephalogram (EEG) signals for real-life applications needs a conventional wearable device. This requires an efficient number of EEG channels and an optimal feature set. The main objective of the study is to identify an optimal feature subset that best discriminates mental stress states while enhancing overall performance. Thus, multi-domain feature extraction methods were employed, namely time-domain, frequency-domain, time-frequency-domain, and network connectivity features, to form a large feature vector space. To avoid the computational complexity of a high-dimensional space, a hybrid feature selection (FS) method combining minimum Redundancy Maximum Relevance with Particle Swarm Optimization and Support Vector Machine (mRMR-PSO-SVM) is proposed to remove noisy, redundant, and irrelevant features and keep the optimal feature subset. The performance of the proposed method is evaluated and verified using four datasets, namely EDMSS, DEAP, SEED, and EDPMSC. To further consolidate the results, the effectiveness of the proposed method is compared with that of state-of-the-art heuristic methods. The proposed model significantly reduced the feature vector space, by an average of 70% in comparison to the state-of-the-art methods, while significantly increasing overall detection performance.
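The wrapper stage of such a hybrid can be sketched as a binary particle swarm search over feature masks, with a linear SVM supplying the fitness. This is a minimal illustration under stated assumptions, not the paper's implementation: the columns of `X` are assumed to have already passed an mRMR filter, the SVM is a small linear model trained by stochastic subgradient descent on the hinge loss, fitness is training accuracy minus a per-feature penalty (cross-validated accuracy would be the usual choice in practice), and all function names, swarm constants, and the toy data are hypothetical.

```python
import math
import random

def train_linear_svm(X, y, epochs=20, eta=0.05, lam=0.01, seed=0):
    """Linear SVM via stochastic subgradient descent on the hinge loss; y in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]  # L2 shrinkage step
            if margin < 1.0:                          # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def fitness(mask, X, y, penalty=0.01):
    """Training accuracy of the SVM on the masked features, minus a size penalty."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    Xs = [[row[j] for j in cols] for row in X]
    w, b = train_linear_svm(Xs, y)
    hits = sum(1 for xi, yi in zip(Xs, y)
               if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0)
    return hits / len(y) - penalty * len(cols)

def binary_pso(X, y, n_particles=8, n_iter=10, seed=0):
    """Binary PSO: velocities are real-valued, positions are resampled through a sigmoid."""
    rng = random.Random(seed)
    p = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(p)] for _ in range(n_particles)]
    pos[0] = [True] * p  # seed the swarm with the full candidate set
    vel = [[0.0] * p for _ in range(n_particles)]
    pbest = [list(m) for m in pos]
    pbest_fit = [fitness(m, X, y) for m in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(p):
                r1, r2 = rng.random(), rng.random()
                v = (0.7 * vel[i][j]
                     + 1.5 * r1 * (pbest[i][j] - pos[i][j])
                     + 1.5 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid unsaturated
                pos[i][j] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][j]))
            f = fitness(pos[i], X, y)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(pos[i]), f
            if f > gbest_fit:
                gbest, gbest_fit = list(pos[i]), f
    return gbest, gbest_fit

# Hypothetical toy data: 40 trials, 8 candidate features, features 0 and 1 informative.
data_rng = random.Random(1)
y = [-1] * 20 + [1] * 20
X = []
for yi in y:
    row = [data_rng.gauss(0, 1) for _ in range(8)]
    row[0] += 2 * yi
    row[1] += 2 * yi
    X.append(row)

mask, score = binary_pso(X, y)
```

The size penalty is what pushes the swarm toward smaller subsets; without it, training accuracy alone would tend to favor keeping every feature.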