Submitted: 23 April 2026
Posted: 28 April 2026
Abstract
Keywords:
1. Introduction
- We provide a fold-complete subject-wise LOSOCV study of our recently proposed GAF–PLV representation family on the retained executable cohort.
- We report a transparent retained-cohort analysis of the 105 held-out subject folds for which subject-wise rerun outputs were available, and we pair it with bootstrap and inferential statistics so that cross-subject uncertainty is visible rather than implicit.
- We provide a direct protocol-sensitive comparison with our earlier pooled-window proof-of-concept benchmark, thereby showing how the same representation family changes meaning when the validation unit changes from windows to unseen subjects.
- We quantify that protocol-sensitive shift using both the cohort mean and held-out subject anchors rather than treating the drop as a single fixed number.
- We convert a strong proof-of-concept benchmark result into a more rigorous field reference by making the retained cohort, fold-level variability, uncertainty, and scope boundaries explicit.
- We distil the empirical findings into a practical reporting standard for future subject-independent MI-EEG studies and add exploratory classical-baseline context while making cohort mismatch explicit.
2. Related Work
3. Methodology
3.1. Study Design and Rationale
3.2. Dataset and Problem Formulation
3.3. Subject Inclusion, Exclusion, and Reporting Transparency
3.4. Preprocessing and Temporal Partitioning
3.5. Temporal Representation Using GAF and Spearman Integration
3.6. Spatial Representation Using PLV
3.7. Dual-Input Parallel CNN
3.8. Subject-Wise LOSOCV
3.9. Performance Metrics
4. Results and Discussion
4.1. Cohort-Level LOSOCV Performance
4.2. Statistical Analysis of Subject-Wise Performance

| Metric | Mean ± SD | Median | Q1–Q3 | Min–Max | 95% CI | Baseline | t(104) | Wilcoxon p | Cohen’s d |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.5807 ± 0.0827 | 0.5714 | 0.5238–0.6429 | 0.3810–0.7857 | 0.5649–0.5964 | 0.5 | 10.00 | | 0.976 |
| Macro-F1 | 0.5348 ± 0.1119 | 0.5524 | 0.4312–0.6050 | 0.3226–0.7846 | 0.5139–0.5562 | 0.5 | 3.19 | | 0.311 |
| Cohen’s kappa | 0.1615 ± 0.1654 | 0.1429 | 0.0476–0.2857 | –0.5714 | 0.1306–0.1927 | 0 | 10.00 | | 0.976 |
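The t statistics and effect sizes in the table are linked by the one-sample identities $d = (\bar{x} - \mu_0)/s$ and $t = d\sqrt{n}$. As a minimal sketch, assuming the reported means and standard deviations are sample statistics over the n = 105 held-out folds, the following check recomputes both quantities from the rounded table entries. The names `SUMMARY` and `one_sample_effect` are illustrative and are not taken from the study's code.

```python
import math

# Rounded summary values copied from the table; n = 105 held-out subject folds.
N_FOLDS = 105
SUMMARY = {
    "accuracy": {"mean": 0.5807, "sd": 0.0827, "baseline": 0.5},
    "macro_f1": {"mean": 0.5348, "sd": 0.1119, "baseline": 0.5},
    "kappa": {"mean": 0.1615, "sd": 0.1654, "baseline": 0.0},
}


def one_sample_effect(mean, sd, baseline, n):
    """Return (Cohen's d, t statistic) for a one-sample test against `baseline`."""
    d = (mean - baseline) / sd   # standardized distance from the baseline
    t = d * math.sqrt(n)         # equivalent to (mean - baseline) / (sd / sqrt(n))
    return d, t


results = {
    name: one_sample_effect(n=N_FOLDS, **values) for name, values in SUMMARY.items()
}

for name, (d, t) in results.items():
    print(f"{name}: d = {d:.3f}, t({N_FOLDS - 1}) = {t:.2f}")
```

Because the inputs are already rounded to four decimals, the recomputed values should be expected to match the tabulated ones only to within rounding.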
4.3. Subject-to-Subject Variability
4.4. Class-Wise Performance
4.5. Protocol Sensitivity Relative to the Earlier Pooled-Window Proof-of-Concept Study
4.6. Why Subject-Wise Evaluation Reveals the Generalization Gap
4.7. Protocol Trust and the Interpretation of Performance
4.8. Field Significance of the Present Benchmark
4.9. Closest Same-Dataset Subject-Independent Comparisons
4.10. Transparent Cohort Reporting and the Absent Four Subjects
4.11. Computational Cost
5. Study Scope and Next Steps
- state the validation unit explicitly (window, trial, subject, or nested subject-based design);
- report benchmark-oriented discrimination claims and subject-independent transfer claims as separate evidence levels rather than presenting one as proof of the other;
- report the full subject-wise fold profile rather than only a cohort mean;
- disclose cohort inclusion and exclusion rules with exact subject identifiers whenever possible;
- include chance-corrected and/or class-balanced metrics alongside raw accuracy; and
- provide confidence intervals or equivalent uncertainty estimates for the main performance statistics.
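Several of the checklist items above (an explicit validation unit, a full fold profile, and named subject identifiers) can be made concrete with a small sketch. The code below assumes a leave-one-subject-out loop with one separate validation subject per outer fold and trial-level majority voting over window predictions. The rule for picking the validation subject (the next subject in sorted order) and all function names are hypothetical choices made for illustration, not the procedure used in the study.

```python
from collections import Counter
from statistics import mean, median, stdev


def loso_folds(subject_ids):
    """Yield (train_subjects, val_subject, test_subject), one fold per subject.

    Assumption: the validation subject is the next subject in sorted order
    (wrapping around). The actual selection rule is not given in the text.
    """
    subjects = sorted(subject_ids)
    if len(subjects) < 3:
        raise ValueError("need at least three subjects for train/val/test")
    for i, test_subject in enumerate(subjects):
        val_subject = subjects[(i + 1) % len(subjects)]
        train_subjects = [
            s for s in subjects if s not in (test_subject, val_subject)
        ]
        yield train_subjects, val_subject, test_subject


def majority_vote(window_labels):
    """Collapse the window-level labels of one trial into one trial label.

    Ties resolve to the label that appears first in `window_labels`.
    """
    return Counter(window_labels).most_common(1)[0][0]


def fold_profile(fold_scores):
    """Summarize per-subject scores while keeping the subject identifiers."""
    ordered = sorted(fold_scores.values())
    return {
        "subjects": sorted(fold_scores),
        "n_folds": len(ordered),
        "mean": mean(ordered),
        "sd": stdev(ordered),
        "median": median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }


# Hypothetical toy scores for five subjects; these are not results from the study.
toy_scores = {1: 0.50, 2: 0.60, 3: 0.70, 4: 0.55, 5: 0.65}
folds = list(loso_folds(toy_scores))
profile = fold_profile(toy_scores)

for train_subjects, val_subject, test_subject in folds:
    print(f"test={test_subject} val={val_subject} train={train_subjects}")
print(profile)
```

Keeping the subject identifiers next to the summary statistics makes it straightforward to publish the per-fold table together with the cohort mean.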
6. Conclusion
Author Contributions
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Padfield, N.; Zabalza, J.; Zhao, H.; Masero, V.; Ren, J. EEG-Based Brain-Computer Interfaces Using Motor-Imagery: Techniques and Challenges. Sensors 2019, vol. 19, no. 6, 1423.
- Wang, X.; et al. An in-depth survey on deep learning-based motor imagery electroencephalogram classification. Artif. Intell. Med. 2024, vol. 150, Art. no. 102738.
- Schirrmeister, R. T.; et al. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 2017, vol. 38, no. 11, 5391–5420.
- Dose, H.; Møller, J. S.; Iversen, H. K.; Puthusserypady, S. An end-to-end deep learning approach to MI-EEG signal classification for BCIs. Expert Syst. With Appl. 2018, vol. 114, 532–542.
- Huang, W.; Chang, W.; Yan, G.; Zhang, Y.; Yuan, Y. Spatio-spectral feature classification combining 3D-convolutional neural networks with long short-term memory for motor movement/imagery. Eng. Appl. Artif. Intell. 2023, vol. 120.
- Hou, Y.; et al. GCNs-Net: A Graph Convolutional Neural Network Approach for Decoding Time-Resolved EEG Motor Imagery Signals. IEEE Trans. Neural Netw. Learn. Syst. 2024, vol. 35, no. 6, 7312–7323.
- Lv, R.; Chang, W.; Yan, G.; Sadiq, M. T.; Nie, W.; Zheng, L. Enhanced classification of motor imagery EEG signals using spatio-temporal representations. Information Sciences 2025.
- Del Pup, F.; Zanola, A.; Tshimanga, L. F.; Bertoldo, A.; Finos, L.; Atzori, M. The role of data partitioning on the performance of EEG-based deep learning models in supervised cross-subject analysis: A preliminary study. Comput. Biol. Med. 2025, vol. 196, Art. no. 110608.
- Goldberger, A. L.; et al. PhysioBank, PhysioToolkit, and PhysioNet. Circulation 2000, vol. 101, no. 23.
- Lomelin-Ibarra, V. A.; Gutierrez-Rodriguez, A. E.; Cantoral-Ceballos, J. A. Motor Imagery Analysis from Extensive EEG Data Representations Using Convolutional Neural Networks. Sensors 2022, vol. 22, no. 16, Art. no. 6093.
- Roots, K.; Muhammad, Y.; Muhammad, N. Fusion Convolutional Neural Network for Cross-Subject EEG Motor Imagery Classification. Computers 2020, vol. 9, no. 3, Art. no. 72.
- Chowdhury, R. R.; Muhammad, Y.; Adeel, U. Enhancing Cross-Subject Motor Imagery Classification in EEG-Based Brain-Computer Interfaces by Using Multi-Branch CNN. Sensors 2023, vol. 23, no. 18, Art. no. 7908.
- Lun, X.; Yu, Z.; Chen, T.; Wang, F.; Hou, Y. A Simplified CNN Classification Method for MI-EEG via the Electrode Pairs Signals. Front. Hum. Neurosci. 2020, vol. 14, Art. no. 338.
- Li, D.; Ortega, P.; Wei, X.; Faisal, A. Model-Agnostic Meta-Learning for EEG Motor Imagery Decoding in Brain-Computer-Interfacing. Proc. 10th Int. IEEE/EMBS Conf. Neural Engineering (NER), 2021; pp. 527–530.
- Majoros, T.; Oniga, S. Overview of the EEG-Based Classification of Motor Imagery Activities Using Machine Learning Methods on the PhysioNet Four-Class Motor Imagery Dataset. Electronics 2022, vol. 11, no. 15, Art. no. 2293.
- Aung, H. W.; Li, J. J.; An, Y.; Su, S. W. EEG_GLT-Net: Optimising EEG graphs for real-time motor imagery signals classification. Biomed. Signal Process. Control 2025, vol. 104, Art. no. 107458.
- Huang, W.; Yan, G.; Chang, W.; Zhang, Y.; Yuan, Y. EEG-based classification combining Bayesian convolutional neural networks with recurrence plot for motor movement/imagery. Pattern Recognit. 2023, vol. 144, Art. no. 109838.
- Ghimire, A.; Sekeroglu, K. Classification of EEG Motor Imagery Tasks Utilizing 2D Temporal Patterns with Deep Learning. Proc. 2nd Int. Conf. Image Processing and Vision Engineering (IMPROVE), 2022; pp. 182–188.
- Huang, W.; Chang, W.; Yan, G.; Yang, Z.; Luo, H.; Pei, H. EEG-based motor imagery classification using convolutional neural networks with local reparameterization trick. Expert Syst. With Appl. 2022, vol. 187, Art. no. 115968.
- Wang, X.; Hersche, M.; Tömekce, B.; Kaya, B.; Magno, M.; Benini, L. An Accurate EEGNet-based Motor-Imagery Brain–Computer Interface for Low-Power Edge Computing. Proc. IEEE Int. Symp. Medical Measurements and Applications (MeMeA), 2020.
- Hwaidi, J. F.; Chen, T. M. Classification of Motor Imagery EEG Signals Based on Deep Autoencoder and Convolutional Neural Network Approach. IEEE Access 2022, vol. 10, 48071–48081.
- Fan, C.; Yang, B.; Li, X.; Zan, P. Temporal-frequency-phase feature classification using 3D-convolutional neural networks for motor imagery and movement. Front. Neurosci. 2023, vol. 17, Art. no. 1250991.
- Wang, Z.; Oates, T. Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
- Altamirano, J. Motor Imagery EEG Classification using Common Spatial Patterns and Machine Learning: A Cross-Subject Study; preprint, Mar 2026.
- Sartipi, M.; Yaghoubi, M. E.; Nasrabadi, A. M. A subject-independent semi-supervised deep architecture for motor imagery classification from EEG signals. arXiv 2024, arXiv:2402.09438.
- Perez-Velasco, A.; Santamaria-Vazquez, E.; Martinez-Cagigal, V. EEGSym: Overcoming inter-subject variability in motor imagery based BCIs with deep learning. J. Neural Eng. 2022, vol. 19, no. 5, Art. no. 056018.
- Gomez-Rivera, A.; Collazos-Huertas, D. F. Gaussian Connectivity-Driven EEG Imaging for Deep Learning-Based Motor Imagery Classification. Sensors 2025, vol. 26, no. 1, Art. no. 227.
- Tibermacine, A.; Naidji, I.; Tibermacine, I. E.; Mamen, L.; Rabehi, A.; Habib, M. EEG-TriNet++: A Transformer-Guided Meta-Learning Framework for Robust and Generalizable Motor Imagery Classification. Bioengineering 2026, vol. 13, no. 3, Art. no. 307.
- Lian, X.; Liu, C.; Gao, C. A Multi-Branch Network for Integrating Spatial, Spectral, and Temporal Features in Motor Imagery EEG Classification. Brain Sci. 2025, vol. 15, no. 8, Art. no. 877.

| Paper | Year | Subjects | Protocol | Main interpretive note | Inference scope |
|---|---|---|---|---|---|
| Enhanced classification of MI EEG signals using spatio-temporal representations (our recent proof-of-concept study) [7] | 2025 | 10 and 30 | Windows generated before split; 9:1 sample-level split; five-fold sample-level CV | Strong benchmark result under segment-wise evaluation; not designed as a strict unseen-subject study | Benchmark only |
| Lomelin-Ibarra et al. [10] | 2022 | 105 | 80% of generated samples for training, 10% for validation, 10% for test over image-like representations | Image-level pooled-sample split | Limited |
| Roots et al. [11] | 2020 | 103 | 70% of pooled samples for training, 10% for validation, 20% for testing | Pooled-sample split despite cross-subject framing | Limited |
| Chowdhury et al. [12] | 2023 | 103 | 70% random data for training, 10% for validation, 20% for testing | Random pooled splitting rather than strict unseen-subject evaluation | Limited |
| Huang et al. [17] | 2023 | single-subject experiments | Recurrence-plot image classification on one subject at a time | Useful for within-subject analysis, but not for broad cross-subject claims | Limited |
| Ghimire and Sekeroglu [18] | 2022 | 109 | Subject-block style validation using subsets of subjects for validation under a global setting | More structured than pooled random splitting, but not a clean LOSO or nested unseen-subject design | Partial |
| Huang et al. [19] | 2022 | 109 | Global classifier evaluated by grouped subject partitions; individual variability also examined | Cleaner than sample-level splits, but accessible protocol remains insufficiently explicit for a full LOSO interpretation | Partial |

| Study | Year | Subjects | Why fewer than 109 were used | Interpretation |
|---|---|---|---|---|
| Dose et al. [4] | 2018 | 105 | Subset of 105 subjects retained due to missing trials in the public dataset | Criterion-based exclusion |
| Lun et al. [13] | 2020 | 10, 20, 60, 100 | Deliberate subgroup benchmarking on the PhysioNet dataset | Reduced-cohort benchmarking |
| Roots et al. [11] | 2020 | 103 | Six subjects omitted because of incorrectly annotated data | Criterion-based exclusion |
| Wang et al. [20] | 2020 | 105 | Four subjects discarded because of variability in number of trials | Criterion-based exclusion |
| Li et al. [14] | 2021 | 48 of 104 | Quality-based outlier filtering before experimentation | Quality-filtered subset |
| Majoros and Oniga [15] | 2022 | 10 and 20 | Deliberate reduced-cohort benchmarking on the PhysioNet four-class dataset | Reduced-cohort benchmarking |
| Hwaidi and Chen [21] | 2022 | 10 | Deliberate 10-subject PhysioNet experiment | Reduced-cohort benchmarking |
| Fan et al. [22] | 2023 | 20 | Selected 20 PhysioNet subjects for evaluation/validation | Reduced-cohort benchmarking |
| Aung et al. [16] | 2025 | 20 | Empirical study conducted on 20 PhysioNet subjects | Reduced-cohort benchmarking |

| Setting | Subjects | Task | Partition unit | Validation logic | Reported result | What the result supports |
|---|---|---|---|---|---|---|
| Earlier pooled-window proof-of-concept benchmark | 10 | 2-class | 0.4 s windows generated before split | 9:1 sample-level split; five-fold sample-level CV | 99.73% accuracy | Strong pooled-window discrimination on a reduced cohort; established proof-of-concept representational promise rather than an unseen-subject claim. |
| Earlier pooled-window proof-of-concept benchmark | 30 | 2-class | 0.4 s windows generated before split | 9:1 sample-level split; five-fold sample-level CV | 99.18% accuracy | Same proof-of-concept message under a larger reduced cohort; still not a strict unseen-subject evaluation. |
| This work | 105 | 2-class | Whole subject held out at test time; trial-level majority voting | LOSOCV with a separate validation subject in each outer fold | 58.07% ± 8.27% mean accuracy | Unseen-subject performance estimate with full retained-fold disclosure and explicit inter-subject variability. |

| Paper | Year | Subjects | Validation style | Reported result | Fold-level profile shown | Fairness note |
|---|---|---|---|---|---|---|
| This work | 2026 | 105 | LOSOCV, one held-out subject per fold | 58.07% ± 8.27% (min 38.10%, median 57.14%, max 78.57%) | Yes | Exact retained executable cohort (subjects 1–105) is named, all 105 held-out folds are reported, and cohort-level inference is paired with explicit fold-level variability plots. This is the highest-transparency comparator in the table. |
| Altamirano CSP+SVM preprint [24] | 2026 | 5 | LOSO on a small subject subset | 51.11% ± 10.33% | Partial (5 folds) | Useful classical-baseline context, but only five subjects are analysed and the source is a preprint; the comparison is therefore exploratory rather than definitive. |
| SSDA [25] | 2024 | 105 | LOSO | 78% ± 3% | No visible full table | Closest protocol match among the checked papers, but the accessible main results emphasise aggregate mean ± standard deviation rather than the complete held-out subject profile. |
| EEGSym [26] | 2022 | cohort | LOSO/inter-subject with pretraining and fine-tuning | 88.6% ± 9.0% | No visible full table | Strong result, but not directly like-for-like because the pipeline uses a stronger transfer-learning setup and the accessible reporting foregrounds cohort-level inter-subject summaries. |
| GCNs-Net [6] | 2024 | 109 | Group-level cross-subject evaluation | 88.57% | No visible full table | High reported PhysioNet performance in a strong IEEE venue, but the accessible description is group-level rather than a strict retained-cohort LOSOCV rerun matched to the present protocol. |
| EEGNet global validation [20] | 2020 | 105 | Subject-based 5-fold global validation | 82.43% | No visible full table | Useful subject-based context, but not a strict LOSO study; protocol match is therefore partial rather than full. |
| Gaussian connectivity-driven EEG imaging [27] | 2025 | 109 | Same-dataset deep-learning benchmark; exact LOSO match not explicit in accessible summary | Same-dataset result reported in article | No visible full table | Relevant recent PhysioNet comparator, but task formulation and accessible protocol details are not sufficiently aligned with the present binary retained-cohort LOSOCV study for a like-for-like accuracy claim. |
| EEG-TriNet++ [28] | 2026 | PhysioNet cohort | LOSO | 70.8% ± 0.9% | No visible full table | Strong recent same-dataset LOSO comparator with repeated-run summaries and macro-F1 reporting, but it still does not expose the complete held-out fold-by-fold subject profile the way the present paper does. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).