Submitted: 12 May 2026
Posted: 13 May 2026
Abstract
Keywords:
1. Introduction
1. A unified multi-ROI multimodal transformer architecture for Alzheimer’s disease classification. We propose a multiple-input deep learning framework that jointly models heterogeneous data sources, including 3D MRI-derived regions of interest (ROIs), clinical metadata, and volumetric biomarkers. The architecture enables end-to-end learning across modalities, capturing complementary information relevant to Alzheimer’s disease progression.
2. Multi-ROI tokenization of volumetric MRI using 3D tubelet embeddings. Instead of processing whole-brain volumes, the proposed model decomposes MRI data into multiple anatomically relevant ROIs (e.g., hippocampus, entorhinal cortex, fornix, and cortical lobes). Each ROI is independently encoded using 3D tubelet embeddings, allowing fine-grained spatial representation learning while reducing irrelevant background information.
3. Feature-wise tokenization of clinical and volumetric data for transformer-based fusion. We introduce a feature tokenization strategy that transforms tabular clinical variables and volumetric biomarkers into learnable token representations. This design enables seamless integration of structured data into the transformer architecture, facilitating cross-modal attention between imaging and non-imaging features.
4. Modality-aware embedding for explicit cross-modal representation learning. The model incorporates learnable modality embeddings to distinguish between ROI-specific imaging tokens and non-imaging features. This mechanism enhances the model’s ability to learn modality-specific and cross-modal interactions within a unified attention framework.
5. Attention-based multimodal fusion with dual representation learning. A hybrid representation is obtained by combining a global [CLS] token with learnable attention pooling over all tokens. This dual aggregation strategy improves information integration across modalities and enhances classification robustness.
6. Interpretable attention mechanisms for clinical insight extraction. The architecture provides access to attention maps across transformer layers and modalities, enabling analysis of region relevance, feature importance, and cross-modal interactions. This contributes to model interpretability and supports clinically meaningful insights into Alzheimer’s disease biomarkers.
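The tokenization ideas in contributions 2 to 4 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the tubelet size, the ROI shapes, the per-ROI linear projections, the `x_j * w_j + b_j` form of the feature tokenizer, the granularity of the modality embeddings (one per ROI plus one for tabular features), and all function names are assumptions made for illustration. Random arrays stand in for parameters that would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 128  # projection dimension listed in the hyperparameter table


def tubelet_tokens(roi, tubelet, proj):
    """Cut a 3D ROI (D, H, W) into non-overlapping tubelets and project each one."""
    td, th, tw = tubelet
    d, h, w = roi.shape
    # Crop trailing voxels so every axis is a multiple of the tubelet size.
    roi = roi[: d - d % td, : h - h % th, : w - w % tw]
    nd, nh, nw = roi.shape[0] // td, roi.shape[1] // th, roi.shape[2] // tw
    # Bring the three grid axes to the front, then flatten each tubelet.
    blocks = roi.reshape(nd, td, nh, th, nw, tw).transpose(0, 2, 4, 1, 3, 5)
    flat = blocks.reshape(nd * nh * nw, td * th * tw)
    return flat @ proj  # (num_tubelets, EMBED_DIM)


def feature_tokens(values, weights, biases):
    """One token per scalar feature: token_j = x_j * w_j + b_j."""
    return values[:, None] * weights + biases


def build_sequence(rois, tabular, tubelet=(4, 4, 4)):
    """Assemble [CLS] + ROI tokens + tabular tokens, adding modality embeddings."""
    voxels = tubelet[0] * tubelet[1] * tubelet[2]
    # One embedding per ROI plus one for tabular features (assumed granularity).
    modality = rng.normal(scale=0.02, size=(len(rois) + 1, EMBED_DIM))
    parts = [np.zeros((1, EMBED_DIM))]  # [CLS] token, zero-initialised here
    for i, roi in enumerate(rois.values()):
        proj = rng.normal(scale=0.02, size=(voxels, EMBED_DIM))  # per-ROI projection
        parts.append(tubelet_tokens(roi, tubelet, proj) + modality[i])
    w = rng.normal(scale=0.02, size=(tabular.size, EMBED_DIM))
    b = np.zeros((tabular.size, EMBED_DIM))
    parts.append(feature_tokens(tabular, w, b) + modality[-1])
    return np.concatenate(parts, axis=0)


# Hypothetical ROI crops and scaled tabular features (age, sex, MMSE, one volume).
rois = {"hippocampus_left": rng.random((8, 16, 16)), "fornix_left": rng.random((8, 8, 8))}
tabular = np.array([0.7, 1.0, 0.9, 0.3])
sequence = build_sequence(rois, tabular)
print(sequence.shape)  # (45, 128): 1 [CLS] + 32 + 8 ROI tokens + 4 feature tokens
```

The resulting sequence is what a standard transformer encoder would consume, so imaging and non-imaging tokens can attend to one another in the same self-attention layers.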
2. Related Work
3. Materials and Methods
3.1. Datasets
Multimodal Data Representation
- 3D MRI (ROI-based): Structural T1-weighted brain volumes are processed to extract anatomically relevant regions of interest (ROIs), including the hippocampus, entorhinal cortex, fornix, and major cortical lobes. These regions are strongly associated with AD-related neurodegeneration.
- Clinical and Demographic Data: Subject-level attributes such as age and sex, together with cognitive scores (e.g., MMSE), are included to capture inter-subject variability.
- Volumetric Biomarkers: Quantitative volumetric measures derived from neuroanatomical structures (e.g., hippocampus, amygdala, ventricles) are incorporated as structured features.
Diagnostic Labels
Cohort Construction
Data Harmonization
3.2. Proposed Methodology
3.2.1. Dataset Preparation
Volume Dataset Building:
Data Preprocessing
- Numerical Data: Continuous variables were normalized using min–max scaling to a fixed range, ensuring comparable feature magnitudes and stable optimization during training.
- Categorical Data: Categorical variables were encoded using one-hot encoding, producing binary vectors with values in {0, 1} and avoiding the introduction of ordinal relationships.
- Image Intensity Scaling: MRI voxel intensities were normalized to a fixed range to improve numerical stability and convergence of deep learning models.
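The scaling and encoding steps listed above can be sketched in a few lines of plain Python. The target bounds of 0 and 1 are an assumption (the usual default for min–max scaling), and the function names are hypothetical.

```python
def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant feature carries no information; map it to the lower bound.
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def one_hot(labels):
    """Encode categorical labels as binary vectors, one position per category."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    vectors = [[1 if index[label] == j else 0 for j in range(len(categories))]
               for label in labels]
    return categories, vectors


ages = [60, 70, 90]  # hypothetical ages
print(min_max_scale(ages))
print(one_hot(["F", "M", "F"]))  # (['F', 'M'], [[1, 0], [0, 1], [1, 0]])
```

To avoid leaking test information, the minimum and maximum would normally be estimated on the training split only and then reused for validation and test data.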
MRI Image Preprocessing
- Skull Stripping: Raw MRI volumes were processed to remove non-brain tissues—including skin, fat, muscle, neck, and ocular structures—thereby isolating the intracranial region of interest.
- Tissue Segmentation and Surface Reconstruction: The brain was segmented into major tissue classes, including gray matter (GM), white matter (WM), cerebrospinal fluid (CSF), and background. Subsequently, the white matter and pial surfaces were reconstructed, enabling accurate modeling of cortical boundaries [26].
- Spatial Normalization (Registration): Skull-stripped volumes were nonlinearly registered to the MNI152 T1-weighted template, ensuring uniformity in anatomical orientation, shape, and alignment. The resulting volumes were resampled to a standardized voxel resolution and fixed volume dimensions.
- Volumetric Feature Extraction: Region-of-interest (ROI) volumetric measures were computed, focusing on structures strongly associated with Alzheimer’s disease. These include the left and right hippocampus, amygdala, and lateral ventricles (including inferior lateral ventricles), which are often combined into bilateral measures to improve robustness and discriminative power [27,57].
3.2.2. Instance Dataset Building
Algorithm 1. ROI-Based Instance Selection with Centroid Refinement
3.2.3. Data Generation
3.2.4. Proposed Multimodal Transformer Architecture
Multi-ROI MRI Encoding
Tabular Feature Encoding
Multimodal Fusion via Self-Attention
Hybrid Representation Learning
Classification Head
3.2.5. Mathematical Formulation of the Multimodal Transformer
ROI Tokenization
Tabular Tokenization
Multimodal Token Fusion
Transformer Encoding
Feature Aggregation
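The contributions describe the final representation as a global [CLS] token combined with learnable attention pooling over all tokens. A minimal sketch of one way to do this follows. It assumes a single learnable query vector for the pooling, scaled dot-product scores, pooling over every token including [CLS], and concatenation as the combination rule; none of these details is confirmed by the text, and the names are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def hybrid_representation(tokens, pool_query):
    """Combine the [CLS] output with an attention-pooled summary of all tokens.

    tokens:     (N, D) encoder outputs, with the [CLS] token at index 0
    pool_query: (D,) query vector (learned in a real model)
    """
    cls_vec = tokens[0]
    scores = tokens @ pool_query / np.sqrt(tokens.shape[1])
    weights = softmax(scores)   # one attention weight per token, sums to 1
    pooled = weights @ tokens   # weighted average of token vectors, shape (D,)
    return np.concatenate([cls_vec, pooled]), weights


rng = np.random.default_rng(0)
encoded = rng.normal(size=(45, 128))         # stand-in for transformer outputs
query = rng.normal(scale=0.02, size=128)     # stand-in for the learned query
representation, weights = hybrid_representation(encoded, query)
print(representation.shape)  # (256,)
```

The pooling weights are also a natural quantity to inspect for interpretability, since they indicate how much each ROI or tabular token contributes to the pooled summary.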
Classification
Training and Optimization:
Model performance evaluation:
4. Experimental Setup
4.1. Datasets
4.2. Experimental Design
ROI-Specific Hemispheric Analysis for Multi-ROI Representation.
- Clinical metadata: demographic and cognitive variables such as age, sex, and MMSE.
- Volumetric biomarkers: structural measurements derived from MRI, including gray matter, white matter, cerebrospinal fluid (CSF), hippocampus, amygdala, ventricles, entorhinal cortex, and whole-brain volume.
Ablation Study Design: Multimodal Integration and ROI-Based Representation.
- MRI Only (ROI-based): uses only MRI inputs extracted from anatomically defined regions of interest.
- Tabular Only: uses only non-imaging features, including clinical metadata (e.g., age, sex, MMSE) and volumetric biomarkers derived from structural MRI.
- Whole-brain (w/o ROI): uses full MRI volumes without ROI decomposition, providing a baseline to evaluate the impact of anatomically constrained representations.
- Multi-ROI + Tabular (Proposed): integrates ROI-based MRI inputs with clinical and volumetric features within a multimodal transformer architecture.
Attention-Based Feature Importance: Clinical and Volumetric Contributions.
ROI Attention Analysis and Interpretability.
Multimodal 3D Vision Transformer vs. State-of-the-Art Methods.
4.3. Hyperparameter Optimization
4.4. Statistical Analysis
4.5. Implementation Details
5. Results
5.1. Analysis of ROI-Specific Hemispheric Contributions for Multi-ROI Alzheimer’s Disease Classification
5.2. Ablation Study: Contribution of Multimodal Integration and ROI-Based Representation
5.3. Attention-Based Feature Importance: Clinical and Volumetric Contributions
5.4. ROI Attention Analysis and Interpretability
5.5. Multimodal 3D Vision Transformer vs. State-of-the-Art Methods
6. Limitations
7. Discussion
8. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. World Health Organization. Dementia, 2025. Accessed: 2025-05-03.
2. National Institute on Aging. National Institute on Aging, 2025. Accessed: 2025-05-03.
3. Alzheimer’s Society. Alzheimer’s Society, 2025. Accessed: 2025-05-03.
4. Yu, J.; Lee, T.M. Verbal memory and hippocampal volume predict subsequent fornix microstructure in those at risk for Alzheimer’s disease. Brain Imaging Behav. 2020, 14, 2311–2322.
5. Huang, Y.; Li, W. Resizer Swin Transformer-Based Classification Using sMRI for Alzheimer’s Disease. Appl. Sci. 2023, 13.
6. Das, R.; Kalita, S. Classification of Alzheimer’s Disease Stages Through Volumetric Analysis of MRI Data. In Proceedings of the 2022 IEEE Calcutta Conference (CALCON); IEEE, 2022; pp. 165–169.
7. Khan, T.K. Chapter 3 - Neuroimaging Biomarkers in Alzheimer’s Disease. In Biomarkers in Alzheimer’s Disease; Khan, T.K., Ed.; Academic Press, 2016; pp. 51–100.
8. Tripathi, S.M.; Chutia, P.; Murray, A.D. Neuroimaging Biomarkers in Alzheimer’s Disease. J. Dement. Alzheimer’s Dis. 2025, 2, 1–20.
9. Alzheimer’s Association. Alzheimer’s Association. 2023. Available online: https://www.alz.org/alzheimers-dementia/diagnosis/medical_tests (accessed on 2023-12-10).
10. Alzheimer’s Disease Neuroimaging Initiative (ADNI). Available online: http://adni.loni.usc.edu.
11. Australian Imaging, Biomarker and Lifestyle (AIBL) Flagship Study of Ageing. Available online: https://aibl.csiro.au.
12. Open Access Series of Imaging Studies (OASIS). Available online: http://www.oasis-brains.org.
13. Xu, Y. Patch-wise Intensity Mapping for Individualized Brain Abnormality Detection in Alzheimer’s Disease Distributional Representation Normative Modeling Statistical Inference. 2025 IEEE/CVF Int. Conf. Comput. Vis. Work. (ICCVW) 2025, 6824–6833.
14. Wen, J.; Thibeau-Sutre, E.; Diaz-Melo, M.; Samper-González, J.; Routier, A.; Bottani, S.; Dormont, D.; Durrleman, S.; Burgos, N.; Colliot, O. Overview of classification of Alzheimer’s disease. Med. Image Anal. 2020, 63, 1904.07773.
15. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. 2017.
16. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR 2021 - 9th International Conference on Learning Representations, 2021.
17. Zhu, D.; Wang, D. Transformers and their application to medical image processing: A review. J. Radiat. Res. Appl. Sci. 2023, 100680.
18. Wang, Y.; Luo, Y.; Zu, C.; Zhan, B.; Jiao, Z.; Wu, X.; Zhou, J.; Shen, D.; Zhou, L. 3D multi-modality Transformer-GAN for high-quality PET reconstruction. Med. Image Anal. 2024, 91, 102983.
19. Liu, L.; Liu, S.; Zhang, L.; To, X.V.; Nasrallah, F.; Chandra, S.S. Cascaded Multi-Modal Mixing Transformers for Alzheimer’s Disease Classification with Incomplete Data. NeuroImage 2023, 277, 2210.00255.
20. Xin, J.; Wang, A.; Guo, R.; Liu, W.; Tang, X. CNN and swin-transformer based efficient model for Alzheimer’s disease diagnosis with sMRI. Biomed. Signal Process. Control 2023, 86.
21. Li, C.; Wang, Q.; Liu, X.; Hu, B. An Attention-Based CoT-ResNet With Channel Shuffle Mechanism for Classification of Alzheimer’s Disease Levels. Front. Aging Neurosci. 2022, 14.
22. Hu, Z.; Li, Y.; Wang, Z.; Zhang, S.; Hou, W. Conv-Swinformer: Integration of CNN and shift window attention for Alzheimer’s disease classification. Comput. Biol. Med. 2023, 164.
23. Menagadevi, M.; Mangai, S.; Madian, N.; Thiyagarajan, D. Automated prediction system for Alzheimer detection based on deep residual autoencoder and support vector machine. Optik 2023, 272.
24. Al-Rahayfeh, A.; Atiewi, S.; Almiani, M.; Jararweh, M.; Faezipour, M. Utilizing 3D magnetic source imaging with landmark-based features and multi-classification for Alzheimer’s Disease diagnosis. Clust. Comput. 2024, 27, 2635–2651.
25. Gravina, M.; García-Pedrero, A.; Gonzalo-Martín, C.; Sansone, C.; Soda, P. Multi input–Multi output 3D CNN for dementia severity assessment with incomplete multimodal data. Artif. Intell. Med. 2024, 149.
26. Zheng, G.; Zhang, Y.; Zhao, Z.; Wang, Y.; Liu, X.; Shang, Y.; Cong, Z.; Dimitriadis, S.I.; Yao, Z.; Hu, B. A transformer-based multi-features fusion model for prediction of conversion in mild cognitive impairment. Methods 2022, 204, 241–248.
27. Coupé, P.; Manjón, J.V.; Mansencal, B.; Tourdias, T.; Catheline, G.; Planche, V. Hippocampal-amygdalo-ventricular atrophy score: Alzheimer disease detection using normative and pathological lifespan models. Hum. Brain Mapp. 2022, 43, 3270–3282.
28. Göschel, L.; Kurz, L.; Dell’Orco, A.; Köbe, T.; Körtvélyessy, P.; Fillmer, A.; Aydin, S.; Riemann, L.T.; Wang, H.; Ittermann, B.; et al. 7T amygdala and hippocampus subfields in volumetry-based associations with memory: A 3-year follow-up study of early Alzheimer’s disease. NeuroImage Clin. 2023, 38, 103439.
29. icometrix. Volumetric MRI Quantification in the Diagnosis of Alzheimer’s Disease. 2021. Available online: https://www.icometrix.com/post/volumetric-mri-quantification-in-the-diagnosis-of-alzheimer-s-disease (accessed on 2026-03-24).
30. Zaabi, M.; Smaoui, N.; Derbel, H.; Hariri, W. Alzheimer’s disease detection using convolutional neural networks and transfer learning based methods. Proceedings of the 2020 17th International Multi-Conference on Systems, Signals & Devices (SSD), 2020; pp. 939–943.
31. Bae, J.B.; Lee, S.; Jung, W.; Park, S.; Kim, W.; Oh, H.; Han, J.W.; Kim, G.E.; Kim, J.S.; Kim, J.H.; et al. Identification of Alzheimer’s disease using a convolutional neural network model based on T1-weighted magnetic resonance imaging. Sci. Rep. 2020, 10, 1–10.
32. Ahmed, S.; Kim, B.C.; Lee, K.H.; Jung, H.Y. for the Alzheimer’s Disease Neuroimaging Initiative. Ensemble of ROI-based convolutional neural network classifiers for staging the Alzheimer disease spectrum from magnetic resonance imaging. PLoS ONE 2020, 15, 1–23.
33. Pan, D.; Luo, G.; Zeng, A.; Zou, C.; Liang, H.; Wang, J.; Zhang, T.; Yang, B.; the Alzheimer’s Disease Neuroimaging Initiative. Adaptive 3DCNN-Based Interpretable Ensemble Model for Early Diagnosis of Alzheimer’s Disease. IEEE Trans. Comput. Soc. Syst. 2022, 1–20.
34. Li, C.; Cui, Y.; Luo, N.; Liu, Y.; Bourgeat, P.; Fripp, J.; Jiang, T. Trans-ResNet: Integrating Transformers and CNNs for Alzheimer’s disease classification. Proceedings of the 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), 2022; pp. 1–5.
35. Poloni, K.M.; Ferrari, R.J. Automated detection, selection and classification of hippocampal landmark points for the diagnosis of Alzheimer’s disease. Comput. Methods Programs Biomed. 2022, 214, 106581.
36. Aghaei, A.; Moghaddam, M.E. Smart ROI Detection for Alzheimer’s Disease prediction using explainable AI. Technical report, 2023, [arXiv:eess.IV/2303.10401].
37. Castro-Silva, J. A.; Moreno-Garcia, M.; Guachi-Guachi, L.; Peluffo-Ordoñez, D.H. Instance Selection Framework for Alzheimer’s Disease Classification Using Multiple Regions of Interest and Atlas Integration. In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - ICPRAM. INSTICC; SciTePress, 2024; pp. 453–460.
38. Lyu, Y.; Yu, X.; Zhu, D.; Zhang, L. Classification of Alzheimer’s Disease via Vision Transformer. In ACM International Conference Proceeding Series; 2022; pp. 463–468.
39. Hoang, G.M.; Kim, U.H.; Kim, J.G. Vision transformers for the prediction of mild cognitive impairment to Alzheimer’s disease progression using mid-sagittal sMRI. Front. Aging Neurosci. 2023, 15.
40. Mora-Rubio, A.; Bravo-Ortíz, M.A.; Arredondo, S.Q.; Torres, J.M.S.; Ruz, G.A.; Tabares-Soto, R. Classification of Alzheimer’s disease stages from magnetic resonance images using deep learning. PeerJ Comput. Sci. 2023, 9.
41. Altay, F.; Sánchez, G.R.; James, Y.; Faraone, S.V.; Velipasalar, S.; Salekin, A. Preclinical Stage Alzheimer’s Disease Detection Using Magnetic Resonance Image Scans. Proc. AAAI Conf. Artif. Intell. 2021, 35, 15088–15097.
42. Li, C.; Cui, Y.; Luo, N.; Liu, Y.; Bourgeat, P.; Fripp, J.; Jiang, T. Trans-ResNet: Integrating Transformers and CNNs for Alzheimer’s disease classification. Proceedings - International Symposium on Biomedical Imaging, 2022-March. pp. 1–5.
43. Albarakat, H.M.; Chaitanya, T.V.S.S.; .... HybridViT: An Approach for Alzheimer’s Disease Classification with ADNI Neuroimaging Data. Annamalai and Bassfar … 2025. https://doi.org/10.1007/s42979-025-03862-0.
44. Zhang, Z.; Khalvati, F. Introducing Vision Transformer for Alzheimer’s Disease classification task with 3D input. Technical report, 2022, [arXiv:eess.IV/2210.01177].
45. Tiwari, A.; Singhal, A.; Shigwan, S.; Kumar Singh, R.; Shigwan, S.J.; Tiwari, A.; Singhal, A.; Shigwan, S.; Singh, R. Early Diagnosis of Alzheimer through Swin-Transformer-Based Deep Learning Framework using Sparse Diffusion Measures. Technical report, 2023.
46. Zhang, W.; Yang, X.; Chen, Y.; Liu, Y. Alzheimer’s Disease Classification Based on Multi-Scale 2D-VMD Swin Transformer. 2024 9th International Conference on Computer and Communication Systems (ICCCS) 2024, pp. 178–183. https://doi.org/10.1109/ICCCS61882.2024.10603331.
47. Illakiya, T.; Karthik, R. A Dimension Centric Proximate Attention Network and Swin Transformer for Age-Based Classification of Mild Cognitive Impairment From Brain MRI 2023. 11.
48. Hu, C. Image Feature Extraction with Fourier Transform and Multi-task Swin-Transformer for Alzheimer’s Disease Prediction and Detection. 2024 4th International Signal Processing, Communications and Engineering Management Conference (ISPCEM) 2024, pp. 1028–1039. https://doi.org/10.1109/ISPCEM64498.2024.00182.
49. Maddalena, L.; Granata, I.; Giordano, M.; Manzo, M.; Guarracino, M.R. Integrating Different Data Modalities for the Classification of Alzheimer’s Disease Stages. SN Comput. Sci. 2023, 4.
50. Birkenbihl, C.; Westwood, S.; Shi, L.; Nevado-Holgado, A.; Westman, E.; Lovestone, S.; Hofmann-Apitius, M. ANMerge: A comprehensive and accessible Alzheimer’s disease patient-level dataset. 2020.
51. Multimodal deep learning models for early detection of Alzheimer’s disease stage. Sci. Rep. 2021, 11, 1–13.
52. Gao, X.; Shi, F.; Shen, D.; Liu, M. Multimodal transformer network for incomplete image generation and diagnosis of Alzheimer’s disease. Comput. Med. Imaging Graph. 2023, 110.
53. Golovanevsky, M.; Eickhoff, C.; Singh, R. Multimodal Attention-based Deep Learning for Alzheimer’s Disease Diagnosis 2022. [2206.08826]. https://doi.org/10.1093/jamia/ocac168.
54. Zhang, X.; Lin, W.; Xiao, M.; Ji, H. Multimodal 2.5D convolutional neural network for diagnosis of Alzheimer’s disease with magnetic resonance imaging and positron emission tomography. Prog. Electromagn. Res. 2021, 171, 21–34.
55. Odusami, M.; Maskeliūnas, R.; Damaševičius, R.; Misra, S. Explainable Deep-Learning-Based Diagnosis of Alzheimer’s Disease Using Multimodal Input Fusion of PET and MRI Images. J. Med. Biol. Eng. 2023, 43, 291–302.
56. Hughes, C.P.; Berg, L.; Danziger, W.; Coben, L.A.; Martin, R.L. A New Clinical Scale for the Staging of Dementia. Br. J. Psychiatry 1982, 140, 566–572.
57. Frisoni, G.B. Alzheimer’s Disease Neuroimaging Initiative in Europe. Alzheimer’s Dement. 2010, 6, 280–285. Available online: https://alz-journals.onlinelibrary.wiley.com/doi/pdf/10.1016/j.jalz.2010.03.005.
58. Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. J. Mach. Learn. Res. 2018, 18, 1–52.
59. Priyadharshini, M.; Murugesh, V.; Rybin, O. Enhancing Alzheimer’s disease classification with a transformer-based model using self-supervised learning. Sci. Rep. 2026, 16, 3798.
| Dataset | Class | Subjects | Age | Sex (F/M) | Total Subjects |
|---|---|---|---|---|---|
| ADNI | CN | 70 | | 31/39 | 140 |
| | AD | 70 | | 33/37 | |
| AIBL | CN | 70 | | 36/34 | 140 |
| | AD | 70 | | 33/37 | |
| OASIS | CN | 70 | | 39/31 | 140 |
| | AD | 70 | | 33/37 | |
| MERGED | CN | 210 | | 106/104 | 420 |
| | AD | 210 | | 107/103 | |
| Hyperparameter | Search Space | Selected Value |
|---|---|---|
| Dataset | – | Merged (ADNI+AIBL+OASIS) |
| Slice Number | – | 25 |
| Image Size | – | |
| Channels | {1, 3} | 1 |
| Optimizer | {Adam, SGD, RMSprop, AdamW} | AdamW |
| Learning Rate | {1e-3, 1e-4, 1e-5} | |
| Weight Decay | {1e-3, 1e-4} | |
| Clipvalue | – | 0.5 |
| Transformer Layers | – | 8 |
| Projection Dim | – | 128 |
| Embedding Dim | – | 128 |
| Num Heads | – | 8 |
| Patch Size | – | |
| LayerNorm | – | |
| Dropout | {0.20 – 0.50} | 0.20 |
| Batch Size | {4, 6, 8, 16} | 6 |
| Epochs | {25, 50, 100} | 100 |
| Num Classes | {2, 3} | 2 |
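For reproducibility, the selected values in the table above can be captured in a small configuration object. This is only a sketch: the field names are hypothetical, and entries whose values did not survive in the table (image size, learning rate, weight decay, patch size, LayerNorm setting) are deliberately left unset rather than guessed. The per-head dimension assumes the embedding is split evenly across heads, as in the standard multi-head attention formulation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the "Selected Value" column.
    dataset: str = "Merged (ADNI+AIBL+OASIS)"
    slice_number: int = 25
    channels: int = 1
    optimizer: str = "AdamW"
    clipvalue: float = 0.5
    transformer_layers: int = 8
    projection_dim: int = 128
    embedding_dim: int = 128
    num_heads: int = 8
    dropout: float = 0.20
    batch_size: int = 6
    epochs: int = 100
    num_classes: int = 2
    # Not recoverable from the table; left unset instead of guessed.
    image_size: Optional[Tuple[int, ...]] = None
    learning_rate: Optional[float] = None
    weight_decay: Optional[float] = None
    patch_size: Optional[Tuple[int, ...]] = None
    layernorm: Optional[float] = None

    def head_dim(self) -> int:
        """Per-head width, assuming the embedding is split evenly across heads."""
        if self.embedding_dim % self.num_heads != 0:
            raise ValueError("embedding_dim must be divisible by num_heads")
        return self.embedding_dim // self.num_heads


config = TrainConfig()
print(config.head_dim())  # 16
```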
| ROI | Left Hemisphere | Right Hemisphere |
|---|---|---|
| Entorhinal Cortex | ||
| Fornix | ||
| Frontal Lobe | ||
| Hippocampus | ||
| Parietal Lobe | ||
| Temporal Lobe |
| Comparison | Metric | p-value | Cohen’s d | Interpretation |
|---|---|---|---|---|
| Full vs MRI | AUC | | | Extremely large effect |
| | Accuracy | | | Extremely large effect |
| Full vs Tabular | AUC | | | Moderate effect |
| | Accuracy | | | Large effect |
| Full vs Whole-brain | AUC | | | Small effect (not significant) |
| | Accuracy | | | Moderate effect |
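As a reminder of how an effect size of this kind is obtained, the following sketch computes Cohen's d for two sets of scores using the pooled sample variance. Whether the authors used this independent-samples form or a paired variant is not stated, so the formula choice is an assumption, and the per-fold scores are invented purely to exercise the function.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical per-fold AUCs, not results from the paper.
full_model = [0.95, 0.93, 0.96, 0.94, 0.95]
mri_only = [0.85, 0.86, 0.84, 0.87, 0.85]
print(f"Cohen's d (full vs MRI only): {cohens_d(full_model, mri_only):.2f}")
```

With only a handful of folds, d can become very large even for modest absolute differences, because the fold-to-fold variance in the denominator is small.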
| # | Feature | Modality | Importance |
|---|---|---|---|
| 1 | Gray matter | Volumetric biomarker | |
| 2 | Ventricle left | Volumetric biomarker | |
| 3 | Entorhinal right | Volumetric biomarker | |
| 4 | Entorhinal left | Volumetric biomarker | |
| 5 | MMSE | Clinical metadata | |
| 6 | CSF | Volumetric biomarker | |
| 7 | Ventricle right | Volumetric biomarker | |
| 8 | Hippocampus left | Volumetric biomarker | |
| 9 | Age | Clinical metadata | |
| 10 | Amygdala left | Volumetric biomarker | |
| 11 | Sex | Clinical metadata | |
| 12 | Amygdala right | Volumetric biomarker | |
| 13 | Hippocampus right | Volumetric biomarker | |
| 14 | White matter | Volumetric biomarker | |
| 15 | Whole brain | Volumetric biomarker |
| ROI | Attention | CV | Stability Score |
|---|---|---|---|
| Frontal Lobe | | 0.0640 | 0.9399 |
| Entorhinal Cortex | | 0.0710 | 0.9337 |
| Hippocampus | | 0.0729 | 0.9321 |
| Parietal Lobe | | 0.0762 | 0.9292 |
| Fornix | | 0.0823 | 0.9240 |
| Temporal Lobe | | 0.1175 | 0.8949 |
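Reading the two surviving numeric columns as the coefficient of variation (CV) and the stability score, the tabulated values are numerically consistent with stability = 1 / (1 + CV). This relation is inferred from the numbers and is not stated in the text, so the sketch below should be taken as a consistency check and not as the authors' definition.

```python
# (CV, reported stability score) pairs copied from the ROI attention table.
table = {
    "Frontal Lobe": (0.0640, 0.9399),
    "Entorhinal Cortex": (0.0710, 0.9337),
    "Hippocampus": (0.0729, 0.9321),
    "Parietal Lobe": (0.0762, 0.9292),
    "Fornix": (0.0823, 0.9240),
    "Temporal Lobe": (0.1175, 0.8949),
}


def stability_from_cv(cv):
    """Inferred relation: lower variability across runs gives a score closer to 1."""
    return 1.0 / (1.0 + cv)


for roi, (cv, reported) in table.items():
    derived = stability_from_cv(cv)
    print(f"{roi:<18} CV={cv:.4f} reported={reported:.4f} derived={derived:.4f}")
```

Every derived value agrees with the reported one to within rounding of the four-decimal CV.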
| Study | Dataset | Model | Accuracy |
|---|---|---|---|
| [34] | ADNI | Trans-ResNet | |
| | AIBL | | |
| [38] | ADNI | Vision Transformer | |
| [22] | ADNI, OASIS | 2D CNN+Transformer | |
| [41] | OASIS | Vision Transformer | |
| [20] | ADNI, AIBL | CNN+Swin-Transformer | |
| [5] | ADNI+AIBL | Swin Transformer | |
| [40] | ADNI+OASIS | Vision Transformer | |
| [59] | KAGGLE | TabTransformer (ETT-SSL) | |
| Multiple ROIs | Merged (ADNI + AIBL + OASIS) | Multimodal 3D Vision Transformer | |
| Single ROI | Entorhinal Cortex - Left | | |
| | Fornix - Right | | |
| | Frontal Lobe - Left | | |
| | Hippocampus - Left | | |
| | Parietal Lobe - Right | | |
| | Temporal Lobe - Right | | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).