Submitted: 28 April 2025
Posted: 30 April 2025
Abstract
Detecting brain tumors in magnetic resonance imaging (MRI) is a complex diagnostic task, and a fast, reliable clinical decision-support tool is paramount. Deep learning methods, in particular convolutional neural networks (CNNs), have shown great promise in automating the detection of tumor masses in MRI scans. In this study, we train a VGG16-based CNN and, instead of relying on a single-source dataset or black-box predictions, merge two publicly available datasets (Figshare and Kaggle), introducing inter-dataset variability that simulates real-world diagnostic conditions. We preprocess the data, apply a stratified split into training, validation, and test sets, and then apply data augmentation. Our model achieves a validation accuracy of 84.4% and demonstrates consistent performance across tumor types. Grad-CAM heatmaps highlight tumor regions with reasonable precision, even in some misclassified cases, thereby enhancing model transparency and trust. This work highlights the effectiveness of a lightweight, generalizable CNN architecture combined with visual interpretability.
Keywords:
Introduction
- Generalization Across MRI Datasets – Models trained on scans from one or two hospital cohorts tend to lose accuracy when applied to another cohort, because scanner type, acquisition settings, and patient demographics differ between sites, creating a domain shift.
- Class Imbalance in Brain Tumor Datasets – Many publicly available datasets have imbalanced distributions of tumor types, which bias predictions toward the majority class.
- Explainability & Trust – Deep learning models operate as “black boxes,” making it difficult for clinicians to interpret AI-based diagnoses and impeding real-world adoption.
- Small & Early-Stage Tumor Detection – Many models fail to detect small tumors or early-stage abnormalities because these resemble normal tissue in MRI images.
Literature Review
Methodology
- Kaggle “Brain Tumor Classification (MRI)” Dataset (Bhuvaji et al., 2020) – Comprising 3,264 T1-weighted contrast-enhanced MRI images, categorized into four classes:
- Figshare “Brain Tumor Dataset” (Cheng et al., 2017) – Containing 7,000+ MRI images, categorized into three classes:
- Image Resizing – All images were resized to 224 × 224 pixels, aligning with VGG16’s input size.
- Normalization – Pixel values were scaled to [0, 1] using min-max scaling to improve model convergence.
- Class Imbalance Handling – Class weights were computed and applied during training to mitigate bias toward majority tumor classes.
- Image Augmentation – To increase dataset variability and reduce overfitting, we applied:
- Rotation: Random rotations (±30°) to simulate different viewing angles.
- Flipping: Horizontal and vertical flips to enhance spatial invariance.
- Zooming: Random zoom-in and zoom-out to introduce variations in tumor magnifications.
- Brightness Adjustment: Controlled intensity modifications to account for differences across MRI scanners.
- Shifting: Minor translations of the image to make the model robust to positional variations.
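The normalization and augmentation steps described above can be illustrated with a small NumPy sketch. This is an assumption-laden illustration, not the authors' pipeline: the function names `min_max_scale` and `augment`, the shift range, and the brightness range are hypothetical, and rotation and zoom are omitted because they need an interpolation library.

```python
import numpy as np


def min_max_scale(image: np.ndarray) -> np.ndarray:
    """Scale pixel values to [0, 1] with min-max normalization."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)


def augment(image: np.ndarray, rng: np.random.Generator,
            max_shift: int = 10, brightness: float = 0.2) -> np.ndarray:
    """Random flips, a small translation, and a brightness change.

    Expects an (H, W, C) array already scaled to [0, 1].
    max_shift and brightness are illustrative values, not from the paper.
    """
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :, :]  # vertical flip
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # np.roll wraps pixels around the border; a real pipeline would
    # more likely pad or fill instead.
    out = np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))
    factor = 1.0 + rng.uniform(-brightness, brightness)
    return np.clip(out * factor, 0.0, 1.0)


rng = np.random.default_rng(seed=0)
scan = rng.integers(0, 256, size=(224, 224, 3))  # stand-in for a resized MRI slice
augmented = augment(min_max_scale(scan), rng)
```

Augmentation of this kind is normally applied only to training images, so that validation and test scores reflect unmodified scans.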
- Class Weights: Adjusted loss function penalties to counterbalance the effect of dominant classes.
- Oversampling: Replicated minority class images to ensure a more balanced representation.
- Targeted Data Augmentation: Applied additional augmentations exclusively to underrepresented classes to synthetically increase their presence.
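The text does not state which weighting formula was used. One common choice, shown here purely as an assumption, is the "balanced" heuristic `N / (K * n_c)`, where `N` is the number of samples, `K` the number of classes, and `n_c` the count for class `c`. The label counts below are hypothetical and are not the paper's class distribution.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = N / (K * n_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical label distribution, for illustration only.
labels = ["glioma"] * 600 + ["meningioma"] * 300 + ["pituitary"] * 300
weights = balanced_class_weights(labels)
```

With these counts the majority class receives a weight below 1 and the two smaller classes receive equal weights above 1, so errors on minority classes contribute more to the loss.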
- Pre-trained Base Model: The VGG16 model was loaded with ImageNet weights, excluding the fully connected layers (include_top=False).
- Frozen Layers: All convolutional layers in VGG16 were initially frozen, preventing their weights from being modified.

- Custom Classification Head: The fully connected layers were replaced with a trainable classification head consisting of:
  - Global Average Pooling (GAP) – Reducing feature maps to a 512-dimensional vector.
  - Batch Normalization – Stabilizing activations for better convergence.
  - Fully Connected Dense Layers:
    - 256 neurons (ReLU activation, dropout = 0.5)
    - 128 neurons (ReLU activation, dropout = 0.5)
  - Softmax Output Layer – Classifying MRI scans into 3 categories (glioma, meningioma, pituitary tumor).
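The architecture described above could be assembled roughly as follows. This is a sketch that assumes the Keras API bundled with TensorFlow (suggested by the `include_top=False` argument in the text); the function name `build_vgg16_classifier` is hypothetical, and the placement of dropout directly after each dense layer is an assumption.

```python
def build_vgg16_classifier(num_classes=3, weights="imagenet"):
    """Frozen VGG16 base plus the GAP / BatchNorm / Dense head described in the text."""
    # Imported inside the function so the module loads even without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=(224, 224, 3)
    )
    base.trainable = False  # freeze all convolutional layers

    inputs = keras.Input(shape=(224, 224, 3))
    x = base(inputs)
    x = layers.GlobalAveragePooling2D()(x)  # -> 512-dimensional vector
    x = layers.BatchNormalization()(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```

Calling `build_vgg16_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same graph with random initialization, which is useful for a quick shape check.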

- Optimizer: Adam optimizer (learning_rate = 1e-4)
- Loss Function: Categorical Cross-Entropy
- Batch Size: 32
- Epochs: 100 (Early stopping after 10 epochs of no improvement)
- Callbacks:
  - EarlyStopping – Monitors validation loss and stops training if no improvement is detected.
  - ModelCheckpoint – Saves the best model based on validation performance.
- Training Execution:
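A minimal sketch of how the settings listed above could be wired together with the Keras API. Several details are assumptions rather than facts from the text: the helper names `make_callbacks` and `compile_and_fit`, the checkpoint file name, the use of `val_loss` as the checkpoint criterion, and `restore_best_weights=True`.

```python
def make_callbacks(checkpoint_path="best_model.keras", patience=10):
    """EarlyStopping and ModelCheckpoint, both monitoring validation loss."""
    from tensorflow import keras  # lazy import: module loads without TensorFlow

    return [
        keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=patience, restore_best_weights=True
        ),
        keras.callbacks.ModelCheckpoint(
            checkpoint_path, monitor="val_loss", save_best_only=True
        ),
    ]


def compile_and_fit(model, train_data, val_data, class_weight=None):
    """Compile with Adam (1e-4) and categorical cross-entropy, then train."""
    from tensorflow import keras

    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    # train_data / val_data are assumed to be batched already (batch size 32).
    return model.fit(
        train_data,
        validation_data=val_data,
        epochs=100,
        class_weight=class_weight,
        callbacks=make_callbacks(),
    )
```

Keras expects `class_weight` as a dictionary keyed by integer class index, so any weights keyed by class name would need to be remapped first.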

- Accuracy – Overall classification correctness.
- Precision – Proportion of predictions for a given class that are correct.
- Recall (Sensitivity) – True positive rate, measuring detection ability.
- F1-Score – Balancing precision and recall.
- Confusion Matrix – Visualizing misclassifications across tumor types.
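The metrics above all derive from the confusion matrix. The following pure-Python sketch shows the arithmetic on a toy example; the labels and predictions are invented for illustration and are not results from this study.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def per_class_metrics(cm):
    """Precision, recall, and F1 for each class, computed one-vs-rest."""
    n = len(cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, actually other
        fn = sum(cm[c]) - tp                       # actually c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


# Toy labels: 0 = glioma, 1 = meningioma, 2 = pituitary (illustrative only).
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 0, 1, 0, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
metrics = per_class_metrics(cm)
```

In this toy case the only errors are swaps between classes 0 and 1, which shows up as off-diagonal counts in the first two rows while the third class is classified perfectly.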
Visualization

Confusion Matrix
- The diagonal values indicate correct classifications, whereas off-diagonal values highlight misclassifications.
- The model performed well in classifying pituitary tumors, but some misclassification occurred between glioma and meningioma, which could be attributed to their structural similarities in MRI scans.

- The steady decrease in training loss indicates that the model is effectively learning from the dataset.
- The validation loss follows a similar trend, suggesting no significant overfitting. However, slight fluctuations in validation loss after epoch 10 suggest that additional fine-tuning or regularization could improve generalization.

- Glioma Tumor → AUC = 0.96
- Meningioma Tumor → AUC = 0.93
- Pituitary Tumor → AUC = 0.99
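Per-class AUC values of this kind are commonly obtained one-vs-rest: each class in turn is treated as positive and all others as negative. The text does not say how the values above were computed, so the sketch below is only an assumption about the general approach. It uses the rank interpretation of AUC (the probability that a random positive scores higher than a random negative), and the probabilities are invented for illustration.

```python
def binary_auc(scores, positives):
    """AUC as P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for s, is_pos in zip(scores, positives) if is_pos]
    neg = [s for s, is_pos in zip(scores, positives) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(probabilities, y_true, n_classes):
    """One AUC per class, using that class's predicted probability as the score."""
    return [
        binary_auc([row[c] for row in probabilities], [t == c for t in y_true])
        for c in range(n_classes)
    ]


# Invented softmax outputs for six scans (columns: class 0, 1, 2).
probs = [
    [0.80, 0.10, 0.10],
    [0.35, 0.55, 0.10],
    [0.30, 0.60, 0.10],
    [0.40, 0.50, 0.10],
    [0.10, 0.20, 0.70],
    [0.20, 0.20, 0.60],
]
y_true = [0, 0, 1, 1, 2, 2]
aucs = one_vs_rest_auc(probs, y_true, 3)
```

The pairwise comparison is quadratic in the number of samples, which is fine for a small illustration; library implementations typically sort the scores instead.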

Evaluation Metrics
Results and Discussion
Challenges Observed
- Class Imbalance: Despite class weighting and augmentation, minority classes (meningioma and pituitary) remained harder to classify in some experiments.
- Dataset Variability: Merging datasets introduced real-world diversity but also increased intra-class variability.
- Generalization: While validation metrics were strong, some alternative models (EfficientNet, MobileNet) underperformed on the test set, reinforcing the need for careful model selection and tuning.

Conclusion
References
- Abdusalomov, A. B., Mukhiddinov, M., & Whangbo, T. K. (2023). Brain tumor detection based on deep learning approaches and magnetic resonance imaging. Cancers, 15(16), 4172.
- Amin, J., Sharif, M., Haldorai, A., Yasmin, M., & Nayak, R. S. (2021). Brain tumor detection and classification using machine learning: A comprehensive survey. Complex & Intelligent Systems, 8(4), 3161–3183.
- Bhuvaji, S., Kanchan, S., Dedge, S., Bhumkar, P., & Kadam, A. (2020, May 24). Brain tumor classification (MRI). Kaggle. https://www.kaggle.com/datasets/sartajbhuvaji/brain-tumor-classification-mri
- Chen, R., Zhang, X., Li, P., & Wang, L. (2024). YOLO-NeuroBoost: Enhancing real-time object detection for brain tumor MRI scans. IEEE Transactions on Biomedical Engineering, 71(3), 1125–1137.
- Cheng, J. (2017). Brain tumor dataset. Figshare. Dataset.
- Dulal, R., & Dulal, R. (2025). Brain tumor identification using improved YOLOv8. arXiv preprint. https://arxiv.org/abs/2502.03746
- Esmaeilzadeh, P. (2020). Use of AI-based tools for healthcare purposes: A survey study from consumers’ perspectives. BMC Medical Informatics and Decision Making, 20(1), 191.
- Khan, A. H., Abbas, S., Khan, M. A., Farooq, U., Khan, W. A., Siddiqui, S. Y., & Ahmad, A. (2022). Intelligent model for brain tumor identification using deep learning. Applied Computational Intelligence and Soft Computing, 2022, 1–10.
- Krishnan, H., Patel, S., & Gupta, R. (2024). RViT: A rotation-invariant vision transformer for brain tumor MRI classification. Medical Image Analysis, 92, 102313.
- Parida, A., Capellán-Martín, D., Jiang, Z., Tapp, A., Liu, X., Anwar, S. M., Ledesma-Carbayo, M. J., & Linguraru, M. G. (2024). Adult glioma segmentation in Sub-Saharan Africa using transfer learning on stratified fine-tuning data. arXiv preprint. https://arxiv.org/abs/2412.04111
- Reddy, S., Kumar, P., & Sharma, N. (2024). Fine-tuned vision transformers for multi-class brain tumor classification. Neural Computing and Applications, 36(2), 517–531.
- Secinaro, S., Calandra, D., Secinaro, A., Muthurangu, V., & Biancone, P. (2021). The role of artificial intelligence in healthcare: A structured literature review. BMC Medical Informatics and Decision Making, 21(1), 88.
- Talukder, Md. A. (2023). An efficient deep learning model to categorize brain tumors using reconstruction and fine-tuning [Preprint].
- Younis, A., Qiang, L., Nyatega, C. O., Adamu, M. J., & Kawuwa, H. B. (2022). Brain tumor analysis using deep learning and VGG-16 ensembling learning approaches. Applied Sciences, 12(14), 7282.
- Zahoor, A., Malik, H., & Khan, S. (2024). Res-BRNet: A novel residual and boundary-aware network for brain tumor classification in MRI. Expert Systems with Applications, 221, 119932.
- Gulbarga, M. I., Khan, A. L., Cankurt, S., & Shaidullaev, N. (2023, June). Deep learning (DL) dense classifier with long short-term memory encoder detection and classification against network attacks. 2023 20th International Conference on Electronics, Computer and Computation (ICECCO), 1–6.
- Nazira, A., Isaev, R., Shambetova, B., Ur Rehman, S., & Osmonaliev, K. (2025). The role of computer technology in monitoring and analysis of hemodialysis patient data: A review. South Eastern European Journal of Public Health, 26.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).