1. Introduction
As the organ that directs virtually every bodily function, the brain is critical to human health. Magnetic resonance imaging (MRI) devices assist medical professionals in making diagnostic decisions about vital organs, including the brain, and the images these devices produce supply artificial intelligence with enormous amounts of data. This abundance of data enables high performance in image classification, a branch of artificial intelligence. Brain tumors encompass a wide variety of abnormal growths that can impair brain function, posing significant health risks due to their potential for rapid progression and the difficulty of their treatment. Gliomas, meningiomas, and pituitary tumors are among the most common types of brain tumors, each presenting unique characteristics, origins, and challenges in diagnosis and treatment. Brain tumors are among the most aggressive, invasive, and primitive tumors in humans and carry a dismal prognosis [
1]. The life expectancy of neuro-oncological patients is still very short (24–36 months), even though much research has been conducted recently to find innovative therapeutic regimens and tumor molecular markers capable of predicting survival and responsiveness to therapy [
2]. Gliomas make up almost 80% of all malignant central nervous system (CNS) tumors in adults, or around 33% of all brain tumors [
3]. Gliomas are a broad class of brain and spinal cord cancers that arise from the glial cells, such as astrocytes, oligodendrocytes, and ependymal cells, that envelop and support neurons. Among them, glioblastoma (GBM) is one of the most prevalent and severe primary brain tumors [
4]. At present, GBM remains associated with a very aggressive clinical course, with only 0.05–4.7% of patients surviving five years after diagnosis [
5]. GBM is distinguished from lower-grade gliomas by cellular pleomorphism with nuclear atypia, strong mitotic activity, and microvascular proliferation [
6].
Meningiomas develop from the meninges, the protective membranes that surround the brain and spinal cord. These tumors are generally slow-growing and more often benign than other brain tumors [
7]. Meningiomas are the most common primary brain tumors, accounting for about 30% of cases. Although typically benign, meningiomas can cause significant health issues if they grow large enough to compress surrounding brain tissue, leading to symptoms such as headaches, seizures, and neurological deficits. They are believed to originate from meningothelial (arachnoid) cells (MECs) and are often benign, slowly developing neoplasms [
8]. Although these dural-based tumors are generally considered benign, they can cause morbidity that manifests as a range of location-dependent, non-specific symptoms. The cited review covers the epidemiology and risk factors of meningiomas, the 2016 revisions to the World Health Organization (WHO) classification of CNS tumors, clinical aspects, diagnostics, typical treatment regimens, ongoing studies of new therapies, molecular traits, and potential applications in grading [
8]. Gliomas are further categorized according to their growth rate and propensity for spread, and they can develop in different parts of the brain; high-grade gliomas, such as glioblastomas, are especially difficult to cure because they are infiltrative and resistant to standard treatments. Pituitary tumors, in turn, develop in the pituitary gland, a small organ at the base of the brain that controls hormones governing vital body processes. Usually benign adenomas, these tumors can nonetheless cause serious health issues by interfering with normal hormone levels. Most pituitary tumors are not malignant, but they can impair eyesight and produce hormonal abnormalities that affect development, metabolism, and reproduction; accurate detection and classification are therefore essential to avoid these consequences. Approximately 10% of all intracranial tumors are pituitary tumors. They are most prevalent in the third and fourth decades of life, and secretory tumors are slightly more common in women [
9]. The great majority of these tumors are benign adenomas originating from the adenohypophysis; on rare occasions, tumors such as pituicytomas arise from the neurohypophysis. Although uncommon, pituitary carcinoma likely accounts for 0.2% of pituitary tumors; these carcinomas are typically invasive and secretory and may spread extracranially [
9]. Normal brain tissue, by contrast, exhibits typical cellular architecture and brain function, with no indication of tumor development or aberrant cell proliferation. In diagnostic imaging, accurate classification of normal tissue is essential to prevent healthy tissue from being mistakenly labeled as tumorous, reducing the likelihood of unnecessary treatment.
1.1. Broader Applications of Convolutional Neural Networks in Medical Diagnostics
Convolutional Neural Networks (CNNs) have revolutionized the field of medical diagnostics by automating complex image analysis tasks with high accuracy and efficiency. Their capacity to learn hierarchical features from images has made CNNs particularly valuable in analyzing medical images where nuanced differences are critical for diagnosis. CNNs have been widely applied across multiple medical imaging domains, including radiology, pathology, dermatology, and ophthalmology, demonstrating their versatility and potential to improve patient outcomes through early detection and precise classification of diseases.
Cancer Diagnosis: CNNs have been used extensively in cancer diagnosis, especially when analyzing medical images like computed tomography (CT) scans, histopathology slides, and mammograms [
10]. For example, CNNs have achieved high accuracy in identifying breast cancer from mammograms, greatly aiding early diagnosis. In histopathological image analysis, CNNs can differentiate between malignant and non-cancerous tissues, with applications in prostate, colorectal, and lung cancer [
11].
Cardiovascular Disease: CNNs have been used in cardiovascular imaging to identify heart failure, coronary artery disease, and arrhythmias using cardiac magnetic resonance imaging, echocardiograms, and electrocardiograms (ECGs) [
12]. The speed and precision of cardiovascular diagnostics can be increased by using CNN-based models to automatically identify structural heart disorders and irregular heart rhythms.
Neurological Disorders: CNNs are employed in the diagnosis of neurological disorders including epilepsy, multiple sclerosis, and Alzheimer's disease. For instance, CNNs have been used to detect patterns of brain atrophy in MRI images, helping identify early indicators of Alzheimer's disease [
13]. CNNs improve seizure activity detection in epilepsy by analyzing electroencephalogram (EEG) data.
Because Convolutional Neural Networks (CNNs) can extract hierarchical features from complex images, they are fundamental to MRI analysis. CNNs are frequently used to categorize various brain disorders, making it possible to identify specific tumor types such as pituitary tumors, meningiomas, and gliomas. Enhanced CNN designs such as VGGNet and ResNet incorporate deeper layers and residual connections, which increase accuracy by capturing more complex information in MRI images. InceptionV3, another influential model, employs parallel convolutional layers with different filter sizes, enabling the network to capture both fine-grained and broad characteristics and thereby improving classification. This adaptability has been especially helpful in identifying subtle patterns and anomalies that conventional imaging methods can miss. Transfer learning, in which pre-trained models (for instance, those trained on massive datasets like ImageNet) are fine-tuned with MRI-specific data, has also gained popularity in MRI classification. This technique is particularly useful in medical applications, where labeled MRI datasets are frequently scarce, since it improves model performance and drastically reduces training time. Other AI methods, such as recurrent neural networks (RNNs) and hybrid models that combine CNNs and RNNs, have also been investigated for MRI classification, particularly for time-series MRI data in which sequential patterns are important. Together, these AI methods provide radiologists with dependable and effective tools for classifying MRI images accurately and consistently, facilitating early diagnosis and treatment planning.
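To make the transfer-learning approach concrete, the sketch below fine-tunes an ImageNet-pretrained backbone for four-class MRI classification. It is an illustration only, not the model developed in this study; the ResNet50 backbone, head layers, and input size are assumptions (in practice the backbone's matching preprocessing, e.g., tf.keras.applications.resnet50.preprocess_input, would also be applied).

```python
# Illustrative transfer-learning sketch (NOT the architecture used in this
# study): an ImageNet-pretrained ResNet50 backbone is frozen and a small
# classification head is trained on MRI-specific data.
import tensorflow as tf

base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(256, 256, 3)
)
base.trainable = False  # freeze pretrained features; optionally unfreeze later

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),  # four MRI classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```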
In this study, we propose a prediction system designed to classify brain MRI images into four categories: glioma tumor, meningioma tumor, pituitary tumor, and normal (non-tumorous) cases. Our model utilizes a CNN architecture developed using TensorFlow, optimized for high accuracy in distinguishing these tumor types. Additionally, the prediction system is integrated with a FastAPI framework, which enables rapid processing of MRI images uploaded by healthcare professionals. The system architecture supports real-time classification, with the FastAPI server handling HTTP requests, receiving MRI images, and sending predictions within a few seconds. This setup enhances the usability of the model in clinical environments, where timely diagnosis is crucial. Through this work, we aim to advance the practical applicability of CNN-based diagnostic tools for brain cancer, making strides toward the integration of AI in healthcare. This paper presents the development, evaluation, and deployment of the CNN model, offering insights into its potential for aiding in the accurate and efficient classification of brain tumors.
2. Materials and Methods
For this study, a Convolutional Neural Network (CNN) model was developed to classify MRI images into four distinct categories, utilizing a dataset comprising 3,097 images. The dataset was carefully partitioned to ensure robust training and evaluation: 80% of the images were used for training, 10% were set aside for validation to tune hyperparameters and reduce overfitting, and the remaining 10% were allocated for testing to assess model generalization. The model was implemented using Python 3.12 and the TensorFlow deep learning library, chosen for their flexibility and efficiency in handling image-based data. The CNN architecture was optimized to interpret the features in MRI images that are crucial for identifying and differentiating brain tumors.
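For illustration, the sketch below shows one way such a split can be produced with TensorFlow's dataset utilities; the "data" directory layout and the fixed seed are assumptions, not the exact code used in this study.

```python
import tensorflow as tf

IMAGE_SIZE = 256
BATCH_SIZE = 32

# Load labeled MRI images from class-named subdirectories, e.g.
# data/glioma_tumor, data/meningioma_tumor, data/normal, data/pituitary_tumor.
dataset = tf.keras.utils.image_dataset_from_directory(
    "data",
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
    shuffle=True,
    seed=42,
)

def split_dataset(ds, train_frac=0.8, val_frac=0.1):
    """Split a batched dataset into train/validation/test partitions."""
    n = ds.cardinality().numpy()
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train_ds = ds.take(n_train)
    val_ds = ds.skip(n_train).take(n_val)
    test_ds = ds.skip(n_train + n_val)  # remaining ~10%
    return train_ds, val_ds, test_ds

train_ds, val_ds, test_ds = split_dataset(dataset)
```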
The primary goal of this model was to classify MRI images into the correct categories, thereby providing a rapid, automated method for assisting radiologists and other medical professionals in diagnostic decision-making. The following sections will detail the structure of the dataset, including class distribution and distinguishing features, and elaborate on the processing techniques and training methodologies applied.
2.1. Datasets and Classes
The dataset utilized in this study consists of 3,097 MRI images divided into four classes: glioma tumor, meningioma tumor, normal, and pituitary tumor. Each class represents a distinct type of brain condition, which the CNN model aims to classify accurately. The dataset’s diversity enhances the model's ability to generalize across different variations of brain conditions visible in MRI scans.
Glioma Tumor: Gliomas are tumors that arise from glial cells in the brain and spinal cord, and they are known for their aggressive and infiltrative nature. Glioma images in the dataset typically display irregular, diffuse masses that may vary in intensity and shape, which can be challenging to distinguish. The CNN model is trained to identify the unique patterns and irregularities associated with gliomas, aiding in early detection and treatment planning.
Meningioma Tumor: Meningiomas originate in the meninges, the protective layers surrounding the brain and spinal cord. These tumors are often slow-growing and may show up as well-defined, round masses in MRI scans. The model uses distinctive features of meningiomas, such as their smooth contours and location near the surface of the brain, to differentiate them from other types.
Normal: Images classified as "normal" display typical brain anatomy without any detectable signs of tumors or abnormal growth. Including normal scans in the dataset allows the model to learn the baseline characteristics of healthy brain tissue, which is essential for distinguishing pathological cases from normal variations.
Pituitary Tumor: Pituitary tumors occur in the pituitary gland and can influence hormonal balance, affecting various bodily functions. These tumors appear near the base of the brain and often have distinct shapes that differentiate them from other brain tumors. The CNN model is trained to recognize these specific characteristics, enhancing its ability to accurately classify pituitary tumors.
Each image in the dataset has been labeled with one of these categories, and data preprocessing techniques, including resizing and normalization, were applied to ensure consistency across all images. This organized dataset enabled the CNN model to learn intricate details, leading to higher accuracy in classification tasks.
2.2. Machine Learning Model
The convolutional neural network (CNN) was developed to perform multi-class classification on images, outputting a prediction of the image class along with a visualization of the important areas through a heatmap. The network first sets the image size to 256x256 pixels, standardizing input dimensions to improve model performance. A batch size of 32 is used, meaning the model processes 32 images at a time during training, balancing computational efficiency with model convergence. The data is then divided into training, validation, and testing sets with an 80-10-10 split ratio, ensuring robust evaluation at each stage.
Figure 1. Convolutional neural network architecture for image classification.
The model architecture uses TensorFlow’s Sequential API to build a multi-layer CNN. To preprocess the images, a Rescaling layer normalizes pixel values between 0 and 1, enhancing model stability and speeding up convergence. Additionally, a data augmentation layer randomly flips and rotates the images, a technique that increases the model’s generalization by reducing overfitting to specific features in the training data.
The model consists of multiple convolutional layers, starting with a layer of 64 filters and then expanding to 128 filters in subsequent layers. Each convolutional layer uses a 3x3 kernel size and ReLU activation function, which effectively captures spatial features while maintaining computational efficiency. Max-pooling layers are interspersed between these convolutional layers, down-sampling the feature maps and reducing dimensionality to focus on key spatial features. This layered structure allows the model to learn hierarchical features, beginning with low-level edges and textures and progressing to complex patterns relevant to classification. Once the features are extracted, a Flatten layer converts them into a 1D vector, which is then passed to fully connected (dense) layers for final classification. The first dense layer has 64 neurons and uses ReLU activation, capturing high-level representations from the convolutional layers. The output layer uses a softmax activation function, producing probabilities for each of the four classes in the classification task. The model can generate predictions on uploaded images and, using a Grad-CAM-style heatmap, can also apply a mask over the input image to highlight the regions that most influenced the classification decision. This heatmap overlay provides visual interpretability, making it easier to identify the areas the model found most significant in its prediction.
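The following Keras sketch is consistent with the architecture described above; the exact number of convolutional blocks, the augmentation parameters, and the optimizer settings are assumptions, as they are not fully specified in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 4

model = tf.keras.Sequential([
    layers.Input(shape=(256, 256, 3)),
    # Preprocessing: normalize pixel values to [0, 1].
    layers.Rescaling(1.0 / 255),
    # Data augmentation: random flips and rotations (active only in training).
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.2),
    # Feature learning: 3x3 convolutions with ReLU, 64 then 128 filters,
    # each followed by max-pooling to down-sample the feature maps.
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classification head.
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```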
3. Results
The architecture depicted in Figure 1 represents a Convolutional Neural Network (CNN) designed for image classification, structured into two main stages: feature learning and classification. Starting with the input layer, the model receives an image, such as an MRI scan, which is then passed through successive layers that extract the information relevant for classification. This input initiates the flow of data, enabling the network to analyze pixel values and recognize patterns within the image. We implemented the network as a sequential model consisting of four primary layers in Python using the TensorFlow and Keras libraries; this sequential structure stacks each layer one after the other, enabling a straightforward flow of data through the network.
After training the CNN model on the dataset, we evaluated its performance by examining the accuracy and loss graphs. These visualizations provide insight into how well the model has learned from the training data and how effectively it generalizes to the validation data over the course of multiple epochs.
Figure 2. Training and validation accuracy and loss curves for CNN model.
The graphs illustrate the accuracy and loss progression for both training and validation sets over 100 epochs, providing insight into the model's learning behavior and generalization ability. In the “Training and Validation Accuracy” graph (left), we observe a steady increase in accuracy for both the training and validation sets as the epochs progress. Initially, both training and validation accuracy improve rapidly, indicating that the model is effectively learning from the data and adjusting its parameters to minimize errors. After about 20 epochs, the accuracy values start to stabilize, with the training accuracy nearing 1.0 and the validation accuracy fluctuating close to this high range. The similar trends in both curves suggest that the model is learning well without significant overfitting, as the validation accuracy closely tracks the training accuracy.
The “Training and Validation Loss” graph (right) shows a corresponding decline in loss values for both training and validation sets. Initially, the training loss decreases rapidly, indicating that the model is quickly minimizing errors during early epochs. The validation loss, however, shows some fluctuations, especially at the beginning, which is common as the model adjusts to the new data. Over time, both training and validation loss values drop and stabilize, with validation loss fluctuating more than training loss. Despite these fluctuations, the overall downward trend in validation loss indicates that the model is generalizing relatively well to unseen data. Occasional spikes in validation loss may suggest that the model faces some challenges with certain samples but ultimately maintains a consistent learning trajectory. Together, these graphs demonstrate a successful training process with both high accuracy and low loss for the training and validation sets, indicating that the model has achieved a balanced performance without significant overfitting. This consistency between training and validation curves confirms that the model has learned the features in the dataset effectively and can generalize well to new data.
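For illustration, curves like those in Figure 2 can be reproduced from the Keras training history; the sketch below assumes the model and dataset splits defined in the earlier sketches.

```python
import matplotlib.pyplot as plt

# Train for 100 epochs, matching the epoch count reported above.
history = model.fit(train_ds, validation_data=val_ds, epochs=100)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history["accuracy"], label="Training Accuracy")
ax1.plot(history.history["val_accuracy"], label="Validation Accuracy")
ax1.set_title("Training and Validation Accuracy")
ax1.set_xlabel("Epoch")
ax1.legend()
ax2.plot(history.history["loss"], label="Training Loss")
ax2.plot(history.history["val_loss"], label="Validation Loss")
ax2.set_title("Training and Validation Loss")
ax2.set_xlabel("Epoch")
ax2.legend()
plt.show()
```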
Figure 3. Confusion matrix for tumor classification model.
This confusion matrix illustrates the performance of the classification model across the four categories: glioma tumor, meningioma tumor, normal, and pituitary tumor. The diagonal entries represent instances where the model correctly identified the class, demonstrating strong performance for each category. For example, the model accurately classified most glioma tumor images, with only minor misclassifications, primarily mislabeling some glioma cases as meningioma tumors. Similarly, meningioma tumor images were predominantly classified correctly, with only minimal misclassification. The normal class also shows a high degree of accuracy, though a small number of normal cases were predicted incorrectly as glioma or meningioma tumors, suggesting a slight overlap in features that might need further refinement in the model. Pituitary tumor predictions were also robust, with only a couple of cases incorrectly classified as meningioma tumors. This pattern suggests that while the model performs well overall, there are subtle misclassification trends, especially between certain tumor types, that could benefit from additional feature engineering or training data to improve differentiation.
The model's performance metrics highlight its effectiveness across various evaluation measures, offering a comprehensive view of its classification capabilities. Here, we examine each metric in detail:
Figure 4. Classification report and performance metrics for CNN model on brain MRI images.
The “overall accuracy” of the model is 0.97, meaning that it correctly classified 97% of all instances. This high accuracy suggests that the model performs well across the dataset, with only a small percentage of errors. However, accuracy alone may not provide a full picture, particularly in imbalanced datasets, so additional metrics like precision, recall, and F1-score offer more detailed insights.
“Precision (Weighted) and Recall (Weighted)” both score 0.97, indicating that the model has a balanced performance across classes. Precision measures the model’s ability to correctly identify positive instances while avoiding false positives. For example, if the model predicts a glioma tumor, there’s a 99% likelihood (precision) that this prediction is accurate. Recall, on the other hand, measures the model’s capacity to capture all relevant instances, minimizing false negatives. For instance, the recall for meningioma tumors is 1.00, indicating that the model correctly identifies every instance of meningioma tumors in the test set without missing any.
“The F1-Score (Weighted)” is also 0.97, representing the harmonic mean of precision and recall and offering a balanced measure of accuracy. The F1-score is especially useful in cases where there is an uneven class distribution, as it balances false positives and false negatives effectively.
The Matthews Correlation Coefficient (MCC) is 0.96, reflecting a high correlation between observed and predicted classifications. The MCC value ranges from -1 to +1, with +1 indicating perfect prediction, 0 indicating performance no better than random guessing, and -1 indicating total disagreement. Here, the high MCC shows that the model is making consistent and reliable predictions across all classes.
“The ROC AUC Score” is 0.70, which is noticeably lower than the other metrics. In multi-class classification problems, the AUC can vary considerably across classes; this score indicates the model's ability to distinguish between classes, with values closer to 1.0 representing stronger discrimination. The lower value may be due to the nature of specific classes or imbalances in the data. The classification report further breaks down precision, recall, and F1-scores for each class. For glioma tumors, precision is 0.99, recall is 0.94, and F1-score is 0.97, demonstrating that the model performs well but occasionally misses some glioma instances. Meningioma tumors show high recall at 1.00, meaning all meningioma cases are correctly identified, though precision is slightly lower at 0.91. Normal cases have perfect precision and a recall of 0.94, indicating the model is cautious with normal cases and yields no false positives. Pituitary tumors have the highest overall performance, with perfect precision, a recall of 0.98, and an F1-score of 0.99, demonstrating the model's robust detection ability for this class.
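For illustration, these metrics can be computed with scikit-learn from the model's test-set predictions; the sketch below assumes the model and test_ds objects from the earlier sketches.

```python
import numpy as np
from sklearn.metrics import (classification_report, confusion_matrix,
                             matthews_corrcoef, roc_auc_score)

# Gather ground-truth labels and predicted class probabilities over the
# held-out test split.
y_true, y_prob = [], []
for images, labels in test_ds:
    y_true.append(labels.numpy())
    y_prob.append(model.predict(images, verbose=0))
y_true = np.concatenate(y_true)
y_prob = np.concatenate(y_prob)
y_pred = y_prob.argmax(axis=1)

class_names = ["glioma_tumor", "meningioma_tumor", "normal", "pituitary_tumor"]
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=class_names))
print("MCC:", matthews_corrcoef(y_true, y_pred))
# Multi-class ROC AUC (one-vs-rest) is computed from class probabilities.
print("ROC AUC:", roc_auc_score(y_true, y_prob, multi_class="ovr"))
```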
Figure 5. Example MRI predictions: Actual vs. predicted tumor types with model confidence levels.
The displayed output consists of nine MRI images, each annotated with the actual class label, the predicted class label, and the model’s confidence in that prediction. This layout provides insight into the CNN model's ability to classify various types of brain conditions and normal tissue. In most instances, the model's predictions are highly accurate, as indicated by the high confidence levels, often exceeding 98%. For instance, in the first row, images of glioma tumor, meningioma tumor, and pituitary tumor are correctly classified with confidence scores around or above 99%, demonstrating the model's robust differentiation between these tumor types. This high accuracy suggests that the model has effectively learned distinguishing features specific to each category, allowing it to make reliable predictions when presented with images like those in its training set. In the second row, two of the three images are also accurately classified, with confidence levels near 99% for both pituitary tumor and normal tissue predictions. The middle image in this row shows an example of meningioma tumor predicted with a confidence of 98.36%. This slight drop in confidence may indicate a lower certainty in identifying specific features within this image, though the model's overall performance remains strong. The third row further reinforces the model's accuracy. Both glioma and meningioma tumors are accurately classified with high confidence levels (above 98%). However, the image on the far right, which represents a glioma tumor, is predicted correctly but with a lower confidence of 72.32%. This lower confidence could suggest that the image has characteristics that might overlap with features of other classes, challenging the model's ability to clearly classify it with the same certainty as others. This case highlights a potential area for improvement, as it suggests that certain glioma images may benefit from additional feature refinement in the training data.
Figure 6. Axial brain MRI of a lung cancer patient with no brain tumors.
The CNN model was developed to classify MRI images, providing valuable assistance in identifying brain conditions based on distinct visual patterns in the scans. The trained model is stored as "model.keras" and deployed via a FastAPI server, enabling real-time interaction through a RESTful API. This infrastructure allows users to upload MRI images for analysis, with the system processing each request and returning predictions efficiently.
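A minimal sketch of such a FastAPI endpoint is shown below; the /predict route name, response fields, and class-name list are illustrative assumptions rather than the exact deployed code.

```python
import io

import numpy as np
import tensorflow as tf
from fastapi import FastAPI, File, UploadFile
from PIL import Image

app = FastAPI()
model = tf.keras.models.load_model("model.keras")
CLASS_NAMES = ["glioma_tumor", "meningioma_tumor", "normal", "pituitary_tumor"]

@app.post("/predict")  # hypothetical route name
async def predict(file: UploadFile = File(...)):
    # Decode the uploaded MRI image and resize it to the model's input size.
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    image = image.resize((256, 256))
    batch = np.expand_dims(np.array(image), axis=0)
    probs = model.predict(batch)[0]
    idx = int(np.argmax(probs))
    return {"class": CLASS_NAMES[idx], "confidence": float(probs[idx])}
```

Served with, for example, uvicorn, FastAPI's asynchronous request handling keeps per-request latency low.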
Figure 7. API classification response example.
For testing and interaction with the FastAPI server, Postman—a widely used tool for API testing—is employed to post MRI images to the API endpoint. Once an image is submitted, the FastAPI server routes it to the CNN model, which performs an analysis and outputs a prediction of the image’s class. The model's classification results include a confidence score, quantifying the certainty of its diagnosis. In the example provided, the model classified an MRI image as "normal" with a confidence level of 91.35%, indicating a high level of assurance in this assessment. The prediction process is swift, with each request typically taking approximately 3 to 5 seconds to complete. This quick response time is a key advantage of using FastAPI, as it supports asynchronous request handling, minimizing latency and optimizing performance. This system is particularly suited for clinical or research applications, where prompt results can aid in decision-making or expedite research workflows.
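The same request can also be issued programmatically rather than through Postman; a brief sketch, assuming the hypothetical /predict route from the server sketch above:

```python
import requests

# Post a local MRI image to the hypothetical /predict endpoint.
with open("scan.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/predict",
        files={"file": ("scan.jpg", f, "image/jpeg")},
    )
print(response.json())  # e.g. {"class": "normal", "confidence": 0.9135}
```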
After completing the prediction process, the convolutional neural network (CNN) model generates a heatmap visualization, providing an interpretable overlay for the classified MRI image. This heatmap highlights the areas within the MRI scan that contributed most significantly to the model's decision, effectively revealing the model’s focus and rationale in making its prediction. By emphasizing specific regions within the MRI image, the heatmap aids in understanding the model's interpretation of the visual data, which is particularly valuable in clinical and diagnostic contexts.
Figure 8. Heatmap of the predicted image.
The heatmap is subsequently masked with the original MRI image, combining both layers into a single output. This overlay technique enhances visual comprehension by aligning the highlighted features from the heatmap directly with the anatomical structures in the MRI scan. The combination of the original image and the heatmap visualization provides medical practitioners or researchers with a clearer view of which parts of the brain were instrumental in the model's classification, such as identifying normal or abnormal tissue patterns.
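A Grad-CAM-style sketch of how such a heatmap can be computed and masked onto the original image is shown below; the chosen convolutional layer name, the input variable image, and the blending weights are assumptions for illustration.

```python
import cv2
import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name):
    """Compute a Grad-CAM heatmap (values in [0, 1]) for a single image."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_idx = int(tf.argmax(preds[0]))
        score = preds[:, class_idx]
    grads = tape.gradient(score, conv_out)           # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))  # global-average-pool grads
    cam = tf.reduce_sum(conv_out[0] * weights, axis=-1)
    cam = tf.maximum(cam, 0) / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()

# `image` is a (256, 256, 3) array in [0, 255]; the layer name is an assumption.
heatmap = grad_cam(model, image.astype("float32"), "conv2d_2")
heatmap = cv2.resize(heatmap, (image.shape[1], image.shape[0]))
heatmap = cv2.applyColorMap(np.uint8(255 * heatmap), cv2.COLORMAP_JET)
# Mask the heatmap onto the original MRI image to produce the overlay.
overlay = cv2.addWeighted(image.astype(np.uint8), 0.6, heatmap, 0.4, 0)
```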
Figure 9. Heatmap mask of the predicted image.
By producing this layered output, the model assists in bridging the gap between complex neural network decisions and human interpretability. This masking process not only enhances transparency but also builds confidence in the model's predictions, as it visually confirms that the CNN is focusing on medically relevant regions within the MRI. Such interpretability methods are essential in healthcare applications, where understanding the underlying basis of AI-driven predictions is crucial for informed decision-making.
4. Discussion
The Convolutional Neural Network (CNN) model developed in this study demonstrates a promising approach for classifying brain MRI images into four distinct categories: glioma tumor, meningioma tumor, normal brain, and pituitary tumor. By leveraging a dataset of 3,097 MRI images with balanced representation across these categories, the model achieved high accuracy and robust classification performance. Notably, the use of data preprocessing techniques, including normalization and augmentation, proved effective in enhancing the model's generalizability, ensuring consistent performance across both training and validation datasets. These findings indicate that CNNs, when appropriately designed and trained, can offer valuable support in clinical diagnostic tasks by providing reliable predictions and reducing the time burden on radiologists and medical practitioners.
One significant advantage of this model lies in its ability to differentiate between distinct tumor types, each exhibiting unique visual characteristics in MRI scans. For example, the model's performance on glioma and pituitary tumors, known for their diffuse and location-specific features respectively, suggests that it has successfully learned critical patterns associated with these conditions. However, the model exhibited some minor misclassification trends, particularly between glioma and meningioma tumors, which may share overlapping imaging features. Addressing this challenge could involve enlarging the training dataset with additional cases or employing advanced feature extraction techniques to help the model delineate these subtle differences more accurately.
The use of Grad-CAM for interpretability adds a further layer of utility to this CNN model, as it provides insight into which regions of the MRI images contributed most to each classification decision. This transparency is crucial in healthcare settings, as it allows clinicians to validate the model's focus on medically relevant regions, thereby building trust in the AI-driven predictions.
The evaluation metrics, including precision, recall, and F1-score, reinforce the model's effectiveness in distinguishing between tumor types and normal brain tissue. The high precision and recall scores across classes indicate balanced performance without significant bias toward any specific category. The noticeably lower ROC AUC score may reflect the inherent complexity of multi-class classification problems, particularly in cases where certain features of different tumor types appear visually similar. Future iterations of this model could benefit from further refinement of the architecture, potentially incorporating transfer learning or additional optimization techniques to improve inter-class discrimination and enhance overall classification confidence. Moreover, the model's deployment via a FastAPI server enables seamless integration with external systems, facilitating real-time analysis and classification of MRI images. The efficient response times observed, typically 3 to 5 seconds per prediction, demonstrate the potential for this system to be implemented in a clinical setting, where prompt diagnostics can be critical. The use of Postman for API testing further validates the system's stability and usability, ensuring that healthcare professionals can reliably interact with the model for diagnostic assistance. This streamlined pipeline, from image submission to classification and heatmap generation, illustrates a potential pathway for integrating deep learning models into diagnostic workflows, supporting evidence-based decision-making.
Nevertheless, several limitations warrant consideration. First, while the dataset is diverse and representative, the model's performance could be further improved by expanding the dataset to include a wider range of MRI images across different imaging modalities, institutions, and populations. Such variability would help the model generalize better to different patient demographics and MRI protocols, thus increasing its applicability in diverse clinical environments. Additionally, the model's interpretability, while enhanced by Grad-CAM, could be further strengthened by exploring other explainability techniques, such as SHAP values, to provide more granular insights into the specific features influencing each prediction. Related work has shown that automatic segmentation of rodent brain tumors from magnetic resonance imaging can aid biomedical research, demonstrating the viability of AI-assisted segmentation as well as the possibility of fully automatic segmentation by AI [
14]. In comparing our CNN model for classification with the 3D U-Net model used for segmentation, each demonstrates strengths tailored to distinct objectives within brain MRI analysis. The 3D U-Net model specializes in tumor segmentation, precisely delineating tumor boundaries and measuring spatial volumes. This makes it particularly suitable for research settings where standardizing measurements across datasets is critical, such as monitoring tumor progression or assessing treatment effects. Furthermore, the 3D U-Net's resilience to noise and ability to reduce inter-observer variability reinforce its utility in studies requiring reproducible segmentation results. Conversely, our CNN model excels in classification by accurately categorizing MRI images into conditions like glioma, meningioma, or normal tissue. This classification approach is highly advantageous for clinical settings, where rapid, reliable diagnoses are essential. Additionally, interpretability features such as heatmaps in our model enable clinicians to visualize areas most relevant to the classification decision, adding transparency and aiding diagnostic confidence. Thus, while the 3D U-Net model is optimal for precise tumor delineation and volumetric assessment, our CNN model is better suited for diagnostic classification, offering a valuable and interpretable tool in clinical decision-making.