Background/Objectives: This study investigates the utility of vision transformers (ViTs) for predicting early response to stereotactic radiosurgery (SRS) in patients with brain metastases (BM), using minimally pre-processed MRI. Methods: We analyzed MRI scans from 19 patients with BM, focusing on axial fluid-attenuated inversion recovery (FLAIR) and high-resolution contrast-enhanced T1-weighted (CE T1w) sequences. Patients were classified as responders (complete or partial response) or non-responders (stable or progressive disease). Results: The ViT model predicted treatment response with an overall accuracy of 97%. Precision was 96% for progression and 97% for regression, and recall was 92% for progression and 99% for regression. The confusion matrix showed few misclassifications, and the area under the ROC curve was near-perfect (AUC = 0.99), indicating that the model separates responders from non-responders across a range of decision thresholds. Conclusions: These findings highlight the potential of the ViT model as a non-invasive predictive tool that could inform clinical decision-making and patient management for brain metastases. By improving early response prediction, the model may contribute to the development of personalized treatment strategies in oncology. Future research should validate these results in larger, more diverse cohorts, integrate additional data types, and refine the model to further enhance its utility in clinical practice.
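As an illustration of how the reported accuracy, precision, and recall relate to a confusion matrix, the following minimal Python sketch computes them for a two-class problem. The function name `per_class_metrics` and the counts in `cm` are hypothetical, chosen only to show the arithmetic; they are not the study's data, and the sketch is not the authors' evaluation code.

```python
def per_class_metrics(cm, labels):
    """Compute accuracy plus per-class precision and recall.

    cm[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    metrics = {"accuracy": correct / total}
    for k, name in enumerate(labels):
        true_positives = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        metrics[name] = {
            "precision": true_positives / predicted_k if predicted_k else 0.0,
            "recall": true_positives / actual_k if actual_k else 0.0,
        }
    return metrics


# Hypothetical counts for illustration only; NOT taken from the study.
labels = ["progression", "regression"]
cm = [
    [40, 10],   # true progression: 40 predicted progression, 10 regression
    [5, 145],   # true regression: 5 predicted progression, 145 regression
]

result = per_class_metrics(cm, labels)
for name in labels:
    print(name, result[name])
print("accuracy", result["accuracy"])
```

Precision for a class is the fraction of predictions of that class that are correct, whereas recall is the fraction of true members of that class that the model recovers; the two can differ noticeably when the classes are imbalanced.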