Deep learning-based tumor segmentation achieves strong performance on benchmark datasets, yet models often degrade when deployed in new hospitals. This decline is largely driven by domain shift: differences in scanners, acquisition protocols, reconstruction settings, patient populations, and annotation styles. In high-stakes clinical workflows, such instability limits real adoption, because a model that performs well in one center may fail silently in another. This paper presents a methodological framework for domain-shift-robust segmentation in multi-hospital MRI and CT tumor imaging. The proposed design combines four complementary ingredients: strong segmentation backbones from the U-Net family; domain generalization via intensity-, style-, and frequency-based augmentation; self-supervised pretraining on unlabeled multi-site data; and optional label-free test-time adaptation for target hospitals. The manuscript emphasizes a deployment-oriented evaluation protocol that prioritizes worst-site reliability, boundary safety, calibration, and failure analysis rather than average Dice alone. We describe an experimental plan with leave-one-hospital-out validation, targeted ablations, uncertainty analysis, and stress tests under artifact corruption. The expected pattern is that self-supervised pretraining and frequency-aware augmentation narrow the gap between in-domain and out-of-domain performance, improve worst-site Dice, and reduce extreme boundary errors as measured by Hausdorff distance. The central argument is that robustness should be treated as a first-class objective in medical image segmentation, and that multi-center validation, transparent reporting, and clinically meaningful error analysis are prerequisites for deployment.
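To make the frequency-based augmentation concrete, the following is a minimal sketch of one common realization of the idea: swapping the low-frequency amplitude spectrum of a source slice with that of a slice from another site, while keeping the source phase (which carries anatomical structure). This is an assumed illustrative implementation, not the paper's exact method; the function name `fourier_amplitude_mix` and the window parameter `beta` are hypothetical.

```python
import numpy as np

def fourier_amplitude_mix(source, target, beta=0.1):
    """Blend site "style" by swapping low-frequency amplitudes.

    source, target: 2D float arrays (image slices from different sites).
    beta: fraction of the spectrum width treated as "low frequency".
    Returns a source-anatomy image with target-site intensity style.
    """
    fs = np.fft.fft2(source)
    ft = np.fft.fft2(target)
    amp_s, pha_s = np.abs(fs), np.angle(fs)
    amp_t = np.abs(ft)

    # Center the spectra so low frequencies sit in the middle,
    # then overwrite a small central window of source amplitudes.
    amp_s = np.fft.fftshift(amp_s)
    amp_t = np.fft.fftshift(amp_t)
    h, w = source.shape
    b = int(min(h, w) * beta / 2)
    ch, cw = h // 2, w // 2
    amp_s[ch - b:ch + b, cw - b:cw + b] = amp_t[ch - b:ch + b, cw - b:cw + b]
    amp_s = np.fft.ifftshift(amp_s)

    # Recombine swapped amplitude with the original phase.
    mixed = np.fft.ifft2(amp_s * np.exp(1j * pha_s))
    return np.real(mixed)
```

With `beta=0` the function is an identity transform, which makes the augmentation strength easy to anneal during training.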
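The worst-site reporting in the evaluation protocol can be sketched as follows: compute per-case Dice, group cases by held-out hospital, and report the per-site means together with the minimum over sites rather than the pooled average alone. The helper names (`dice`, `site_summary`) and the toy aggregation scheme are assumptions for illustration.

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice coefficient between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def site_summary(per_case_dice, site_ids):
    """Aggregate per-case Dice by site; expose worst-site performance.

    per_case_dice: list of Dice scores, one per test case.
    site_ids: parallel list of hospital identifiers.
    """
    sites = sorted(set(site_ids))
    per_site = {
        s: float(np.mean([d for d, sid in zip(per_case_dice, site_ids) if sid == s]))
        for s in sites
    }
    return {
        "per_site": per_site,
        "mean_over_sites": float(np.mean(list(per_site.values()))),
        "worst_site": min(per_site, key=per_site.get),
        "worst_site_dice": float(min(per_site.values())),
    }
```

Averaging per-site means (rather than pooling cases) keeps a large hospital from masking failures at a small one, and the `worst_site_dice` figure is the headline number the protocol prioritizes.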