Background/Objectives: Sepsis is responsible for approximately 270,000 deaths annually in the United States. Conventional scoring systems, such as SOFA and qSOFA, are largely reactive and do not effectively leverage longitudinal ICU data for early prediction. This study aimed to develop a deep learning framework capable of predicting sepsis onset up to 6 hours before Sepsis-3 criteria are met, while also providing clinically interpretable temporal explanations. Methods: The PhysioNet/CinC 2019 Challenge dataset, comprising 1,552,210 patient-hours from 40,336 ICU patients, was used. A Temporal Transformer Encoder (TTE) was trained using 12-hour look-back windows with 92 engineered features. Severe class imbalance (2.6% positive rate) was addressed through weighted random sampling and focal loss. Five-fold patient-level cross-validation was employed to prevent temporal leakage. Platt scaling was applied for probability calibration. Grad-CAM was adapted for temporal explainability, while SHAP was used for feature-level attribution. BiLSTM-Attention and XGBoost models served as baseline comparators. Results: The TTE model achieved a cross-validated AUROC of 0.8320 ± 0.0032 and an AUPRC of 0.1505 ± 0.0148, significantly outperforming BiLSTM-Attention (AUROC: 0.7859) and XGBoost (AUROC: 0.7731; DeLong p < 0.0001). Platt scaling reduced the Expected Calibration Error from 0.3154 to 0.0017. The median alert lead time was 46.5 hours (IQR: 21–84 h), with 95.3% of septic patients receiving alerts at least 3 hours before onset. Grad-CAM analysis identified timesteps t − 10 and t − 9 as the most predictive. However, high-severity patients (SOFA proxy ≥ 3) demonstrated substantially reduced performance (AUROC: 0.257). Conclusions: The proposed TTE framework demonstrates strong and well-calibrated early sepsis prediction with substantial clinical lead time.
The concentration of predictive signals 10–11 hours prior to alert generation supports the feasibility of continuous automated ICU monitoring from admission onward. Reduced performance in high-severity patients highlights the need for severity-stratified modelling in future research.
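
For readers unfamiliar with two of the quantities named in the Methods and Results, the following is a minimal illustrative sketch of a binary focal loss and of the Expected Calibration Error (ECE). It is not the study's implementation: the focusing parameter `gamma`, the class weight `alpha`, the number of bins `n_bins`, and the equal-width binning scheme are all assumptions chosen for illustration, since the abstract does not report them.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one predicted probability p and a label y in {0, 1}.

    gamma and alpha are illustrative defaults, not values from the study.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples,
    # so rare, hard positives contribute relatively more.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width bins (an assumed binning scheme)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)   # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(p for p, _ in b) / len(b)    # average predicted probability
        pos_rate = sum(y for _, y in b) / len(b)     # observed positive fraction
        ece += (len(b) / n) * abs(pos_rate - mean_conf)
    return ece
```

With `gamma=0` the focal term disappears and the loss reduces to class-weighted cross-entropy. An ECE near zero, such as the post-calibration value reported above, means the predicted probabilities closely match the observed positive rates within each bin.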