Forecasting crash severity is critical for emergency response, infrastructure spending, and risk communication. Although machine learning has been widely applied to this problem, three shortcomings prevent its practical application: poorly calibrated probability scores, SHAP-based explanations whose faithfulness has not been verified, and models never tested in different regions. The proposed framework, termed SAE-XCrash (Safety-Aware and Explainable Crash Severity Prediction), addresses all three using two public datasets: US-Accidents (7.0 million records, 2016-2023) and UK STATS19 (approximately 1,010,000 records, 2016-2022). Notably, the US-Accidents severity label reflects traffic disruption duration, not injury outcome, and results should be interpreted accordingly. Previously undocumented label-schema drift motivated a revised binary target with Severity 4 as the sole positive class; strict temporal splits are used throughout. Five classifiers are compared. Post hoc isotonic calibration reduces expected calibration error by 97.3% with negligible loss of discrimination. A four-step SHAP audit confirms that explanations genuinely reflect model behavior: deletion-based per-budget faithfulness gaps exceed the 0.05 threshold at every feature budget (minimum gap = 0.066, p < 0.0001), although the aggregate trapezoidal AUC is borderline because of scale compression at AUPRC ≈ 0.13, and insertion gaps are statistically significant at more than ten percent of features. Explanation stability holds under conservative noise levels but degrades at realistic perturbation magnitudes, mainly in spatially sparse geohash cells. In a three-tier cross-dataset transfer experiment (zero-shot, recalibration, and full retraining), spatial memorization emerges as the main generalization barrier, while temporal features transfer smoothly between jurisdictions.
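The post hoc isotonic calibration step can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the gradient-boosted classifier, the synthetic imbalanced data (standing in for the rare Severity-4 positive class), the chronological-style splits, and the 10-bin ECE estimator are all assumptions made for the sketch.

```python
# Sketch of post hoc isotonic calibration with a binned ECE measurement.
# All data, models, and split sizes here are illustrative stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

def expected_calibration_error(y_true, p_pred, n_bins=10):
    """Binned ECE: bin-weight-averaged |observed positive rate - mean predicted prob|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (p_pred > lo) & (p_pred <= hi)
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - p_pred[mask].mean())
    return ece

# Synthetic imbalanced binary task (~13% positives, mimicking a rare severe class).
X, y = make_classification(n_samples=6000, n_features=20, weights=[0.87], random_state=0)
# shuffle=False mimics strict temporal (chronological) splitting.
X_tr, X_hold, y_tr, y_hold = train_test_split(X, y, test_size=0.5, shuffle=False)
X_cal, X_te, y_cal, y_te = train_test_split(X_hold, y_hold, test_size=0.5, shuffle=False)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Fit the isotonic map on a held-out calibration fold, never on training data.
p_cal = clf.predict_proba(X_cal)[:, 1]
iso = IsotonicRegression(out_of_bounds="clip").fit(p_cal, y_cal)

p_raw = clf.predict_proba(X_te)[:, 1]
p_iso = iso.predict(p_raw)
ece_raw = expected_calibration_error(y_te, p_raw)
ece_iso = expected_calibration_error(y_te, p_iso)
print(f"raw ECE={ece_raw:.4f}  isotonic ECE={ece_iso:.4f}")
```

Because isotonic regression is a monotone remapping of scores, the ranking of predictions is preserved up to ties, which is why calibration of this kind can sharply reduce ECE while leaving discrimination metrics such as AUC essentially unchanged.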