Preprint
Article

This version is not peer-reviewed.

Quantifying System-Level Risk at Highway Rail-Grade Crossings: Integrating Spatial Autocorrelation and Explainable Machine Learning

Submitted: 11 May 2026

Posted: 12 May 2026


Abstract
Highway–rail grade crossing (HRGC) safety analysis is often based on raw incident counts or site-level models that neither control for exposure nor account for spatial dependence. This limits the ability to identify where risk is structurally concentrated across the rail network. The limitation matters because misidentifying high-risk environments leads to inefficient allocation of limited safety resources and weakens corridor-level intervention strategies. This study introduces accumulated incidents per crossing (AIPX), an exposure-normalized metric that measures cumulative incident burden at the county level over a 51-year period (1975–2025). It develops an algorithmic framework that integrates data reconciliation with spatial autocorrelation analysis, distributional modeling, and nonparametric machine learning to identify and interpret high-intensity risk environments. Global Moran’s I indicates statistically significant positive spatial autocorrelation (I = 0.359, p = 0.001), confirming that incident intensity is spatially clustered rather than random. Local indicators identify coherent high-intensity and low-intensity county clusters. Distributional analysis shows that AIPX in high-intensity clusters is heavy-tailed and best represented by lognormal and Johnson SU distributions, indicating that risk is concentrated in a small subset of counties. Machine learning models achieve strong classification performance (AUC ≈ 0.85), with explainability methods consistently identifying temperature, train direction, crossing warning configuration, train composition, and track class as the dominant associated features. These variables function as proxies for exposure intensity and network structure rather than as causal drivers. The findings demonstrate that HRGC risk is a regional, network-driven phenomenon concentrated along freight-intensive corridors. The study provides a transparent and transferable framework that supports corridor-level prioritization of safety interventions and more effective allocation of infrastructure investments.
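To make the two central quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that AIPX is the cumulative county incident count divided by the number of crossings in that county, and it computes global Moran's I with a simple permutation-based pseudo p-value. The function names (`aipx`, `morans_i`, `permutation_p_value`), the binary adjacency weights, and the toy four-county chain are all hypothetical and chosen only for illustration.

```python
import numpy as np


def aipx(incidents, crossings):
    """Accumulated incidents per crossing (assumed definition: incidents / crossings).

    Counties with zero crossings are returned as NaN instead of dividing by zero.
    """
    incidents = np.asarray(incidents, dtype=float)
    crossings = np.asarray(crossings, dtype=float)
    out = np.full_like(incidents, np.nan)
    np.divide(incidents, crossings, out=out, where=crossings > 0)
    return out


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the mean-centered values."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive autocorrelation via random relabeling."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    observed = morans_i(x, weights)
    sims = np.array([morans_i(rng.permutation(x), weights) for _ in range(n_perm)])
    p = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p


if __name__ == "__main__":
    # Hypothetical toy data: four counties arranged in a chain (0-1-2-3).
    W = np.array([
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ])
    intensity = aipx([10, 40, 90, 160], [10, 20, 30, 40])  # -> 1, 2, 3, 4
    i_obs, p = permutation_p_value(intensity, W)
    print(f"Moran's I = {i_obs:.3f}, pseudo p = {p:.3f}")
```

With 999 permutations the smallest attainable pseudo p-value under this formula is 0.001, so a reported p = 0.001 would be consistent with such a setup, although the number of permutations actually used in the study is not stated here.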
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated