Preprint, Article, Version 1. Preserved in Portico. This version is not peer-reviewed.
Determination of Optimal Spatial Sample Sizes for Fitting Negative Binomial-Based Crash Prediction Models with Consideration of Statistical Modeling Assumptions
Koloushani, M.; Abazari, S.R.; Vanli, O.A.; Ozguven, E.E.; Moses, R.; Giroux, R.; Jacobs, B. Determination of Optimal Spatial Sample Sizes for Fitting Negative Binomial-Based Crash Prediction Models with Consideration of Statistical Modeling Assumptions. Sustainability 2023, 15, 14731.
Abstract
Transportation authorities aim to improve road safety by identifying high-risk locations and applying suitable safety measures. The Highway Safety Manual (HSM) is a vital resource for US transportation professionals, aiding in the creation of Safety Performance Functions (SPFs), which are predictive models for crashes. These models rely on Negative Binomial regression, and misinterpreting them when the underlying statistical assumptions are unmet can lead to erroneous conclusions, such as inaccurately assessed crash rates or missed high-risk sites. The Florida Department of Transportation (FDOT) has introduced context classifications to HSM SPFs, which complicates the identification of assumption violations. This study, part of an FDOT-sponsored project, investigates established statistical diagnostic tests for identifying model violations and proposes a novel approach to determine optimal spatial regions for Empirical Bayes adjustment. This adjustment aligns HSM-SPFs with regression assumptions. The approach is demonstrated in a case study of Florida roads. Results indicate that a 20-mile radius offers an optimal spatial sample size for modeling crashes of all injury levels while satisfying the statistical modeling assumptions. For severe injury crashes, which are less frequent and harder to predict, a 60-mile radius is suggested to fulfill the assumptions. This methodology guides FDOT practitioners in assessing whether HSM-SPFs conform to the intended assumptions and in determining appropriate region sizes.
Keywords
Crash Prediction Model; Safety Performance Function; Highway Safety Manual; Negative Binomial Regression; Model Diagnostic; Context Classification System
Subject
Engineering, Safety, Risk, Reliability and Quality
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.