Preprint
Article

This version is not peer-reviewed.

Predicting Motor Insurance Claims Using Telematics Data—XGBoost vs. Logistic Regression

A peer-reviewed article of this preprint also exists.

Submitted: 09 May 2019

Posted: 10 May 2019


Abstract
XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims vs. no claims can be used to identify the determinants of traffic accidents. We compare the relative performance of logistic regression and XGBoost for predicting the existence of accident claims using telematics data. The dataset contains information from an insurance company about individuals’ driving patterns, including total annual distance driven and the percentage of total distance driven in urban areas. Our findings show that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model, and it demands greater effort to interpret.
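To make the binary-response formulation concrete, the following is a minimal sketch of a logistic regression for claim vs. no claim, fitted by plain gradient descent using only the Python standard library. It is not the authors' code or data: the drivers are synthetic, the two covariates (standardized annual distance and standardized share of urban driving) only mirror the variables named in the abstract, and the "true" coefficients, learning rate, and function names are assumptions chosen for illustration. XGBoost is left out because it requires a third-party package.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """Probability of at least one claim for one driver's feature row."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


def log_loss(rows, labels, weights):
    """Mean negative log-likelihood of the binary response."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(rows, labels):
        p = min(max(predict_proba(weights, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(rows)


def fit_logistic(rows, labels, lr=0.1, epochs=300):
    """Batch gradient descent on the mean log-loss, starting from zero weights."""
    weights = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            for j, v in enumerate(x):
                grad[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Synthetic drivers (assumption): standardized annual distance and urban share.
rng = random.Random(0)
true_weights = [-1.0, 0.8, 0.5]  # hypothetical intercept and slopes
rows, labels = [], []
for _ in range(400):
    x = [1.0, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]  # leading 1.0 is the intercept term
    rows.append(x)
    labels.append(1 if rng.random() < predict_proba(true_weights, x) else 0)

baseline_loss = log_loss(rows, labels, [0.0, 0.0, 0.0])  # every driver scored at p = 0.5
weights = fit_logistic(rows, labels)
fitted_loss = log_loss(rows, labels, weights)

print("fitted coefficients:", [round(w, 3) for w in weights])
print("log-loss at zero weights:", round(baseline_loss, 4))
print("log-loss after fitting:", round(fitted_loss, 4))
```

Each fitted coefficient is the change in the log-odds of a claim per one-standard-deviation change in the covariate, which is the kind of direct reading that makes the logistic model easy to interpret.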
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
