Preprint Article Version 1 This version is not peer-reviewed

Predicting Motor Insurance Claims Using Telematics Data—XGBoost vs. Logistic Regression

Version 1: Received: 9 May 2019 / Approved: 10 May 2019 / Online: 10 May 2019 (11:28:11 CEST)

How to cite: Pesantez-Narvaez, J.; Guillen, M.; Alcañiz, M. Predicting Motor Insurance Claims Using Telematics Data—XGBoost vs. Logistic Regression. Preprints 2019, 2019050122 (doi: 10.20944/preprints201905.0122.v1).

Abstract

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims vs. no claims can be used to identify the determinants of traffic accidents. We compare the relative performance of logistic regression and XGBoost for predicting the existence of accident claims using telematics data. The dataset contains information from an insurance company about individuals’ driving patterns, including total annual distance driven and the percentage of total distance driven in urban areas. Our findings show that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model, and it demands greater effort to interpret.

Subject Areas

dichotomous response; predictive model; tree boosting; GLM; machine learning
