Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Is the validity of logistic regression models developed with a medico-administrative database inferior to models developed from clinical databases?

Version 1 : Received: 2 October 2023 / Approved: 2 October 2023 / Online: 3 October 2023 (08:23:00 CEST)

A peer-reviewed article of this Preprint also exists.

Bernard, A.; Cottenet, J.; Quantin, C. Is the Validity of Logistic Regression Models Developed with a National Hospital Database Inferior to Models Developed from Clinical Databases to Analyze Surgical Lung Cancers? Cancers 2024, 16, 734.

Abstract

In medico-administrative databases, certain prognostic factors cannot be taken into account. The main objective was to estimate the performance of two models, each developed from a different database: the Epithor clinical database and a medico-administrative database. For each of the two databases, we randomly sampled a development dataset with 70% of the data and a validation dataset with the remaining 30%. Model performance was assessed by the Brier score, the area under the receiver operating characteristic curve (AUC ROC) and the calibration of the model. For the Epithor and medico-administrative databases, the development dataset included 10,516 patients (with 227 (2.16%) and 283 (2.7%) deaths, respectively) and the validation dataset included 4,507 patients (with 93 (2%) and 119 (2.64%) deaths, respectively). Fifteen predictors were selected in the models (including FEV1, Body Mass Index, ASA score and TNM stage for Epithor). The Brier score values were similar in the models of the two databases. For the validation data, the AUC ROC was 0.73 [0.68-0.78] for Epithor and 0.8 [0.76-0.84] for the medico-administrative database. The slope of the calibration plot was less than 1 for both databases. This work shows the good performance of a model developed from a medico-administrative database, despite the absence of clinical variables used in practice by surgeons, such as FEV1, ASA score or TNM stage.
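The evaluation steps named in the abstract (a random 70%/30% split, the Brier score, the AUC ROC and the calibration slope) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed random seed and the Newton-Raphson fit of the calibration slope are assumptions chosen for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Randomly split rows into a 70% development set and a 30% validation set.
    The seed is a hypothetical choice made only for reproducibility."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def brier_score(y, p):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auc_roc(y, p):
    """AUC as the probability that a randomly chosen event receives a higher
    predicted risk than a randomly chosen non-event (ties count as 1/2).
    Quadratic in sample size, so suited to small illustrations only."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def calibration_slope(y, p, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson and return (a, b).
    A slope b below 1 suggests predictions that are too extreme.
    Assumes every predicted probability lies strictly between 0 and 1."""
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # gradient with respect to a
            g1 += (yi - mu) * xi     # gradient with respect to b
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b
```

In use, a logistic regression model would be fitted on the development set, its predicted probabilities computed for the validation set, and the three functions applied to the validation outcomes and predictions. A split of 15,023 records with this sketch yields 10,516 and 4,507 records, the same sizes as reported above.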

Keywords

Model performance; medico-administrative database; clinical database; Brier score; area under the receiver operating characteristic; discrimination; calibration

Subject

Medicine and Pharmacology, Epidemiology and Infectious Diseases
