Development and validation of a prediction model for 28-day in-hospital mortality in critically ill patients with COVID-19

Paloma Ferrando-Vivas (0000-0002-2163-645X), Doug W Gould (0000-0003-4148-3312), James C Doidge (0000-0002-3674-3100), Karen Thomas (0000-0001-7548-4466), Paul Mouncey (0000-0002-8510-8517), Manu Shankar-Hari (0000-0002-5338-2538), J Duncan Young (0000-0002-6838-4835), Kathryn M Rowan (0000-0001-8217-5602), David A Harrison (0000-0002-9002-9098), on behalf of the ICNARC COVID-19 Team†

†Group members listed in Acknowledgments

Intensive Care National Audit & Research Centre (ICNARC), 24 High Holborn, London WC1V 6AZ, United Kingdom: Paloma Ferrando-Vivas, Statistician; Doug W Gould, Senior Researcher; James C Doidge, Senior Statistician; Karen Thomas, Senior Statistician; Paul Mouncey, Head of Research; Kathryn M Rowan, Director; David A Harrison, Head Statistician

Intensive Care Unit, Guy’s and St Thomas’ NHS Foundation Trust, and School of Immunology & Microbial Sciences, King’s College London, London, United Kingdom: Manu Shankar-Hari, NIHR Clinician Scientist

Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK: J Duncan Young, Professor of Intensive Care Medicine

Correspondence to: Professor David Harrison, david.harrison@icnarc.org

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 February 2021 | doi:10.20944/preprints202102.0059.v1


Introduction
Numerous statistical models have been developed for patients with COVID-19, including both diagnostic models to identify likely COVID-19 and prognostic models to predict outcomes including mortality, disease progression, ventilation and length of hospital stay.(1) However, substantial limitations have been identified in the development, validation and reporting of these models, and none of the models to date have focussed on the most severely ill patients: those admitted to critical care units. There is a long history of prediction models in adult critical care. These models take information from early in the patient's critical illness and make a prediction about the patient's likely outcome. These predictions can then be used to support risk-adjusted analyses and to monitor changing outcomes over time, or pooled to compare observed and expected outcomes for critical care providers.(2) In the UK, the Intensive Care National Audit and Research Centre (ICNARC) co-ordinates the Case Mix Programme, the national clinical audit of adult critical care.

Selection of data
For model development, data were extracted from the Case Mix Programme database for all patients in England, Wales and Northern Ireland admitted to critical care between 1 March 2020 and 30 April 2020 with a diagnosis of COVID-19 confirmed either at, or after, the start of critical care. Initial model development was undertaken on a dataset locked for analysis on 21 July 2020. The final model was re-estimated using an updated dataset locked on 8 January 2021. Patients with a duration of critical care of less than 24 hours were excluded, as this was the timeframe over which potential predictors were measured. We also excluded patients transferred to another critical care unit within 28 days of admission who had an unknown final critical care outcome. For patients with multiple critical care admissions, only the first admission was included.
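The selection criteria above can be expressed as a simple cohort filter. The following sketch uses hypothetical column names (`admission_date`, `critical_care_hours`, `transferred_within_28d`, `final_outcome_known`, `patient_id`), not fields from the actual Case Mix Programme dataset:

```python
import pandas as pd

def select_development_cohort(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the stated inclusion/exclusion criteria (illustrative only)."""
    # Admissions within the development window (ISO dates compare lexically)
    cohort = df[
        (df["admission_date"] >= "2020-03-01")
        & (df["admission_date"] <= "2020-04-30")
    ]
    # Exclude stays shorter than the 24-hour predictor window
    cohort = cohort[cohort["critical_care_hours"] >= 24]
    # Exclude transfers with an unknown final critical care outcome
    cohort = cohort[
        ~(cohort["transferred_within_28d"] & ~cohort["final_outcome_known"])
    ]
    # Keep only the first admission per patient
    cohort = cohort.sort_values("admission_date").drop_duplicates(
        "patient_id", keep="first"
    )
    return cohort
```

The same function, with the date window changed, would produce the temporal validation cohort.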
Models were validated using data for patients admitted between 1 May and 31 August 2020 (locked for analysis on 8 January 2021). The same inclusion and exclusion criteria were applied.

Outcome and potential predictors
The primary outcome was in-hospital mortality at 28 days following the start of critical care (28-day in-hospital mortality). Patients discharged to a non-acute setting prior to 28 days were considered to have survived. The selection of 28 days was informed by analysis of longer-term outcomes showing that the majority of hospital deaths in critically ill patients with COVID-19 occurred by 28 days following the start of critical care. (4) Potential predictors were selected, a priori, based on: established relationships with outcome for critically ill patients; emerging information from the COVID-19 pandemic, including evaluation of prognostic factors using the Case Mix Programme database (5); and availability within the Case Mix Programme dataset (eTable 1).
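The outcome definition can be illustrated with a small helper. The function and its arguments are hypothetical, not the study's actual derivation code; it encodes the stated rule that discharge to a non-acute setting before day 28 counts as survival:

```python
from datetime import date
from typing import Optional

def mortality_28d(cc_start: date,
                  death_date: Optional[date],
                  nonacute_discharge: Optional[date]) -> int:
    """28-day in-hospital mortality flag (1 = died, 0 = survived)."""
    window = 28  # days after the start of critical care
    # Discharged to a non-acute setting before day 28: treated as survived
    if (nonacute_discharge is not None
            and (nonacute_discharge - cc_start).days < window):
        return 0
    # In-hospital death within 28 days of the start of critical care
    if death_date is not None and (death_date - cc_start).days <= window:
        return 1
    return 0
```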
All physiological and laboratory variables were assessed as the most extreme values within the first 24 hours of critical care. Where patients were transferred or readmitted to critical care, physiological and laboratory variables were included from the first 24 hours of the first admission only. To avoid adjusting out differences in patient outcome related to treatment, variables such as organ support received were not assessed for inclusion in the models. Missing data were handled with multiple imputation (see Supplementary Material for detail).
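Extracting the most extreme first-24-hour value per patient can be sketched as follows, assuming a hypothetical long-format table of measurements already restricted to the first 24 hours of the first admission:

```python
import pandas as pd

# Hypothetical measurements: one row per observation
obs = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "variable":   ["heart_rate", "heart_rate", "lactate",
                   "heart_rate", "lactate"],
    "value":      [95, 130, 2.1, 88, 4.7],
})

# "Most extreme" here means the highest value; variables where the
# lowest value is used (e.g. platelet count) would use .min() instead.
worst = obs.groupby(["patient_id", "variable"])["value"].max().unstack()
```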

Model development and validation
Because outcomes for critically ill patients with COVID-19 improved over time,(6) which may reflect changes in treatment or clinical management, we developed two versions of the model: one incorporating calendar time as a predictor and one not. The models were validated in both the development and temporal validation cohorts.
When validating in the temporal validation cohort, calendar time was set to 30 April. Discrimination was assessed with the c index (area under the receiver operating characteristic curve), accuracy was assessed with the Brier score, and calibration was assessed graphically and using Cox calibration regression.

Patient and Public Involvement
We did not directly include patient and public involvement (PPI) in this evaluation, but the database used in the study was developed with PPI input, and ICNARC is overseen by a Board of Management (Trustees) that includes patient and public representatives.

Results
Patients were included in model development as shown in Figure 1. Patient characteristics are presented in Table 1 and eTable 4. There were 3,464 (40.0%) in-hospital deaths by 28 days following the start of critical care.
The model development steps are summarised in eTable 5 and full model coefficients are provided in eTable 6. The final models included the following predictors: age; quintile of deprivation; ethnicity; body mass index; any dependency prior to hospital admission; any severe condition in the past medical history; highest heart rate; highest respiratory rate; PaO2/FiO2 ratio from the arterial blood gas with the lowest PaO2; highest blood lactate concentration; highest serum creatinine concentration; highest serum urea concentration; neutrophil count associated with the lowest white blood count; and lowest platelet count.

Model validation
ICNARC received data for 2,521 admissions of 1,853 patients with COVID-19 admitted to 229 of the 287 participating critical care units between 1 May 2020 and 31 August 2020 (again, all remaining units confirmed that no patients were admitted with COVID-19). After excluding patients with a duration of critical care of less than 24 hours (n=118, 6.3%), a total of 1,735 patients were included in the temporal validation (Figure 1, Table 1 and eTable 4). Overall, 28-day in-hospital mortality in the validation cohort was substantially lower than in the development cohort (30.3% vs 40.0%). Validation results are presented in Table 2 and calibration plots in Figure 2. Discrimination was well maintained in the temporal validation, with a c index of 0.78. However, both models overpredicted mortality in the validation cohort. Unsurprisingly, this discrepancy was greater for the model not incorporating calendar time.

Discussion
We have developed prediction models for 28-day in-hospital mortality among critically ill patients with COVID-19, with and without adjustment for time trend. The ability of the models to discriminate between survivors and non-survivors remained similar when the models were validated using data from a later time period, but the models were poorly calibrated, reflecting improvements in outcomes over the time period studied.
The major strength of this work is the large, national database, with 100% coverage of general critical care units providing Level 3 (intensive) care. Harnessing ongoing, routine data collection meant that established systems were already in place with trained data collectors following existing definitions. The disadvantage of this approach was that only routinely recorded fields were available, meaning that some variables that have been found to be useful predictors of outcomes for hospitalised patients with COVID-19 in other studies, for example C reactive protein and lymphocyte count,(1, 7) were not available for inclusion in this model. Despite this, the model demonstrated discrimination comparable with some of the most extensively validated models for hospitalised patients. (1,8,9) The poor calibration in the temporal validation data was to be expected given previously reported improving outcomes for critically ill patients with COVID-19 over the course of the first epidemic wave. (6) While this limits the scope for applying the models to prediction of future patient outcomes, the models do provide baseline predictions to support monitoring of changes in risk-adjusted outcome over time.
Nesting the models within ongoing data collection will allow further recalibration and further development to take place. Our focus was a service evaluation of critical care in the UK, and the models require evaluation and recalibration for use in other settings.
Data sharing: Requests for accessing data from the Case Mix Programme are subject to approval by an independent Data Access Advisory Group (see https://www.icnarc.org/Our-Audit/Cmp/Reports/Access-Our-Data for more details).
Requests should be submitted to the corresponding author in the first instance.

Figure 2. Model calibration: (A) internal validation, not incorporating calendar time; (B) internal validation, incorporating calendar time; (C) temporal validation, not incorporating calendar time; (D) temporal validation, incorporating calendar time
The points show the observed 28-day in-hospital mortality plotted against the predicted mortality in ten equal-sized groups by predicted mortality; the vertical lines through each point are 95% confidence intervals on the observed mortality; the dashed lines show locally weighted scatterplot smoothing (LOWESS) applied to the observed and predicted log odds of mortality; the grey diagonal line indicates perfect calibration.
