Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

CCLR-DL: A Novel Statistics and Deep Learning Hybrid Method for Feature Selection and Forecasting Healthcare Demand

Version 1 : Received: 18 March 2024 / Approved: 19 March 2024 / Online: 22 March 2024 (07:31:07 CET)

How to cite: Hernández Guillamet, G.; López Seguí, F.; Vidal Alaball, J.; López Ibáñez, B. CCLR-DL: A Novel Statistics and Deep Learning Hybrid Method for Feature Selection and Forecasting Healthcare Demand. Preprints 2024, 2024031110. https://doi.org/10.20944/preprints202403.1110.v1

Abstract

Hybrid forecasting methods have emerged as a way to overcome the limitations of both statistical and deep learning approaches. Statistical methods make the significance of each variable explicit, but they often forecast less accurately than newer techniques. Deep learning models, in contrast, achieve better forecasting results but remain "black boxes" in terms of interpretability. This article introduces the Comprehensive Cross-Correlation and Lagged Linear Regression Deep Learning (CCLR-DL) framework, designed to harness the best of both approaches, enhancing forecasting accuracy while retaining model interpretability through a feature selection process. CCLR-DL blends cross-correlation analysis, lagged multiple linear regression and Granger causality procedures with deep learning architectures based on LSTM. In a practical demonstration, CCLR-DL was applied to a real database of clinical visits associated with diagnoses in Catalonia, Spain (tracking a population of 6.3 million patients over 10 years). Predicting visits enables healthcare managers to prepare for future shifts in demand. Results demonstrate a consistent and substantial improvement over standalone statistical and deep learning methods when predicting healthcare demand. The hybrid approach is therefore both effective and a promising way to balance predictive accuracy with model explainability. In this context, this work aims to design and validate a method for feature selection and forecasting of multivariate, high-dimensional time series datasets that improves not only prediction accuracy but also model transparency, by identifying a subset of variables that improve predictions and Granger-cause (G-cause) the target variable.
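To make the cross-correlation step more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It screens candidate series by their strongest lagged Pearson correlation with a target. The function names (`lagged_corr`, `best_lag`, `select_features`), the `max_lag` and `threshold` values, and the synthetic data are all assumptions made for illustration. The lagged regression, Granger causality tests and LSTM stages named in the abstract are not covered here.

```python
import numpy as np

def lagged_corr(x, y, lag):
    """Pearson correlation between x at time t - lag and y at time t."""
    if lag == 0:
        return float(np.corrcoef(x, y)[0, 1])
    return float(np.corrcoef(x[:-lag], y[lag:])[0, 1])

def best_lag(x, y, max_lag):
    """Return the (lag, correlation) pair with the largest absolute correlation."""
    scores = [(lag, lagged_corr(x, y, lag)) for lag in range(max_lag + 1)]
    return max(scores, key=lambda s: abs(s[1]))

def select_features(candidates, target, max_lag=6, threshold=0.5):
    """Keep candidates whose best lagged correlation reaches the threshold.

    Both max_lag and threshold are hypothetical defaults, not values from the paper.
    """
    selected = {}
    for name, series in candidates.items():
        lag, r = best_lag(series, target, max_lag)
        if abs(r) >= threshold:
            selected[name] = lag
    return selected

# Synthetic demonstration: the target follows "driver" with a delay of 3 steps,
# while "noise" is unrelated to the target.
rng = np.random.default_rng(0)
n = 200
driver = rng.normal(size=n)
noise = rng.normal(size=n)
target = np.empty(n)
target[:3] = rng.normal(size=3)
target[3:] = driver[:-3] + 0.1 * rng.normal(size=n - 3)

selected = select_features({"driver": driver, "noise": noise}, target)
print(selected)
```

In this toy setting only the delayed driver passes the screen, together with the lag at which it is most informative. A full pipeline in the spirit of the abstract would then pass the retained variables and lags to the regression, causality and LSTM stages.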

Keywords

Deep learning; explainability; feature selection; Granger causality; health demand modelling; LSTM; multivariate time-series forecasting

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
