Version 1
Received: 15 May 2024 / Approved: 15 May 2024 / Online: 16 May 2024 (08:21:56 CEST)
How to cite:
Olaniran, O. R.; Alzahrani, A. R. R. Robustness of Bayesian Random Forest in High-Dimensional Analysis with Missing Data. Preprints 2024, 2024051022. https://doi.org/10.20944/preprints202405.1022.v1
APA Style
Olaniran, O. R., & Alzahrani, A. R. R. (2024). Robustness of Bayesian Random Forest in High-Dimensional Analysis with Missing Data. Preprints. https://doi.org/10.20944/preprints202405.1022.v1
Chicago/Turabian Style
Olaniran, O. R. and Ali Rashash R. Alzahrani. 2024. "Robustness of Bayesian Random Forest in High-Dimensional Analysis with Missing Data." Preprints. https://doi.org/10.20944/preprints202405.1022.v1
Abstract
Missing data force researchers to choose between imputing incomplete observations and discarding them, and discarding can lead to information loss. Available methods range from simple deletion to sophisticated approaches such as Multiple Imputation (MI). However, these methods often fall short with high-dimensional datasets. Multiple Imputation by Chained Equations (MICE) and Random Forest (RF) proximity imputation offer promising alternatives. In this paper, we propose integrating MICE with Bayesian Random Forest (BRF) to enhance imputation accuracy and predictive power, particularly in high-dimensional analyses. Our approach combines MICE’s efficiency with BRF’s robustness, offering a comprehensive solution to missing data challenges. We validate its effectiveness through empirical evaluations on synthetic data covering various missing-data scenarios. The simulation results show that combining BRF with MICE is a promising strategy for high-dimensional analysis in the presence of missing data.
Keywords
Robust Estimation; Missing data; Bayesian Random Forest; High-dimensional Analysis; Random Forest
Subject
Computer Science and Mathematics, Probability and Statistics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.