Version 1: Received: 14 January 2022 / Approved: 17 January 2022 / Online: 17 January 2022 (12:40:15 CET)
How to cite:
El Ibrahimi, S.; Hendricks, M. A.; Hallvik, S. E.; Dameshghi, N.; Hildebran, C.; Fischer, M. A.; Weiner, S. G. Enhancing Race and Ethnicity using Bayesian Imputation in an All Payer Claims Database. Preprints 2022, 2022010227. https://doi.org/10.20944/preprints202201.0227.v1
APA Style
El Ibrahimi, S., Hendricks, M. A., Hallvik, S. E., Dameshghi, N., Hildebran, C., Fischer, M. A., & Weiner, S. G. (2022). Enhancing Race and Ethnicity using Bayesian Imputation in an All Payer Claims Database. Preprints. https://doi.org/10.20944/preprints202201.0227.v1
Chicago/Turabian Style
El Ibrahimi, S., Michael A. Fischer and Scott G. Weiner. 2022. "Enhancing Race and Ethnicity using Bayesian Imputation in an All Payer Claims Database." Preprints. https://doi.org/10.20944/preprints202201.0227.v1
Abstract
Background: All Payer Claims Databases (APCDs) are a rich source of health information; however, race and ethnicity (R&E) data are largely missing. Bayesian Improved Surname Geocoding (BISG) is a common R&E imputation method, yet validation of BISG in APCDs is lacking. We used BISG to impute missing R&E in the Oregon APCD. Methods: BISG-imputed R&E for Asian Pacific Islander (API), Black, Hispanic, and White individuals was compared with the gold standard (vital statistics), and improvements in sensitivity and specificity were assessed. Logistic regression examined whether missing R&E was random across patient characteristics. Results: Among 85,857 individuals in the study, 32.1% (n=27,594) had missing R&E. Missing R&E was not randomly distributed: the odds of missingness were higher among males, Whites, those aged 65 and older, and commercially insured individuals. Differences in the percentage missing were also found by comorbid conditions and causes of death. Imputing the missing R&E with the BISG method improved the sensitivity for identifying White, Black, API, and Hispanic individuals. Conclusions: APCDs can benefit from enhancing missing R&E with BISG imputation, which supports more robust population-level health analyses and the identification of inequities by R&E without losing statistical power or dropping records whose R&E data are missing non-randomly.
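As a rough illustration of the two computations the abstract refers to, the sketch below shows a BISG-style Bayes update (a surname-based prior multiplied by a geography likelihood, then normalized) and a per-group sensitivity and specificity calculation against a reference label. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the dictionary layout, and any probabilities passed in are hypothetical, and the study's actual surname lists, geographic units, and classification rule are not given in this abstract.

```python
import math


def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine a surname prior with a geography likelihood via Bayes' rule.

    Both arguments are dicts keyed by group label (hypothetical layout):
      p_race_given_surname[r] = P(group r | surname)
      p_geo_given_race[r]     = P(living in this area | group r)
    Returns P(group r | surname, area) for every group r.
    """
    unnormalized = {
        r: p_race_given_surname[r] * p_geo_given_race[r]
        for r in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all joint probabilities are zero; cannot normalize")
    return {r: v / total for r, v in unnormalized.items()}


def sensitivity_specificity(truth, predicted, group):
    """Sensitivity and specificity for one group, treated as one-vs-rest.

    truth and predicted are equal-length sequences of group labels;
    truth plays the role of the gold standard.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == group and p == group)
    fn = sum(1 for t, p in pairs if t == group and p != group)
    fp = sum(1 for t, p in pairs if t != group and p == group)
    tn = sum(1 for t, p in pairs if t != group and p != group)
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity
```

One simple way to turn the posterior into a single imputed label is to take the group with the highest probability, for example `max(posterior, key=posterior.get)`; whether the study used that rule or a probability threshold is not stated here.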
Keywords
Bayesian inference; race and ethnicity imputation; All Payer Claims Database; vital statistics death records; validation
Subject
Social Sciences, Behavior Sciences
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.