Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

A High Resolution Spatiotemporal Fine Particulate Matter Exposure Assessment Model for the Contiguous United States

Version 1 : Received: 7 September 2021 / Approved: 8 September 2021 / Online: 8 September 2021 (21:00:47 CEST)

How to cite: Brokamp, C. A High Resolution Spatiotemporal Fine Particulate Matter Exposure Assessment Model for the Contiguous United States. Preprints 2021, 2021090164 (doi: 10.20944/preprints202109.0164.v1).

Abstract

Currently available nationwide prediction models for fine particulate matter (PM2.5) lack prediction confidence intervals and usually do not describe cross-validated (CV) model performance at different spatiotemporal resolutions and extents. We used 41 different spatiotemporal predictors, including data on land use, meteorology, aerosol optical density, emissions, wildfires, population, traffic, and spatiotemporal indicators, to train a machine learning model to predict daily average PM2.5 concentrations at 0.75 sq km resolution across the contiguous United States from 2000 through 2020. We used a generalized random forest model, which allowed us to generate asymptotically valid prediction confidence intervals, and took advantage of its usefulness as an ensemble learner to quickly and cheaply characterize leave-one-location-out (LOLO) CV model performance for different temporal resolutions and geographic regions. Using a variable importance metric, we selected 8 predictors that were able to accurately predict daily PM2.5, with an overall LOLO CV median absolute error (MAE) of 1.20 μg/m³, an R² of 0.84, and a confidence interval coverage fraction of 95%. When considering aggregated temporal windows, the model achieved LOLO CV MAEs of 0.99, 0.76, 0.63, and 0.60 μg/m³ for weekly, monthly, annual, and all-time exposure assessments, respectively. We further describe the model’s CV performance in different geographic regions of the United States, finding that it performs worse in the western half of the country, where there are fewer monitors. The code and data used to create this model are publicly available, and we have developed software packages to be used for exposure assessment.
This accurate exposure assessment model will be useful for epidemiologists seeking to study the health effects of PM2.5 across the contiguous United States, while possibly considering exposure estimation accuracy and uncertainty specific to the spatiotemporal resolution and extent of their study design and population.
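
To make the reported metrics concrete, the following is a minimal Python sketch of how a median absolute error at several temporal resolutions and a confidence interval coverage fraction could be computed from held-out predictions. It is not the author's code. The record layout (`monitor`, `date`, `obs`, `pred`, `lower`, `upper`), the function names, and the toy values are all hypothetical, and the sketch assumes that temporal aggregation means averaging observed and predicted concentrations within each monitor and window before taking the error.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median


def window_key(d, window):
    """Map a date to the temporal window it falls in (hypothetical helper)."""
    if window == "daily":
        return d
    if window == "weekly":
        iso = d.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if window == "monthly":
        return (d.year, d.month)
    if window == "annual":
        return d.year
    if window == "all":
        return None  # one window spanning every date for a monitor
    raise ValueError(f"unknown window: {window}")


def aggregated_mae(rows, window):
    """Median absolute error after averaging within each monitor and window.

    Assumption: observed and predicted values are averaged per
    (monitor, window) group first, then compared.
    """
    groups = defaultdict(list)
    for r in rows:
        key = (r["monitor"], window_key(r["date"], window))
        groups[key].append((r["obs"], r["pred"]))
    errors = [
        abs(mean(o for o, _ in g) - mean(p for _, p in g))
        for g in groups.values()
    ]
    return median(errors)


def coverage_fraction(rows):
    """Share of observations that fall inside their prediction interval."""
    covered = sum(r["lower"] <= r["obs"] <= r["upper"] for r in rows)
    return covered / len(rows)


# Toy held-out predictions for two monitors (made-up values, in μg/m³).
rows = [
    {"monitor": "A", "date": date(2020, 1, 6), "obs": 10.0, "pred": 11.0},
    {"monitor": "A", "date": date(2020, 1, 7), "obs": 12.0, "pred": 11.0},
    {"monitor": "B", "date": date(2020, 1, 6), "obs": 5.0, "pred": 6.0},
    {"monitor": "B", "date": date(2020, 1, 7), "obs": 20.0, "pred": 9.0},
]
for r in rows:
    # Hypothetical symmetric interval of +/- 3 around each prediction.
    r["lower"], r["upper"] = r["pred"] - 3.0, r["pred"] + 3.0

daily_mae = aggregated_mae(rows, "daily")
weekly_mae = aggregated_mae(rows, "weekly")
coverage = coverage_fraction(rows)
```

In a leave-one-location-out setting, each row's `pred`, `lower`, and `upper` would come from a model that never saw that monitor's data; the sketch takes those held-out values as given and covers only the scoring step.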

Keywords

fine particulate matter; exposure assessment; machine learning; spatiotemporal; high resolution
