Preprint Article, Version 2. Preserved in Portico. This version is not peer-reviewed.

A High Resolution Spatiotemporal Fine Particulate Matter Exposure Assessment Model for the Contiguous United States

Version 1 : Received: 7 September 2021 / Approved: 8 September 2021 / Online: 8 September 2021 (21:00:47 CEST)
Version 2 : Received: 14 December 2021 / Approved: 17 December 2021 / Online: 17 December 2021 (14:46:46 CET)

How to cite: Brokamp, C. A High Resolution Spatiotemporal Fine Particulate Matter Exposure Assessment Model for the Contiguous United States. Preprints 2021, 2021090164 (doi: 10.20944/preprints202109.0164.v2).

Abstract

Currently available nationwide prediction models for fine particulate matter (PM2.5) lack prediction confidence intervals and usually do not describe cross-validated model performance at different spatiotemporal resolutions and extents. We used 41 different spatiotemporal predictors, including data on land use, meteorology, aerosol optical density, emissions, wildfires, population, traffic, and spatiotemporal indicators, to train a machine learning model to predict daily average PM2.5 concentrations at 0.75 sq km resolution across the contiguous United States from 2000 through 2020. We used a generalized random forest model, which allowed us to generate asymptotically valid prediction confidence intervals, and took advantage of its structure as an ensemble learner to quickly and cheaply characterize leave-one-location-out cross-validated model performance for different temporal resolutions and geographic regions. Using a variable importance metric, we selected 8 predictors that were able to accurately predict daily PM2.5, with an overall leave-one-location-out cross-validated median absolute error of 1.20 µg/m³, an R² of 0.84, and a confidence interval coverage fraction of 95%. When considering aggregated temporal windows, the model achieved leave-one-location-out cross-validated median absolute errors of 0.99, 0.76, 0.63, and 0.60 µg/m³ for weekly, monthly, annual, and all-time exposure assessments, respectively. We further describe the model's cross-validated performance in different geographic regions of the United States, finding that it performs worse in the western half of the country, where there are fewer monitors. The code and data used to create this model are publicly available, and we have developed software packages to be used for exposure assessment. This accurate exposure assessment model will be useful for epidemiologists seeking to study the health effects of PM across the contiguous United States, while possibly considering exposure estimation accuracy and uncertainty specific to the spatiotemporal resolution and extent of their study design and population.
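
The performance measures named in the abstract (median absolute error, R², confidence interval coverage fraction, and error after aggregating to longer temporal windows) can be illustrated with a minimal sketch. This is not the author's code: the record layout, function names, and toy values below are all hypothetical, R² is computed here as the coefficient of determination (the paper may define it differently), and each prediction is simply assumed to come from a model that was trained without that monitoring site, as in leave-one-location-out cross validation.

```python
from collections import defaultdict
from statistics import mean, median


def median_absolute_error(obs, pred):
    """Median of |observed - predicted| over paired values."""
    return median(abs(o - p) for o, p in zip(obs, pred))


def r_squared(obs, pred):
    """Coefficient of determination (an assumed definition of R2)."""
    obs_mean = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def coverage_fraction(obs, lower, upper):
    """Share of observations that fall inside their prediction interval."""
    hits = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return hits / len(obs)


def aggregate_by_window(records, window_of):
    """Average observed and predicted values per (site, temporal window)."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["site"], window_of(r["day"]))].append(r)
    obs = [mean(x["obs"] for x in g) for g in groups.values()]
    pred = [mean(x["pred"] for x in g) for g in groups.values()]
    return obs, pred


# Hypothetical held-out predictions (ug/m3); "day" is a day index and
# "lower"/"upper" bound an assumed 95% prediction interval.
records = [
    {"site": "A", "day": 0, "obs": 10.0, "pred": 9.0, "lower": 7.0, "upper": 12.0},
    {"site": "A", "day": 1, "obs": 12.0, "pred": 13.0, "lower": 10.0, "upper": 15.0},
    {"site": "B", "day": 0, "obs": 5.0, "pred": 5.5, "lower": 4.0, "upper": 7.0},
    {"site": "B", "day": 1, "obs": 7.0, "pred": 9.5, "lower": 8.0, "upper": 11.0},
]

obs = [r["obs"] for r in records]
pred = [r["pred"] for r in records]

daily_mae = median_absolute_error(obs, pred)
daily_r2 = r_squared(obs, pred)
coverage = coverage_fraction(
    obs, [r["lower"] for r in records], [r["upper"] for r in records]
)

# Weekly windows: integer-divide the day index by 7, then score the averages.
weekly_obs, weekly_pred = aggregate_by_window(records, lambda day: day // 7)
weekly_mae = median_absolute_error(weekly_obs, weekly_pred)

print(daily_mae, daily_r2, coverage, weekly_mae)
```

Averaging before scoring is what makes the weekly, monthly, and annual errors in the abstract smaller than the daily error: day-level over- and under-predictions at a site partly cancel within each window.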

Supplementary and Associated Material

https://github.com/geomarker-io/st_pm_hex: repository of code used to generate the model

Keywords

fine particulate matter; exposure assessment; machine learning; spatiotemporal; high resolution

Subject

EARTH SCIENCES, Environmental Sciences

Comments (1)

Comment 1
Received: 17 December 2021
Commenter: Cole Brokamp
Commenter's Conflict of Interests: Author
Comment: Completed peer review.