Submitted: 01 April 2026
Posted: 01 April 2026
Abstract
Keywords:
1. Introduction


2. Methodology
2.1. IHME Dataset
2.2. ACS Dataset
2.3. CAMS Dataset
2.4. Livestock Data
| Source | Description | Period | Resolution | N |
|---|---|---|---|---|
| IHME | Life expectancy at birth | 2012–2019 | County-level | 1 |
| ACS 5-year | Socioeconomic and demographic indicators | 2012–2019 | County-level | 10 |
| CAMS/ERA5 | Atmospheric pollutants and meteorological variables | 2012–2019 | County-level (from 0.75°/0.25° grids) | 26 |
| FAO GLW | Livestock density by species | 2012–2019 (interpolated) | County-level (from ∼10 km grids) | 7 |
| Total features (excluding the IHME life expectancy outcome) | | | | 43 |
2.5. Data Processing
2.6. Modeling Approach
| Parameter | Optimal Value |
|---|---|
| n_estimators | 1457 |
| max_depth | 7 |
| learning_rate | 0.024 |
| subsample | 0.95 |
| colsample_bytree | 0.50 |
| reg_alpha | 0.08 |
| reg_lambda | 5.00 |
| min_child_weight | 15 |
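The parameter names in the table above match those of the scikit-learn wrapper in the `xgboost` Python package, so the tuned configuration can be written down as a minimal sketch. The regression objective, the `random_state` value, and the helper name `build_model` are illustrative assumptions, not details taken from the study.

```python
# Hyperparameter values copied from the table above.
OPTIMAL_PARAMS = {
    "n_estimators": 1457,
    "max_depth": 7,
    "learning_rate": 0.024,
    "subsample": 0.95,
    "colsample_bytree": 0.50,
    "reg_alpha": 0.08,
    "reg_lambda": 5.00,
    "min_child_weight": 15,
}


def build_model(params=None, random_state=0):
    """Return an unfitted XGBRegressor configured with the tabulated values.

    The squared-error objective and random_state are assumptions made
    for this sketch; the table does not specify them.
    """
    # Imported lazily so the parameter dict stays usable without xgboost.
    from xgboost import XGBRegressor

    chosen = dict(OPTIMAL_PARAMS if params is None else params)
    return XGBRegressor(
        objective="reg:squarederror",
        random_state=random_state,
        **chosen,
    )
```

With `xgboost` installed, `build_model().fit(X_train, y_train)` would train a regressor under these settings, where `X_train` and `y_train` stand for a hypothetical county-level feature matrix and life expectancy vector.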
2.7. Modeling Interpretability
3. Results
3.1. Model Performance


3.2. Feature Importance Analysis

3.3. Feature Ablation Study


| Feature Set | N Features | Train R² | Test R² | Train RMSE | Test RMSE | Test MAE |
|---|---|---|---|---|---|---|
| All Features | 43 | 0.989 | 0.854 | 0.26 | 0.97 | 0.73 |
| Top 20 | 20 | 0.940 | 0.834 | 0.62 | 1.03 | 0.79 |
| Top 10 | 10 | 0.949 | 0.797 | 0.57 | 1.14 | 0.86 |
| Top 5 | 5 | 0.806 | 0.754 | 1.12 | 1.26 | 0.96 |
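To make the ablation table above easier to read, the short sketch below derives two quantities from its reported values: the train-test R² gap for each feature set and the loss in test R² relative to the full 43-feature model. The helper name `summarize` and the choice of these two summaries are assumptions made for illustration; the numbers themselves come directly from the table.

```python
# (feature set, number of features, train R², test R²) from the table above.
ABLATION = [
    ("All Features", 43, 0.989, 0.854),
    ("Top 20", 20, 0.940, 0.834),
    ("Top 10", 10, 0.949, 0.797),
    ("Top 5", 5, 0.806, 0.754),
]


def summarize(rows):
    """Compute the train-test R² gap and the test R² drop vs. the first row."""
    full_test_r2 = rows[0][3]
    summary = []
    for name, n_features, train_r2, test_r2 in rows:
        summary.append(
            {
                "feature_set": name,
                "n_features": n_features,
                "gap": round(train_r2 - test_r2, 3),
                "test_r2_drop": round(full_test_r2 - test_r2, 3),
            }
        )
    return summary


SUMMARY = summarize(ABLATION)
for row in SUMMARY:
    print(row["feature_set"], row["gap"], row["test_r2_drop"])
```

On these figures, the Top 20 subset gives up 0.020 of test R² relative to the full model, while the Top 5 subset gives up 0.100 but shows the smallest train-test gap (0.052).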
4. Discussion
4.1. Formaldehyde: An Underappreciated Environmental Determinant
4.2. Socioeconomic Factors and Climate-Related Predictors

4.3. Limitations and Future Directions
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Variable Descriptions
| Variable Name | Description | Unit |
|---|---|---|
| Socioeconomic and Demographic (N=10) | | |
| Poverty Rate | Population below poverty line | % |
| Bachelor’s Degree or Higher (%) | Percentage with bachelor’s degree or higher | % |
| Disability Rate | Population with disability | % |
| Total Population | County population | count |
| Unemployment Rate | Labor force unemployed | % |
| White Population (%) | White population percentage | % |
| Hispanic Population (%) | Hispanic/Latino population percentage | % |
| Black Population (%) | Black/African American population percentage | % |
| Households with No Vehicle (%) | Households without vehicle | % |
| Single Mother Families (%) | Families headed by single mothers | % |
| Atmospheric and Meteorological (N=26) | | |
| Land-sea Mask | Land-sea boundary indicator | - |
| Mean Sea Level Pressure | Mean sea level pressure | Pa |
| Dust Aerosol (0.55–0.9 µm) Mixing Ratio | Fine dust aerosol mixing ratio | kg/kg |
| Dust Aerosol (0.9–20 µm) Mixing Ratio | Coarse dust aerosol mixing ratio | kg/kg |
| Hydrophilic Black Carbon Aerosol Mixing Ratio | Hydrophilic black carbon mixing ratio | kg/kg |
| Hydrophobic Black Carbon Aerosol Mixing Ratio | Hydrophobic black carbon mixing ratio | kg/kg |
| Hydrophobic Organic Matter Aerosol Mixing Ratio | Hydrophobic organic matter mixing ratio | kg/kg |
| Sea Salt Aerosol (0.5–5 µm) Mixing Ratio | Fine sea salt mixing ratio | kg/kg |
| Sea Salt Aerosol (5–20 µm) Mixing Ratio | Coarse sea salt mixing ratio | kg/kg |
| Sulphate Aerosol Mixing Ratio | Sulphate aerosol mixing ratio | kg/kg |
| Leaf Area Index, High Vegetation | High vegetation leaf area index | m²/m² |
| Leaf Area Index, Low Vegetation | Low vegetation leaf area index | m²/m² |
| Snow Depth | Mean snow depth | m |
| 10 m Wind Speed | Wind speed at 10 m height | m/s |
| Wet Bulb Temperature | Mean wet bulb temperature | K |
| FoT Carbon Monoxide Above 75th Percentile | Share of time with CO above the 75th percentile | % |
| FoT Ethane Above 75th Percentile | Share of time with ethane above the 75th percentile | % |
| FoT Formaldehyde Above 75th Percentile | Share of time with formaldehyde above the 75th percentile | % |
| FoT Hydroxyl Radical Above 75th Percentile | Share of time with OH above the 75th percentile | % |
| FoT Nitric Acid Above 75th Percentile | Share of time with HNO3 above the 75th percentile | % |
| FoT Nitrogen Dioxide Above 75th Percentile | Share of time with NO2 above the 75th percentile | % |
| FoT Nitrogen Monoxide Above 75th Percentile | Share of time with NO above the 75th percentile | % |
| FoT Ozone Above 75th Percentile | Share of time with O3 above the 75th percentile | % |
| FoT PM2.5 Above 75th Percentile | Share of time with PM2.5 above the 75th percentile | % |
| FoT Propane Above 75th Percentile | Share of time with propane above the 75th percentile | % |
| FoT Sulphur Dioxide Above 75th Percentile | Share of time with SO2 above the 75th percentile | % |
| Livestock Density (N=7) | | |
| Cattle | Cattle density | heads/km² |
| Chicken | Chicken density | heads/km² |
| Duck | Duck density | heads/km² |
| Goat | Goat density | heads/km² |
| Horse | Horse density | heads/km² |
| Pig | Pig density | heads/km² |
| Sheep | Sheep density | heads/km² |
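The FoT rows in the table above report, in percent, how much of the time a pollutant exceeds a 75th-percentile threshold. The table does not state how that threshold is defined, so the sketch below is only one plausible reading: it assumes the threshold is the 75th percentile of a pooled reference distribution and that a county's value is the share of its time steps strictly above it. The function names and all sample numbers are hypothetical.

```python
from statistics import quantiles


def percentile_75(reference_values):
    """75th percentile of the reference distribution (inclusive method)."""
    return quantiles(reference_values, n=4, method="inclusive")[2]


def fraction_of_time_above(series, threshold):
    """Percentage of time steps in `series` strictly above `threshold`."""
    if not series:
        raise ValueError("series must contain at least one time step")
    return 100.0 * sum(value > threshold for value in series) / len(series)


# Hypothetical pooled reference concentrations (arbitrary units).
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
threshold = percentile_75(reference)

# Hypothetical time series for a single county.
county_series = [5.0, 7.0, 9.0, 6.0]
fot_percent = fraction_of_time_above(county_series, threshold)
print(threshold, fot_percent)  # 6.25 50.0
```

A different reference distribution (for example, each county's own history) would yield a different threshold and therefore different feature values, which is why the assumption matters for interpretation.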
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).