Submitted: 24 December 2024
Posted: 26 December 2024
Abstract
Interest in producing high-resolution maps of vaccination coverage has grown in recent years. These maps have helped uncover geographic inequities in coverage and improve the targeting of interventions to reach marginalized populations. Several methodological approaches have been developed for producing them, mostly using geolocated household survey data and geospatial covariate information. However, it remains unclear how much the predicted coverage maps produced by the various methods differ, and which methods yield more reliable estimates. To fill this gap, we explore the predictive performance of these methods and the resulting implications for spatial prioritization. Using the Nigeria Demographic and Health Survey as a case study, we generate 1 × 1 km and district-level maps of indicators of vaccination coverage using geostatistical, machine learning (ML) and hybrid methods, and evaluate predictive performance via cross-validation. Our results show similar predictive performance for five of the seven methods investigated, although two geostatistical approaches perform best. The worst-performing methods are two ML approaches. Although broad similarities exist, we find marked differences in spatial prioritization across methods, which could result in important underserved populations being missed. Our study can help guide map production for other health and development metrics.
Keywords:
1. Introduction
2. Methodology
2.1. Data
2.1.1. Vaccination Coverage Data
2.1.2. Geospatial Covariate and Population Data
2.2. Geostatistical and Machine Learning Modelling Approaches
2.2.1. Bayesian Geostatistical Regression Model (GEOS)
2.2.2. Bayesian Semiparametric Geostatistical Regression Model (SGEOS)
2.2.3. Generalized Additive Model (GAM)
2.2.4. Boosted Regression Model/Trees (BRT)
2.2.5. Least Absolute Shrinkage and Selection Operator (LASSO) Regression
2.2.6. Stacked Generalization Using a Bayesian Geostatistical Model (STG)
2.2.7. Artificial Neural Networks (ANN)
2.3. Uncertainty Estimation Using Delete-a-Block Jackknife Cross-Validation
2.4. Model Validation Using k-Fold Cross-Validation and Variogram Analysis
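The abstract describes comparing methods by their predictive performance under cross-validation. As an illustration only, the sketch below shows one way a k-fold comparison could be set up. It makes several assumptions that are not taken from the paper: the data are synthetic rather than survey clusters, there is a single covariate, RMSE is the only metric, and the two candidate models (`fit_mean`, `fit_linear`) are simple stand-ins for the geostatistical and ML methods listed above. All function names are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def cross_validate(fit, x, y, k=5, seed=0):
    """Return the out-of-sample RMSE of `fit`, pooled over k folds.

    `fit(x_train, y_train)` must return a function that maps a covariate
    value to a predicted coverage proportion.
    """
    obs, pred = [], []
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        obs.extend(y[i] for i in fold)
        pred.extend(predict(x[i]) for i in fold)
    return rmse(obs, pred)


def fit_mean(x_train, y_train):
    """Baseline stand-in: predict the training mean everywhere."""
    mean_y = sum(y_train) / len(y_train)
    return lambda x_new: mean_y


def fit_linear(x_train, y_train):
    """Stand-in model: ordinary least squares on a single covariate."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((xi - mx) ** 2 for xi in x_train)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x_train, y_train))
    slope = sxy / sxx
    return lambda x_new: my + slope * (x_new - mx)


# Synthetic stand-in for cluster-level coverage data (NOT the survey data).
rng = random.Random(42)
x = [rng.uniform(0, 1) for _ in range(200)]
y = [0.3 + 0.5 * xi + rng.gauss(0, 0.05) for xi in x]

# Every candidate is scored on the same folds so the comparison is like for like.
scores = {name: cross_validate(f, x, y)
          for name, f in [("mean", fit_mean), ("linear", fit_linear)]}
print(scores)
```

Reusing the same fold assignment for every candidate keeps the comparison fair. For spatial data, random folds can be optimistic because held-out locations often sit close to training locations, so a spatially blocked split may be preferable.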
3. Results
3.1. In- and Out-of-Sample Predictive Performance Using Cross-Validation and Variogram Analysis
3.3. Exploring Spatial Prioritization Using District Level Coverage Estimates
4. Discussion
Author Contributions
Funding
Ethical Approval
Data and Code Availability
Acknowledgements
Competing Interests
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).