Introduction
In a world where data increasingly shape policies, planning, and resource allocation, significant gaps persist in our knowledge of basic human demographics. Central to these gaps is the lack of small area data on population counts, arguably the cornerstone of planning and decision-making for governmental, non-governmental, and private organisations [
1,
2,
3,
4,
5]. Vital registries (records of births, marriages and deaths) and the population and housing census — hereafter referred to as ‘census’ — are traditionally the most detailed demographic data sources. However, both face practical challenges — such as infrequent collection and resource demands — that can limit their capacity to provide timely, granular data. Censuses are typically conducted only every 10 years or more, with results often made publicly available only at coarse spatial resolution to protect the privacy of individuals. Moreover, conducting a census requires substantial human and financial resources, and ensuring full coverage — particularly in remote or marginalized areas — can be complex and resource-intensive [
6,
7]. Data-rich countries may turn to population registers that are updated with administrative data collected at service points, but these require robust administrative systems which are lacking in most low and lower-middle income countries [
8]. Urbanization, conflicts, migration and climate change are leading to rapid changes in population distributions and demographics, and thus the need for regularly updated demographic data is growing, prompting the exploration and development of alternative approaches to estimate and map populations at small area scales.
The 2030 Agenda for Sustainable Development [
9] underscores the urgent need for high-quality, timely, and disaggregated population data to monitor progress toward the Sustainable Development Goals (SDGs). Many SDG indicators—such as those on poverty, health, education, and gender equality—require population data as a denominator, disaggregated by sex, age, geographic location, and other dimensions of inequality. However, millions of people remain statistically invisible due to outdated censuses, undercounts, or a lack of data systems capable of reaching marginalized populations. In this context, developing complementary population estimation methods becomes not only a technical necessity, but also a human rights imperative. Enhancing statistical visibility at small area scales is fundamental to achieving the SDG principle of leaving no one behind.
Small area population estimation methods have increasingly been recognized as critical tools not only for addressing data gaps during intercensal periods, but also for strengthening census operations themselves. These models can serve to evaluate and adjust for coverage issues, inform cartographic updates, and support demographic reconciliation processes post-enumeration. Importantly, such methods should not be seen as replacements for a national census, but rather as a complementary source of demographic intelligence that can enhance the accuracy, equity, and responsiveness of statistical systems — particularly in settings affected by conflict, environmental hazards, or rapid population shifts. The integration of these approaches into national data systems requires technical capacity, institutional coordination, and political will.
In 2018, Wardrop et al. [
10] envisaged a census-independent ‘bottom-up’ estimation approach to produce recent, small area demographic estimates in countries where demographic data sources from censuses or registries were outdated, incomplete or unavailable. The ‘bottom-up’ approach aimed at generating high-resolution population estimates using datasets from small area population surveys — referred to as ‘microcensus’ data — and high-resolution geospatial covariates derived from satellite imagery and other sources. The envisaged statistical modelling methods aimed to estimate population counts and demographic characteristics at high spatial resolution across unsurveyed areas, together with measures of uncertainty. Due to the variability in available demographic data, with differences in geolocation accuracy, sample sizes and spatial coverage, bottom-up models were anticipated to require bespoke design and development.
Since Wardrop et al. was published, conflicts, resource limitations, political changes, natural disasters and the COVID-19 pandemic have resulted in census delays and cancellations in many countries, as well as increasing rates of undercounting in many censuses conducted in the 2020 round [
11,
12,
13]. Partly as a result, there has been increasing demand for alternative approaches for the provision of small area population estimates. Fuelled by this need, together with increased data availability and recent statistical modelling developments, the bottom-up approach has seen substantial advances in the past five years. These complementary estimation methods are also increasingly being explored in combination with traditional censuses to enhance coverage, support pre-enumeration planning, and improve responsiveness to dynamic demographic conditions.
Here we review the current state of census-independent bottom-up models for the production of small area population estimates. We explore their application to challenges concerning the estimation of population distributions and demographics, particularly in resource poor settings. We present an overview of the modelling frameworks that have been developed to date, the range of input demographic and geospatial datasets they utilise, and approaches to the validation of outputs. Finally, we discuss current challenges and future opportunities facing the field.
Approaches to Census-Independent Small Area Population Estimation
New data types and sources, together with demographic challenges that have required novel approaches to spatial coverage, data quality and human mobility, have accelerated the development of small area population estimation methods in recent years.
Figure 1 illustrates these different approaches and the data used in different scenarios: filling gaps in an existing census, providing new estimates in the absence of a recent census, addressing undercounts, and supporting census preparations, among other applications. In the following sections we review each component of Figure 1, providing examples throughout.
Demographic Datasets
Bottom-up demographic estimation methods rely on a training set of population data to predict demographic attributes, such as population counts and age and sex structures, across unsampled locations. In recent years, digitally collected data and GPS devices have become standard for survey teams, both for the census and household surveys and for those delivering health campaigns [
11]. This has resulted in significant improvements in attribute accuracy and geolocation, which makes for more straightforward integration of these datasets into modelling processes. These demographic datasets can have different levels of spatial completeness (
Figure 1), from purposefully designed microcensus surveys to partial censuses, household surveys and specific sectoral interventions, like health campaigns.
Wardrop et al. presented the use of microcensus surveys for population modelling in the absence of census observations, offering the advantage of rapid data collection [
10,
14,
15,
16,
17]. Microcensus surveys typically enumerate a random, representative sample of locations, fully covering a small area at each location (around three hectares of a single settlement type). The sample locations must be geographically defined, with clear spatial locations and extents. These bespoke surveys can also accommodate more complex sampling designs, such as stratified and probabilistic samples, and usually involve limited sample sizes [
18]. Like traditional censuses, they still require considerable human and financial resources — albeit a fraction of a full census — and face traditional challenges related to inaccessibility and insecurity.
Since these early applications, a wider set of models has been developed to deal with situations where recent enumeration data already exist. Among those discussed here, the most complete form of demographic sample can be derived from partial census enumerations, which provide observations across entire regions, enabling the filling of geographic gaps in census data collection [
19,
20]. There are also examples of using census cartography data for population modelling to support census planning processes [
21]. These data consist of demographic attributes associated with household coordinates or enumeration areas, but do not provide a random sample of the population due to accessibility limitations. Nevertheless, they generally have very large sample sizes. In addition to gaps in enumeration due to conflict or remoteness [
7,
22], there are growing numbers of examples of undercounts in both rural and urban settings, including those associated with high-rise buildings, gated communities [
23] and among certain demographic groups [
6,
24,
25]. Population modelling approaches can help complement census data by identifying and addressing such gaps.
Another source of georeferenced enumeration data can be found through the process of undertaking national household surveys to monitor demographic, socio-economic and health characteristics (e.g. the Demographic and Health Surveys, Household Income and Expenditure Surveys, Multiple Indicator Cluster Surveys). These surveys are designed to be nationally representative by fully enumerating a stratified random sample of the population at selected locations. Repurposing the household listing observations for population modelling is a novel way of conducting population estimation, potentially enabling frequent and low-cost model updates [
26,
27,
28]. Such datasets typically include geographic data on the cluster locations and, in some cases, even the coordinates of individual households and the sampling weights [
29].
A different type of sample population enumeration data can be obtained from activities that, like the census, aim to deliver a commodity or service to every household in a given area. For example, campaigns to distribute insecticide-treated nets (ITNs) as part of malaria control measures often adopt a house-to-house distribution model, with demographic attributes relevant to the campaign recorded at the household level. Previous applications have used health campaign data (
Figure 1), but the potential is not limited to data collected for the delivery of health interventions. Such semi-regularly collected campaign data can enhance sample coverage and frequency, offering a practical complement to censuses or microcensus surveys [
30,
31]. Because they were not originally intended for robust statistical analysis, these data may contain collection biases that require careful quality assessment and adjustment. For example, random (i.e. missed and ghost households) and systematic (i.e. inflated household sizes) biases are likely to be more prevalent than in well-planned surveys. In addition, the lack of clarity in the designs of these operations could also introduce estimation biases.
Geospatial Datasets
The development of bottom-up population modelling approaches has been closely linked with the availability and growth of geospatial datasets derived from satellite imagery [
14,
32] and other geospatial data sources. All modelling approaches discussed here leverage ancillary geospatial datasets that describe the human landscape (
Figure 1) and thus correlate with the geographic distribution of demographic attributes across an area of interest. To be able to predict into gap areas more accurately, geospatial datasets must have complete coverage of the area of interest. Such ancillary datasets can be prepared as small area summaries at administrative or census unit level, but they are often utilised in a regular gridded format, with the spatial resolution matching the desired estimated population output – e.g. approximately 10m [
33] or 100m [
34].
Wardrop et al. highlighted advances in geospatial data, citing satellite-derived building area, land use, counts of dwelling units, spectral radiance and other socio-economic and physical characteristics. Today, geospatial datasets representing factors relating to the human landscape have advanced substantially due to increasing computing power, a wider range of data sources, and advances in AI algorithms that can extract relevant features. These have greatly improved the quality, spatial detail and regularity of relevant available data. As a result, near-global open geospatial datasets on factors relating to population distributions have become commonplace, driving the production of more detailed, more accurate and more recent small area estimates.
The most important geospatial datasets for population modelling describe built-up areas [
35]. Today, near-global building footprint datasets go beyond simply defining the extents of major settlements and instead map individual buildings (e.g. outline, height, roof type) with unprecedented accuracy [
36,
37,
38,
39]. Building footprints in population models are typically aggregated to provide summaries of the built environment at the scale of the model output, for instance 100x100m grid cells. These aggregations include, for example, building footprint counts and area, morphology (e.g., building footprint perimeter and volume), and other locational characteristics (e.g. building orientation, neighbourhood types) [
40,
41,
42,
43,
44]. Some datasets also exist that stratify building footprints into residential/non-residential classes [
33]. Road network data can also be extracted from satellite imagery [
45] and converted into population model inputs using metrics similar to those described above. Finally, satellite-derived temporally explicit building datasets are now becoming available with annual timesteps, such as the Google 2.5D building dataset, enabling spatio-temporal assessment and better temporal alignment of covariates with ground enumeration samples [
46].
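The aggregation of building footprints to model grid cells can be illustrated with a minimal sketch. It assumes building centroids and footprint areas are already available in a projected coordinate system in metres; the function name, the 100 m cell size and the example footprints are hypothetical and do not come from any of the cited datasets.

```python
from collections import defaultdict

CELL_SIZE_M = 100.0  # assumed output resolution (100 x 100 m grid cells)


def summarise_buildings(buildings, cell_size=CELL_SIZE_M):
    """Aggregate building centroids to grid cells.

    buildings: iterable of (x, y, footprint_area_m2), coordinates in metres.
    Returns {(col, row): {"count", "total_area", "mean_area"}}.
    """
    cells = defaultdict(lambda: {"count": 0, "total_area": 0.0})
    for x, y, area in buildings:
        # Floor division assigns each centroid to the cell that contains it.
        key = (int(x // cell_size), int(y // cell_size))
        cells[key]["count"] += 1
        cells[key]["total_area"] += area
    return {
        key: {**stats, "mean_area": stats["total_area"] / stats["count"]}
        for key, stats in cells.items()
    }


# Hypothetical footprints: (x, y, area in square metres)
example_buildings = [(10.0, 20.0, 50.0), (60.0, 80.0, 70.0), (150.0, 20.0, 120.0)]
building_summary = summarise_buildings(example_buildings)
```

Morphological summaries such as perimeter or height could be accumulated in the same loop; only counts and areas are shown here to keep the sketch short.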
Other geospatial datasets can capture additional valuable covariates. Volunteered geographic information (VGI), such as OpenStreetMap data [47], is increasingly available and provides detailed geographic locations for different points of interest (e.g., health and education facilities) associated with population distributions [
48]. Regional and global datasets mapping health and education facility locations [
49], armed conflicts [
50], and building damage [
51,
52,
53] can also provide useful ancillary covariate data. These point location datasets can be converted into gridded or other formats by counting points within focal windows of different sizes, thereby mapping the potential push-pull effects of different features on population distributions. Additionally, digital trace data from mobile devices have been used as indicators of the presence and densities of humans, including call detail records [
54,
55] and smartphone app data [
56].
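As a rough illustration of the focal window idea, the sketch below counts point features (for example, facilities) within a square window around each grid cell. It assumes the points have already been binned into a small grid of per-cell counts; the function name, the window radius and the example grid are hypothetical.

```python
def focal_sum(grid, radius):
    """Sum values in a square (2 * radius + 1) window around each cell.

    Windows are truncated at the grid edges rather than padded.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0
            for rr in range(max(0, r - radius), min(n_rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1)):
                    total += grid[rr][cc]
            out[r][c] = total
    return out


# Hypothetical per-cell counts of facilities on a 3 x 3 grid
facility_counts = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 2],
]
facilities_within_one_cell = focal_sum(facility_counts, radius=1)
```

Repeating the calculation with several radii gives a set of covariates describing the influence of the same features at different distances.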
The presence and distribution of humans and their settlements have always been driven by or associated with environmental factors, such as topography, climate, water bodies and land cover [
35], and an increasing number of high spatial resolution datasets with global coverage and frequent updates are now available. Similarly, lights visible from nighttime satellite imagery are often used as a potential indicator of human presence, indicating areas with electrification and fires [
57], while open near real-time global land cover mapping datasets are now also routinely produced [
58].
Modelling Frameworks
Wardrop et al. suggested simple regression and geostatistical approaches as potential bottom-up model frameworks. Since then, many different frameworks for bottom-up population estimation have been developed and implemented. These frameworks are specifically designed to accommodate the available input data and span from deterministic approaches, through frequentist and Bayesian statistical modelling frameworks, to machine learning and artificial intelligence (AI) techniques (
Figure 1). In essence, these approaches assess relationships between the demographic and geospatial covariate datasets outlined above that are observed at sampled locations (e.g., administrative units, census enumeration areas, microcensus clusters, household locations) to estimate demographic attributes at both sampled and unsampled locations using the geospatial covariates as predictors (
Figure 2). The simplest form of small area population estimation involves mapping constant values of population density across the study area. Through estimates of the number of residential buildings, households per building, and the average household size, population numbers can be estimated deterministically at small area scales. While such approaches produce highly uncertain outputs that do not account for local variations, recent advances in the availability of datasets like building footprint maps, building heights and settlement types are improving the precision of outputs [
44,
59].
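The deterministic calculation described above amounts to a simple product. A minimal sketch follows; the constants of 1.2 households per building and 4.5 people per household are illustrative assumptions, not values reported in the cited studies.

```python
def deterministic_population(residential_buildings, households_per_building, mean_household_size):
    """Population = residential buildings x households per building x mean household size."""
    return residential_buildings * households_per_building * mean_household_size


# Hypothetical counts of residential buildings in two grid cells
residential_building_counts = {"cell_a": 40, "cell_b": 5}

# The same constants are applied everywhere, which is why local variation is not captured.
deterministic_estimates = {
    cell: deterministic_population(n_buildings, 1.2, 4.5)
    for cell, n_buildings in residential_building_counts.items()
}
```

Allowing the two constants to vary by settlement type would be one way of using the newer building height and settlement type datasets within the same calculation.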
Leasure et al. [
16] used Bayesian hierarchical models to estimate population counts as a Poisson process with exact counts derived from the product of settlement datasets and estimated population densities. In a hierarchical modelling framework, population densities are generally modelled as a lognormal process to capture variabilities in the expected population densities. Hierarchical models draw information from all scales, such as coarse administrative units (e.g., regions or districts) or functional areas (e.g., settlement classes), to define different model parameters, including random intercepts, random effects, and variance components [
16,
28]. The hierarchical random intercept is a key component of model structure that helps to account for the complex sampling designs used in the data collection as well as the geographic patterns in population density. Moreover, by using random effects, key information is shared among various spatial scales, thereby leveraging observations from neighbouring clusters to predict population densities across clusters with no observations. More recent hierarchical model applications utilise satellite image-derived building footprints, enabling more accurate prediction of population distributions [
17,
19,
26] and some have even been extended to estimate building count distributions [
18,
60]. These approaches train the model parameters using the observed population data and then predict into other unsampled areas based on the trained parameters in line with the model assumptions.
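The generative structure behind such hierarchical models can be sketched as a forward simulation: a lognormal population density per building, with a settlement-type intercept and one covariate effect, multiplied by a building count to give the rate of a Poisson count. This is only an illustration of the model structure under assumed parameter values, not the fitting procedure or the parameterisation used in the cited work, and all names and numbers are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's algorithm (adequate for modest rates)."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_cluster_counts(building_counts, settlement_types, type_intercepts,
                            beta, covariate, sigma, seed=0):
    """Simulate population counts for clusters from a Poisson-lognormal structure."""
    rng = random.Random(seed)
    counts = []
    for n_buildings, settlement, x in zip(building_counts, settlement_types, covariate):
        # Linear predictor on the log scale: intercept by settlement type plus covariate effect.
        log_median_density = type_intercepts[settlement] + beta * x
        # Lognormal population density (people per building).
        density = rng.lognormvariate(log_median_density, sigma)
        # Poisson count with rate = density x number of buildings.
        counts.append(sample_poisson(density * n_buildings, rng))
    return counts


simulated_counts = simulate_cluster_counts(
    building_counts=[10, 0, 25],
    settlement_types=["urban", "rural", "urban"],
    type_intercepts={"urban": 1.5, "rural": 1.0},  # assumed values
    beta=0.2,
    covariate=[0.5, -1.0, 1.2],
    sigma=0.3,
    seed=42,
)
```

In practice the intercepts, covariate effects and variance would be estimated from the enumeration data within a Bayesian framework rather than fixed in advance.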
Bayesian geostatistical models have also been developed to explicitly account for spatial autocorrelation not captured by covariates. Geostatistical models use distance-dependent covariance matrices to describe the spatial autocorrelation between observations. One example is a two-stage Poisson-Gamma geostatistical regression [
61]. Similar to the hierarchical approaches described above, these models also assume that the population density is related to a set of geospatial covariates through a linear predictor, and then produce estimates of population as a function of the product of the predicted population density and the corresponding building counts. However, the Poisson-Gamma approach uses a Gamma probability density instead of the lognormal to model the population density in a more flexible manner, whilst utilising the integrated nested Laplace approximations (INLA) [
62] techniques in conjunction with stochastic partial differential equation (SPDE) [
63] approaches. The use of the INLA approach enables Bayesian statistical inference based on the posterior marginal distribution, thereby ensuring both accuracy and computational speed. An extra layer of computation speed is achieved via the use of the SPDE approach which specifies spatial autocorrelation using a triangulation (or mesh) of the entire continuous spatial domain so that the usual computationally expensive dense matrix of the Gaussian Process is approximated via a sparse matrix, represented through the Gaussian Markov Random Field. These geostatistical models have been extended to estimate population counts where no building footprints were available and the indicators of human settlements were only partially observed [
30].
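A distance-dependent covariance matrix of the kind referred to above can be sketched with an exponential covariance function, which is the special case of the Matérn family with smoothness 0.5. The sketch builds the dense matrix that the SPDE approach approximates with a sparse representation; the function name, the coordinates and the range parameter are assumptions made for illustration.

```python
import math


def exponential_covariance(coords, variance, range_m):
    """Dense covariance matrix with C_ij = variance * exp(-d_ij / range_m).

    d_ij is the Euclidean distance between locations i and j (in metres).
    """
    return [
        [variance * math.exp(-math.dist(a, b) / range_m) for b in coords]
        for a in coords
    ]


# Hypothetical cluster locations along a line, in metres
cluster_coords = [(0.0, 0.0), (1000.0, 0.0), (5000.0, 0.0)]
covariance = exponential_covariance(cluster_coords, variance=1.0, range_m=2000.0)
```

The matrix has one row and column per location, so its storage and inversion costs grow rapidly with the number of locations, which motivates the sparse approximation.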
An important development is the use of machine learning and AI, including deep learning, to estimate intercensal population counts at high spatial resolution directly from satellite imagery. LandScan HD, for example, uses a data fusion approach to merge relevant land use and building layers in a deep learning framework to generate gridded population estimates [
64]. Another approach is to use a convolutional neural network with visual geometry group architecture for deep learning directly from a concatenation of low-resolution Landsat-7/8, Sentinel-1 and night-time light satellite images [
65,
66]. Key advantages of these methods are the frequent and freely accessible satellite image updates, enabling rapid population map updates and scalability in data scarce regions. The use of very high-resolution satellite images, such as Maxar VIVID 2.0, in conjunction with a small sample of ground truth data appears promising for identifying inhabited areas and non-residential buildings [
67].
Most of the types of population models outlined above estimate the total population, but development and humanitarian interventions often require specific age and sex information (
Figure 2). The most widespread method for age and sex disaggregation is using observed or projected subnational population pyramids to deterministically disaggregate the total population estimates [
68]. There are, however, hierarchical model applications in which the proportion of the under-five population is estimated at 1km resolution using a Bayesian spatio-temporal model [
69], or the entire age and sex structure of each simulation unit is statistically estimated using a Dirichlet-multinomial process [
17]. A recent methodology developed uses a flexible multi-stage Bayesian statistical modelling approach based on ‘sequential’ Binomial probability mass functions to produce disaggregated structured estimates of population counts and population proportions at both administrative units and high-resolution grid cell levels [
70]. The methodology can be used to estimate age/sex classes and other socio-economic groups such as ethnicity, occupation and education. The key input datasets are the demographic structures (e.g. age, sex, ethnicity, etc.), which can come from various sources such as census/microcensus, household surveys (e.g. DHS), and administrative records. The methodology is implemented via the INLA-SPDE approach and is thus capable of rapidly producing structured estimates of population and population proportions at small area scales, including at locations without observations, along with the corresponding estimates of uncertainty. In addition to age/sex information, field campaigns often require household level information that can also be modelled statistically [
27].
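The deterministic form of age and sex disaggregation mentioned above can be sketched as the multiplication of a total by a set of proportions. The group labels and shares below are hypothetical and stand in for an observed or projected subnational population pyramid.

```python
def disaggregate_total(total_population, group_proportions, tolerance=1e-6):
    """Split a total population estimate across groups using fixed proportions."""
    if abs(sum(group_proportions.values()) - 1.0) > tolerance:
        raise ValueError("group proportions must sum to 1")
    return {group: total_population * share for group, share in group_proportions.items()}


# Hypothetical pyramid for the administrative unit containing the grid cell
regional_pyramid = {
    "female_0_4": 0.08,
    "male_0_4": 0.08,
    "female_5_plus": 0.43,
    "male_5_plus": 0.41,
}
cell_by_group = disaggregate_total(250.0, regional_pyramid)
```

Because every grid cell in a unit receives the same proportions, this approach cannot represent local variation in age and sex structure, which is the gap the statistical approaches aim to address.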
Figure 2.
Example population model applications: field observations (panel a), model input (panel b), settlement constrained model outputs with age/sex and uncertainty information (panel c). Data sources: DRC microcensus [
71], DRC population estimates [
17], Mali census cartography [
72], Mali population estimates [
21].
Outputs and Validation
Small area estimates produced using the modelling approaches outlined above are typically output at the grid square scale (
Figure 2). This format is advantageous because it provides a continuous surface to support visualisation and understanding of small-scale landscape variation. Moreover, the grid format enables flexible aggregation and summation of population estimates to different decision-making units, such as districts, wards or health zones, or by features of interest, such as urban extents, flooded areas, or hurricane tracks. The format also eases integration with other geolocated datasets, such as the locations of health facilities, conflicts, schools or polling booths. Gridded population estimates have been used to calculate social-distancing indicators [
73], map out-of-school children [
74,
75], support disaster response [
76,
77], create survey sampling frames [
78,
79], and prepare for vaccination campaigns [
80]. Accuracy is, however, challenging to assess at the grid level unless geolocated household data, with GPS accuracy better than the grid resolution, are available. Gridded estimates also require technical expertise to aggregate to more useful spatial units for field operations planning, such as settlement or administrative boundaries.
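Aggregating gridded estimates to decision-making units can be sketched as a grouped sum. The example assumes each grid cell has already been assigned to a single zone, for instance by overlaying cell centroids on boundary polygons; the cell identifiers, zone names and values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_zones(cell_population, cell_zone):
    """Sum gridded population estimates within each zone.

    cell_population: {cell_id: estimated population}
    cell_zone: {cell_id: zone_id}; every cell is assumed to belong to exactly one zone.
    """
    totals = defaultdict(float)
    for cell_id, population in cell_population.items():
        totals[cell_zone[cell_id]] += population
    return dict(totals)


gridded_estimates = {"c1": 120.5, "c2": 80.0, "c3": 15.25}
zone_lookup = {"c1": "ward_a", "c2": "ward_a", "c3": "ward_b"}
ward_totals = aggregate_to_zones(gridded_estimates, zone_lookup)
```

Cells that straddle a boundary would need an explicit rule, such as assignment by centroid or area weighting, which this sketch does not attempt.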
Population estimates are often reported as a single ‘best’ estimate of residential population for a specific timepoint; however, they cannot accurately capture the full complexity of our world. Uncertainty in population estimates varies from place to place, originating from both the observations used in their construction and from the model design and assumptions. Observation uncertainties include non-representativeness (e.g. too small sample size), sampling bias that under-represents some populations (e.g. slum and informal settlement dwellers), and measurement error (e.g. missed populations, incorrectly recorded data). Partial, unobserved or outdated observations of the settled areas have a direct effect on the accuracy of the estimates and thus on their usability. However, it is difficult to translate the estimated uncertainty across spatial scales.
Using a Bayesian method enables the consideration of these errors during population estimation and provides the most likely estimates with quantified uncertainties. Model designs, guided by the local context and the accessible observations, can be tested in a simulation study [
29,
30], assessing the importance of various inputs and the sensitivity of the model structure. Uncertainties are then quantified for the final model design through the posterior distribution of model parameter values. These posterior distributions are used to calculate metrics such as the mean value of the predicted population in an area, the standard deviation, and 95% credible intervals. In addition to communicating varying levels of uncertainty in modelled estimates, these measures can also support more informed decisions. For example, they can be used (i) to plan health intervention campaigns using the upper bound of the credible interval, giving confidence that enough resources are acquired for each delivery area; (ii) to identify where to conduct additional data collection to improve population estimates; or (iii) to map model uncertainty and so identify important predictors that may be missing from the modelling process and could improve prediction in areas with high uncertainty.
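The posterior summaries listed above can be computed directly from posterior draws of the predicted population for an area. The sketch below assumes such draws are available as a list of numbers and uses an equal-tailed 95% interval with linear interpolation between order statistics; the function names and the example draws are hypothetical.

```python
import statistics


def quantile(sorted_values, q):
    """Quantile of an already sorted list, by linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] + weight * (sorted_values[upper] - sorted_values[lower])


def summarise_posterior(draws):
    """Mean, standard deviation and equal-tailed 95% credible interval of posterior draws."""
    ordered = sorted(draws)
    return {
        "mean": statistics.fmean(draws),
        "sd": statistics.stdev(draws),
        "lower_95": quantile(ordered, 0.025),
        "upper_95": quantile(ordered, 0.975),
    }


# Hypothetical posterior draws of the population of one area
example_draws = [float(v) for v in range(0, 101)]
area_summary = summarise_posterior(example_draws)
```

For an aggregated unit, the draws of its constituent grid cells would be summed draw by draw before summarising, so that the interval reflects the joint rather than the cell-level uncertainty.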
Model validation is vital to ensure that model outputs are accurate and can be trusted by data users and decision makers. This is generally done through model diagnostics, goodness-of-fit assessment and external validation against an independent set of observations. Model diagnostics include checking for convergence issues, examining posterior predictions and grid cell predictions, and undertaking cross-validation to check the robustness of the model design, as well as uncertainty visualisation (e.g. [16]). Goodness-of-fit assessments compare the outputs against the training data by calculating metrics such as root mean square error and correlation coefficients (e.g. [61]).
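The two goodness-of-fit metrics named above can be written out in a few lines. The observed and predicted values in the sketch are hypothetical, and Pearson's coefficient is assumed as the correlation measure.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return covariance / (spread_o * spread_p)


# Hypothetical enumerated and modelled populations for three clusters
observed_population = [100.0, 200.0, 300.0]
predicted_population = [110.0, 190.0, 330.0]
fit_rmse = rmse(observed_population, predicted_population)
fit_r = pearson_r(observed_population, predicted_population)
```

A high correlation can coexist with a systematic bias, so the two metrics are usually read together rather than in isolation.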
Out-of-sample cross validation is considered the gold standard way of validating population models [
81,
82], though data triangulation with other sources is often important for model validation. For example, aggregating the outputs to various administrative units enables cross-checking the results against population predictions and other external data sources. Modelled estimates are, however, often constructed precisely because of the lack of reliable ground enumeration data, making such comparisons rare and challenging to undertake. Even if an independent, reliable data source exists, there are always differences that make comparisons difficult and potentially uninformative, such as the time difference between the reference years of the modelled and the independent datasets, the survey design and/or methodology, the population groups being counted and the time of year of enumeration.
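Out-of-sample cross-validation can be sketched with a k-fold scheme in which each cluster is predicted by a model fitted without it. To keep the example short, the model here is deliberately simple (a single people-per-building density estimated from the training folds); it stands in for whichever population model is being validated, and all names and data are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly partition the indices 0..n-1 into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(buildings, population, k=5, seed=0):
    """Return out-of-sample predictions for every cluster.

    Placeholder model: population = density x buildings, where density is
    estimated only from clusters outside the held-out fold.
    """
    predictions = [None] * len(population)
    for test_fold in k_fold_indices(len(population), k, seed):
        held_out = set(test_fold)
        training = [i for i in range(len(population)) if i not in held_out]
        density = sum(population[i] for i in training) / sum(buildings[i] for i in training)
        for i in test_fold:
            predictions[i] = density * buildings[i]
    return predictions


# Hypothetical clusters in which every building houses exactly four people
cluster_buildings = [3, 5, 8, 2, 10, 7, 4, 6, 9, 1]
cluster_population = [4 * b for b in cluster_buildings]
out_of_sample = cross_validate(cluster_buildings, cluster_population, k=5)
```

Where observations are spatially clustered, folds defined by geographic blocks rather than at random give a more demanding test of prediction into unsampled areas.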
There have been very few bottom-up model validations using extensive small area enumeration data published so far. Darin et al. [60] used the 2018 Colombian census to test the accuracy of two Bayesian model designs against a machine learning methodology at enumeration area and municipality scales, while also varying the use of six different input settlement maps. They sampled from the census data to create an artificial data set, so the sample and the validation data were directly comparable, thus avoiding the issues described above relating to external data triangulation. They highlighted the importance of building footprints as the best ancillary information source for settlements. For unbiased, complete observations, the machine learning approach was the best performing method, whereas coarse or biased observations for sparsely populated regions were best explained by Bayesian methods because of their ability to correct biases. They emphasised that uncertainties varied vastly across different landscapes and at different spatial scales. For example, the median inaccuracy was 32%, but at finer spatial scales it rose to 148%. They highlighted that there is a limit to the information that can be derived from building footprint data, with no apparent increase in accuracy following an increase in sample size.
Chamberlain et al. [83] compared the 2019 census-independent modelled population estimates for Zambia [26] with the population counts from the 2022 national census at province, district and ward level. Their analysis showed a strong correlation between the modelled population estimates and the census-enumerated counts (r=0.98 at district level and r=0.95 at ward level). In province-level comparisons, census counts were within the 95% credible interval (CI) of the modelled estimates for five out of ten provinces, with census counts less than the lower 95% CI bound for the other five provinces, indicating some degree of positive bias in the modelled estimates. At district and ward level, census counts were within the 95% CIs for 64% of districts and 52% of wards. The difference between the modelled estimates and enumerated census counts is likely due in part to some degree of over-estimation associated with non-residential buildings; however, the temporal difference between the modelled estimates and the census also complicated the comparison.
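A comparison of this kind can be summarised by the share of units whose enumerated count falls inside, below or above the modelled credible interval. The sketch below uses hypothetical counts and interval bounds and is not the procedure or the data of the study described above.

```python
def interval_coverage(observed, lower, upper):
    """Proportion of units whose observed count is inside, below or above the interval."""
    n = len(observed)
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    below = sum(1 for o, lo in zip(observed, lower) if o < lo)
    above = sum(1 for o, hi in zip(observed, upper) if o > hi)
    return {"inside": inside / n, "below": below / n, "above": above / n}


# Hypothetical census counts and modelled 95% credible interval bounds for four units
census_counts = [950, 1200, 400, 3100]
ci_lower = [900, 1250, 350, 3000]
ci_upper = [1100, 1500, 450, 3300]
coverage = interval_coverage(census_counts, ci_lower, ci_upper)
```

A large share of observed counts below the lower bound points to over-estimation by the model, while a large share above the upper bound points to under-estimation.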
Breuer et al. [84] compared gridded population estimates with population counts for a sample of slum areas in eight major cities of the Global South and found that slum populations were significantly underestimated, with large spatial variation: the estimates captured only 48 percent of the true population on average (range of 8-147 percent). Thomson et al. [85, 86] similarly found that population models only estimated a fraction (11-39 percent) of slum residents in Nigeria, Kenya and Namibia, potentially omitting 0.75 to 1.5 people for every estimated person. They concluded that this underestimation is due to (i) the 100m resolution being too coarse to capture the vastly different demographic and building patterns in slum areas, and (ii) insufficient observations in slum areas for model training.
Challenges and Future Directions
The significant developments in data sources and methods described above have enabled the generation of geospatial modelled population estimates in a range of countries and contexts. These advances have addressed many challenges outlined by Wardrop
et al. [10], but some challenges remain, and new ones have emerged as applications draw on increasingly diverse sources of data. These challenges can be categorized into data input challenges, methodological considerations, and implementation barriers (
Figure 3).
Methodological Considerations
Novel data sources offer promising avenues for increasing the temporal resolution of population estimates, and thus their accuracy in contexts of mass movement. Digital trace data from mobile phone call detail records [
54], smartphone location history data [
99], mobile applications [
100], and social media platforms [
101,
102] have demonstrated potential for measuring near real-time population distributions and movements. These data sources are valuable for their volume and frequency of updates, though spatial resolution varies along urban-rural gradients and representativeness across population subgroups remains uncertain. Methods to correct for these biases using survey data are emerging [
103], but access challenges due to private ownership persist.
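One simple form that a survey-based correction can take is inverse-ownership weighting: device counts in each stratum are divided by the share of people in that stratum who, according to a survey, own a device. The Python sketch below illustrates this idea only; it is not the method of the cited work, and the strata, counts and ownership rates are hypothetical.

```python
def adjust_for_ownership(device_counts, ownership_rate):
    """Scale observed device counts by the inverse of survey-estimated ownership."""
    adjusted = {}
    for stratum, count in device_counts.items():
        rate = ownership_rate[stratum]
        if not 0 < rate <= 1:
            raise ValueError(f"ownership rate for {stratum!r} must be in (0, 1]")
        adjusted[stratum] = count / rate
    return adjusted


# Hypothetical inputs: devices observed per stratum and survey-based ownership shares.
device_counts = {"urban": 8000, "rural": 1500}
ownership_rate = {"urban": 0.8, "rural": 0.3}

adjusted = adjust_for_ownership(device_counts, ownership_rate)

# Share of the total attributed to rural areas before and after adjustment.
raw_rural_share = device_counts["rural"] / sum(device_counts.values())
adjusted_rural_share = adjusted["rural"] / sum(adjusted.values())
print(f"rural share: raw {raw_rural_share:.1%}, adjusted {adjusted_rural_share:.1%}")
```

Because ownership is assumed to be lower in the rural stratum, the adjustment raises the rural share of the estimated population. In practice the ownership rates carry their own sampling uncertainty, which this sketch ignores.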
Population model design also raises other important methodological considerations. Spatial autocorrelation of inputs requires particular attention for accurate uncertainty estimation, as it can alter observation variance and increase sensitivity to certain covariates [
104,
105]. This is particularly important when there are spatial dependences in the data that cannot be accounted for by readily available covariates, i.e. when spatial autocorrelation remains in the residuals of a fully specified model.
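A common diagnostic for spatial autocorrelation remaining in model residuals is global Moran's I. The Python sketch below computes it for two hypothetical residual vectors, assuming six areal units arranged in a line with binary adjacency weights; both the layout and the values are illustrative assumptions rather than data from any study cited here.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / total_sq)


def chain_weights(n):
    """Binary adjacency matrix for n units in a line (each unit neighbours the next)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


# Hypothetical residuals from a fitted model (illustration only).
clustered = [-3, -2, -1, 1, 2, 3]      # similar residuals sit next to each other
alternating = [1, -1, 1, -1, 1, -1]    # neighbouring residuals have opposite signs

w = chain_weights(6)
i_clustered = morans_i(clustered, w)      # about 0.64: positive autocorrelation
i_alternating = morans_i(alternating, w)  # -1.0: negative autocorrelation
print(round(i_clustered, 3), round(i_alternating, 3))
```

With no spatial autocorrelation, the expected value of Moran's I is -1/(n-1), here -0.2, so a clearly positive value for the residuals suggests spatial structure that the covariates have not captured.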
The modifiable areal unit problem (MAUP) - error arising from the choice of spatial units used to summarise point-based observations - represents another crucial consideration, particularly when the spatial scales of observations differ from pixel-level estimation units [
106]. This mismatch can reduce explained variance, increase residual spatial autocorrelation, and potentially lead to erroneous conclusions [
107,
108]. Addressing MAUP requires careful harmonization of all scales: those of the observations, human populations, and statistical model predictions [
109]. For population models that rely heavily on satellite-derived information, even the gridding process itself may impose artificial discretization on the underlying continuous landscape [
110]. Simulation studies can help determine appropriate scale choices [
111], while spatial aggregation models that incorporate both data response and sampling frame components can better account for population distribution uncertainties [
108].
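The effect of the choice of spatial units can be seen in a toy example. The Python sketch below takes eight hypothetical cells with a building count and a population count, then measures the correlation between the two at cell level and under two alternative zonings. All values and zone assignments are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def aggregate(values, zone_ids):
    """Sum cell values within each zone; zones are returned in sorted id order."""
    totals = {}
    for value, zone in zip(values, zone_ids):
        totals[zone] = totals.get(zone, 0) + value
    return [totals[zone] for zone in sorted(totals)]


# Hypothetical cell-level data along a transect (illustration only).
buildings = [1, 2, 3, 4, 5, 6, 7, 8]
population = [2, 1, 4, 3, 6, 5, 8, 7]

zoning_a = [0, 0, 1, 1, 2, 2, 3, 3]  # four zones of two cells each
zoning_b = [0, 1, 1, 2, 2, 3, 3, 4]  # same zone width, boundaries shifted by one cell

r_cells = pearson_r(buildings, population)
r_a = pearson_r(aggregate(buildings, zoning_a), aggregate(population, zoning_a))
r_b = pearson_r(aggregate(buildings, zoning_b), aggregate(population, zoning_b))
print(round(r_cells, 3), round(r_a, 3), round(r_b, 3))
```

The same underlying data give a correlation of about 0.90 at cell level, 1.0 under the first zoning and about 0.99 under the second, which is why the scale and placement of units need to be chosen deliberately rather than inherited from the input data.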
Implementation Barriers and Opportunities
While a growing number of applications show how small area modelled population estimates provide valuable support for planning and decision-making during intercensal periods, these estimates often face barriers to acceptance by governments, which may be reluctant to use modelled data or geospatial covariates, or to collaborate with external partners to produce official population estimates. Population statistics have significant implications for resource allocation [
112,
113,
114], political representation [
115], and economic and SDG indicators [
116], making their official endorsement politically sensitive. To facilitate greater governmental uptake, several approaches show promise: improved model validation, enhanced communication of model methods, results and associated uncertainties [
117], and, crucially, local ownership of the modelling process [
118]. Local ownership ideally means that when external partners are brought in to support the design and use of these new methods, the approach is co-developed with national statistical offices, and is accompanied by appropriate capacity strengthening for sustainable data generation and use [
119,
120]. Although geographic information systems are now commonly used to analyse and display population data, spatial statistical expertise is often limited in government institutions across low-income settings. Medium- to long-term investment is needed to identify in-country champions, develop institutional skills, and strengthen national university programs to cultivate the next generation of geospatial experts.
Beyond the uptake of model results by stakeholders, there are other important barriers to population modelling, such as data licences. Various licence types are used for public data products (e.g. Creative Commons Attribution, Open Data Commons Open Database License). Some are fully open, while others are restricted in one way or another. Some geospatial data products carry strict restrictions on combination with similar data products, and even on the dissemination of derived outputs. Others, such as the OSM licence, only require the final product to be made publicly available. The licences of the input data must therefore be considered during model design and publishing.
Another concern is that small area estimates, especially high-resolution population estimates, might expose vulnerable groups to unwanted surveillance or security risks and might impact equitable resource allocations [
121,
122]. Ensuring the responsible use of modern technologies, and the responsible publication and use of population data, is crucial. Good practice includes seeking approval from (and, if necessary, establishing) independent ethics boards of the data provider, government and/or universities, de-identifying observations, and publishing only aggregated statistical estimates rather than the original observations. At the international level, the UN Statistical Commission should consider developing guidance on effective practice and good governance for countries utilising these approaches. Doing so would empower countries to innovate and try alternative methodologies to fill crucial gaps, whilst showing how this can be done safely and sustainably.
Conclusions
Reliable, recent, and detailed small area population data are increasingly required for effective decision-making, development planning, Sustainable Development Goal reporting, operational campaign planning, and humanitarian response. Since Wardrop
et al. [
10] presented their perspectives on spatially disaggregated population estimates, significant advances have been made in both data sources and methodological approaches. The emergence of building footprint datasets with near-global coverage, applications of Bayesian hierarchical modelling frameworks to handle complex data integration, and development of methods to quantify and communicate uncertainty have substantially improved the quality and reliability of small area population estimates in the absence of census data.
These advances have enabled practical applications across diverse contexts. In countries including Burkina Faso [
123], Mali [
124], Papua New Guinea [
125], and South Sudan [
126], modelled population estimates have been adopted by national statistical offices for country planning activities. In Colombia, the national statistical office has developed the capacity to implement these methods itself, using them to address census enumeration gaps [
20]. UN agencies including UNFPA and UNICEF (United Nations Children's Fund) are actively promoting and applying these approaches [
127,
128,
129] and computer algorithms [
130,
131] to support field operations and data sustainability.
Despite these successes, challenges remain in capturing urban diversity, informal settlements, mobile populations, and populations in areas with persistent cloud cover or heavy vegetation. Continuing methodological advances and stronger partnerships between data producers, data scientists, and stakeholders are needed to further enhance the utility and acceptance of these approaches. Increasing transparency in data products, promoting collaboration and co-production, and investing in capacity strengthening would significantly enhance uptake and sustainability.
As demographic data gaps increase [
91], the complementary role of small area population estimation approaches will likely expand. Their integration with traditional demographic data systems offers a path toward more responsive and cost-efficient statistics, particularly in contexts of census disruption or under-coverage. These methods can play a crucial role in demographic reconciliation, helping to detect and adjust for omissions or overcounts and providing robust, disaggregated data for planning and monitoring. Looking forward, the field is poised to benefit from increasing availability of digital trace data, advances in AI for building characterization, and improvements in statistical methods for data integration and uncertainty quantification.
The development of such models, however, requires more than technical expertise. It depends on sustained collaboration between national statistical offices, academic institutions, and international partners. In Latin America and the Caribbean, a collaborative partnership model involving UNFPA, ECLAC (Economic Commission for Latin America and the Caribbean), and national statistical offices has proven effective [
132]. These partnerships leverage local contextual knowledge, statistical leadership, and external modelling capacity to co-develop robust estimates. In the Caribbean, coordination with regional statistical entities such as CARICOM (Caribbean Community) has also emerged as a key enabler for institutionalizing these approaches.
To ensure the long-term utility of, and trust in, these estimates, methods and results must be transparently documented, presented with clear uncertainty and goodness-of-fit metrics, and integrated into national processes such as census reconciliation and population projections. With proper institutionalization and regional coordination, small area estimation methods can become a strategic asset for achieving data-driven, inclusive development.
Acknowledgments
The authors would like to acknowledge that the work summarised in this paper is the result of many years of research and collaboration with a wide variety of partners, including UNFPA, UNICEF, The Gates Foundation, GAVI, GRID3, and the UK Foreign, Commonwealth and Development Office. Particular thanks go to the many national governments, and their statistical and data offices, who have worked with us to develop and improve upon many of the methodologies described here.
References
- Borowitz, M.; Zhou, J.; Azelton, K.; Nassar, I.-Y. Examining the value of satellite data in halting transmission of polio in Nigeria: A socioeconomic analysis. Data & Policy 2023, 5, e16. [CrossRef]
- Bryant, J. Digital mapping and inclusion in humanitarian response; HPG working paper; ODI: London, 2021. https://odi.org/en/publications/digital-mapping-and-inclusion-in-humanitarian-response.
- Cumbane, S.P.; Gidófalvi, G. Spatial Distribution of Displaced Population Estimated Using Mobile Phone Data to Support Disaster Response Activities. ISPRS International Journal of Geo-Information 2021, 10, 421.
- Greenough, P.G.; Nelson, E.L. Beyond mapping: a case for geospatial analytics in humanitarian health. Conflict and Health 2019, 13, 50. [CrossRef]
- Robin, T.A.; Khan, M.A.; Kabir, N.; Rahaman, S.T.; Karim, A.; Mannan, I.I.; George, J.; Rashid, I. Using spatial analysis and GIS to improve planning and resource allocation in a rural district of Bangladesh. BMJ Global Health 2019, 4, e000832. [CrossRef]
- USCB (2022) Post-Enumeration Survey and Demographic Analysis Help Evaluate 2020 Census Results. in Census Bureau Releases Estimates of Undercount and Overcount in the 2020 Census (United States Census Bureau).
- Randall, S. Where have all the nomads gone? Fifty years of statistical and demographic invisibilities of African mobile pastoralists. Pastoralism 2015, 5, 22. [CrossRef]
- UNECE. Guidelines on the Use of Registers and Administrative Data for Population and Housing Censuses; Geneva, 2018.
- UN. The Sustainable Development Goals Report; United Nations, ISBN: 978-92-1-003135-6: New York, USA, 2024.
- Wardrop, N.A.; Jochem, W.C.; Bird, T.J.; Chamberlain, H.R.; Clarke, D.; Kerr, D.; Bengtsson, L.; Juran, S.; Seaman, V.; Tatem, A.J. Spatially disaggregated population estimates in the absence of national population and housing census data. Proceedings of the National Academy of Sciences 2018, 115, 3529-3537. [CrossRef]
- Tadesse, S. The Evolving Census Landscape: Lessons from the 2020 round and anticipated trends for the 2030 round. In United Nations Statistical Commission, 56th Session, Side Event: Advancing Population and Housing Censuses in the 2030 Round; New York, USA, 5 March 2025; United Nations Statistics Division (UNSD). https://unstats.un.org/UNSDWebsite/events-details/UNSC56-population-housing-census-5Mar2025/.
- Jensen, E.; Kennel, T. Detailed Coverage Estimates for the 2020 Census Released Today. In America Counts: Stories United States Census Bureau, March 10, 2022. https://www.census.gov/library/stories/2022/03/who-was-undercounted-overcounted-in-2020-census.html, 2022.
- United Nations Economic and Social Council. Future of population and housing censuses and lessons learned from past and current experiences (E/ESCWA/C.1/2024/4); UN, New York, 2024.
- Weber, E.M.; Seaman, V.Y.; Stewart, R.N.; Bird, T.J.; Tatem, A.J.; McKee, J.J.; Bhaduri, B.L.; Moehl, J.J.; Reith, A.E. Census-independent population mapping in northern Nigeria. Remote Sensing of Environment 2018, 204, 786-798. [CrossRef]
- Hillson, R.; Alejandre, J.D.; Jacobsen, K.H.; Ansumana, R.; Bockarie, A.S.; Bangura, U.; Lamin, J.M.; Malanoski, A.P.; Stenger, D.A. Methods for Determining the Uncertainty of Population Estimates Derived from Satellite Imagery and Limited Survey Data: A Case Study of Bo City, Sierra Leone. PLOS ONE 2014, 9, e112241. [CrossRef]
- Leasure, D.R.; Jochem, W.C.; Weber, E.M.; Seaman, V.; Tatem, A.J. National population mapping from sparse survey data: A hierarchical Bayesian modeling framework to account for uncertainty. Proceedings of the National Academy of Sciences 2020, 10.1073/pnas.1913050117, 201913050. [CrossRef]
- Boo, G.; Darin, E.; Leasure, D.R.; Dooley, C.A.; Chamberlain, H.R.; Lázár, A.N.; Tschirhart, K.; Sinai, C.; Hoff, N.A.; Fuller, T.; et al. High-resolution population estimation using household survey data and building footprints. Nature Communications 2022, 13, 1330. [CrossRef]
- G. Boo et al., Tackling public health data gaps through Bayesian high-resolution population estimation. PLOS Global Public Health https://verixiv.org/articles/2-8/v1 (under review).
- Darin, E.; Kuépié, M.; Bassinga, H.; Boo, G.; Tatem, A.J. La population vue du ciel : quand l’imagerie satellite vient au secours du recensement. Population (French edition) 2022, 77, 467-494. [CrossRef]
- Sanchez-Cespedes, L.M.; Leasure, D.R.; Tejedor-Garavito, N.; Amaya Cruz, G.H.; Garcia Velez, G.A.; Mendoza, A.E.; Marín Salazar, Y.A.; Esch, T.; Tatem, A.J.; Ospina Bohórquez, M. Social cartography and satellite-derived building coverage for post-census population estimates in difficult-to-access regions of Colombia. Population Studies 2024, 78, 3-20. [CrossRef]
- WorldPop; Institut National de la Statistique du Mali. Census-cartography-based gridded population estimates for Mali (2020), version 1.0. WorldPop, University of Southampton. https://wopr.worldpop.org/?MLI/Population/v1.0 2022. [CrossRef]
- IPAC. Numbers Matter: The 2020 Census and Conflict in Papua; Institute for Policy Analysis of Conflict: Jakarta, 2019.
- Sullivan, T.A. Who, What, When, and Where of the Census. In Census 2020: Understanding the Issues; Sullivan, T.A., Ed. Springer International Publishing: Cham, 2020; pp. 17-31. [CrossRef]
- Statistics South Africa. Post Enumeration Survey - Statistical Release P0301.5; Stats SA: Pretoria, 2022.
- Wazir, A.M.; Goujon, A. Assessing the 2017 Census of Pakistan Using Demographic Analysis: A Sub-National Perspective; Vienna Institute of Demography Working Papers No. 06/2019; Vienna Institute of Demography (VID): Vienna, 2019.
- B. A. Dooley et al. (2021) Description of methods for the Zambia modelled population estimates from multiple routinely collected and geolocated survey data, version 1.0. WorldPop, University of Southampton. [CrossRef]
- Nnanatu, C.; Yankey, O.; Abbott, T.; Gadiaga, A.; Lazar, A.; Darin, É.; Tatem, A.; Bondarenko, M. Modelled gridded population estimates for Cameroon 2022. Version 1.0, University of Southampton, 17 Jun 2024. https://data.worldpop.org/repo/wopr/CMR/population/v1.0/; 2024. [CrossRef]
- Engstrom, R.; Newhouse, D.; Soundararajan, V. Estimating small-area population density in Sri Lanka using surveys and Geo-spatial data. PLOS ONE 2020, 15, e0237063. [CrossRef]
- Leasure, D.R.; Dooley, C.A.; Tatem, A. A simulation study exploring weighted likelihood models to recover unbiased population estimates from weighted survey data. University of Southampton. 2021. [CrossRef]
- Nnanatu, C.C.; Bonnie, A.; Joseph, J.; Yankey, O.; Cihan, D.; Gadiaga, A.; Voepel, H.; Abbott, T.; Chamberlain, H.; Tia, M.; et al. Estimating small area population from health intervention campaign surveys and partially observed settlement data. Nature Communications 2025, 16, 4951. [CrossRef]
- Nnanatu, C.; Yankey, O.; Bonnie, A.; Abbott, T.J.; Chamberlain, H.; Lazar, A.N.; Tatem, A.J. Bottom-up gridded population estimates for Maniema province in the Democratic Republic of Congo (2022), version 4.1. 2024. [CrossRef]
- Leyk, S.; Gaughan, A.E.; Adamo, S.B.; de Sherbinin, A.; Balk, D.; Freire, S.; Rose, A.; Stevens, F.R.; Blankespoor, B.; Frye, C.; et al. The spatial allocation of population: a review of large-scale gridded population data products and their fitness for use. Earth Syst. Sci. Data 2019, 11, 1385-1409. [CrossRef]
- Pesaresi, M.; Politis, P. GHS-BUILT-S R2023A - GHS built-up surface grid, derived from Sentinel2 composite and Landsat, multitemporal (1975-2030).
- Woods, D.; McKeen, T.; Cunningham, A.; Priyatikanto, R.; Sorichetta, A.; Tatem, A.J.; Bondarenko, M. WorldPop high resolution, harmonised annual global geospatial covariates. Version 1.0. University of Southampton: Southampton, UK, 2024. [CrossRef]
- Nieves, J.J.; Stevens, F.R.; Gaughan, A.E.; Linard, C.; Sorichetta, A.; Hornby, G.; Patel, N.N.; Tatem, A.J. Examining the correlates and drivers of human population distributions across low- and middle-income countries. Journal of The Royal Society Interface 2017, 14, 20170401. [CrossRef]
- Sirko, W.; Kashubin, S.; Ritter, M.; Annkah, A.; Bouchareb, Y.S.E.; Dauphin, Y.; Keysers, D.; Neumann, M.; Cisse, M.; Quinn, J.A. Continental-scale building detection from high resolution satellite imagery. arXiv:2107.12283; 2021. [CrossRef]
- Microsoft. Worldwide building footprints derived from satellite imagery (GitHub Repository); https://github.com/microsoft/GlobalMLBuildingFootprints. 2022.
- Ecopia. Global Feature Extraction: Building footprints; https://www.ecopiatech.com/products/global-feature-extraction. 2020.
- Zhu, X.X.; Chen, S.; Zhang, F.; Shi, Y.; Wang, Y. GlobalBuildingAtlas: An Open Global and Complete Dataset of Building Polygons, Heights and LoD1 3D Models. arXiv:2506.04106 2025. [CrossRef]
- Hillson, R.; Alejandre, J.D.; Jacobsen, K.H.; Ansumana, R.; Bockarie, A.S.; Bangura, U.; Lamin, J.M.; Stenger, D.A. Stratified Sampling of Neighborhood Sections for Population Estimation: A Case Study of Bo City, Sierra Leone. PLoS One 2015, 10, e0132850. [CrossRef]
- Jochem, W.C.; Leasure, D.R.; Pannell, O.; Chamberlain, H.R.; Jones, P.; Tatem, A.J. Classifying settlement types from multi-scale spatial patterns of building footprints. Environment and Planning B: Urban Analytics and City Science 2020, 10.1177/2399808320921208, 2399808320921208. [CrossRef]
- Lloyd, C.T.; Sturrock, H.J.W.; Leasure, D.R.; Jochem, W.C.; Lázár, A.N.; Tatem, A.J. Using GIS and Machine Learning to Classify Residential Status of Urban Buildings in Low and Middle Income Settings. Remote Sensing 2020, 12, 3847. [CrossRef]
- Tomás, L.; Fonseca, L.; Almeida, C.; Leonardi, F.; Pereira, M. Urban population estimation based on residential buildings volume using IKONOS-2 images and lidar data. International Journal of Remote Sensing 2016, 37, 1-28. [CrossRef]
- Schug, F.; Frantz, D.; van der Linden, S.; Hostert, P. Gridded population mapping for Germany based on building density, height and type from Earth Observation data using census disaggregation and bottom-up estimates. PLOS ONE 2021, 16, e0249044. [CrossRef]
- Microsoft, Road detections from Microsoft Maps aerial imagery. https://github.com/microsoft/RoadDetections?tab=readme-ov-file.
- W. Sirko et al., High-Resolution Building and Road Detection from Sentinel-2. arXiv:2310.11622. https://sites.research.google/gr/open-buildings/temporal/. [CrossRef]
- OSM, OpenStreetMap. https://www.openstreetmap.org/#map=5/54.91/-3.43.
- Herfort, B.; Lautenbach, S.; Porto de Albuquerque, J.; Anderson, J.; Zipf, A. The evolution of humanitarian mapping within the OpenStreetMap community. Scientific Reports 2021, 11, 3037. [CrossRef]
- GRID3. GRID3 Data Hub. https://data.grid3.org/.
- ACLED. ACLED Data. https://acleddata.com/.
- Scher, C.; Van Den Hoek, J. Nationwide conflict damage mapping with interferometric synthetic aperture radar: A study of the 2022 Russia-Ukraine conflict. Science of Remote Sensing 2025, 100217. [CrossRef]
- Wiguna, S.; Adriano, B.; Mas, E.; Koshimura, S. Evaluation of Deep Learning Models for Building Damage Mapping in Emergency Response Settings. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2024, 17, 5651-5667. [CrossRef]
- Al Shafian, S.; Hu, D. Integrating Machine Learning and Remote Sensing in Disaster Management: A Decadal Review of Post-Disaster Building Damage Assessment. Buildings 2024, 14, 2344. [CrossRef]
- Deville, P.; Linard, C.; Martin, S.; Gilbert, M.; Stevens, F.R.; Gaughan, A.E.; Blondel, V.D.; Tatem, A.J. Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences 2014, 111, 15888-15893. [CrossRef]
- Dong, L.; Duarte, F.; Duranton, G.; Santi, P.; Barthelemy, M.; Batty, M.; Bettencourt, L.; Goodchild, M.; Hack, G.; Liu, Y.; et al. Defining a city — delineating urban areas using cell-phone data. Nature Cities 2024, 1, 117-125. [CrossRef]
- Patel, N.N.; Stevens, F.R.; Huang, Z.; Gaughan, A.E.; Elyazar, I.; Tatem, A.J. Improving Large Area Population Mapping Using Geotweet Densities. Transactions in GIS 2017, 21, 317-331. [CrossRef]
- Stathakis, D.; Baltas, P. Seasonal population estimates based on night-time lights. Computers, Environment and Urban Systems 2018, 68, 133-141. [CrossRef]
- Brown, C.F.; Brumby, S.P.; Guzder-Williams, B.; Birch, T.; Hyde, S.B.; Mazzariello, J.; Czerwinski, W.; Pasquarella, V.J.; Haertel, R.; Ilyushchenko, S.; et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Scientific Data 2022, 9, 251. [CrossRef]
- Leasure, D.R.; Dooley, C.A.; Bondarenko, M.; Tatem, A.J. peanutButter: An R package to produce rapid-response gridded population estimates from building footprints, version 1.0.0. WorldPop, University of Southampton. https://github.com/wpgp/peanutButter, 2021. [CrossRef]
- Darin, E.; Leasure, D.R.; Kashyap, R. How accurate are high resolution settlement maps at predicting population counts in data scarce settings. 2025. [CrossRef]
- Nnanatu, C.C.; Yankey, O.; Dzossa, A.D.; Abbott, T.; Gadiaga, A.; Lazar, A.; Tatem, A.J. Efficient Bayesian Hierarchical Small Area Population Estimation Using INLA-SPDE: Integrating Multiple Data Sources and Spatial-Autocorrelation. In Preprints; 10.20944/preprints202501.0588.v1, 2025; [CrossRef]
- Rue, H.; Martino, S.; Chopin, N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society Series B: Statistical Methodology 2009, 71, 319-392. [CrossRef]
- Lindgren, F.; Rue, H.; Lindström, J. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B: Statistical Methodology 2011, 73, 423-498. [CrossRef]
- Tuccillo, J.V.; Moehl, J.; Adams, D.; Cunningham, A.R.; Urban, M.; Walters, S.; Woody, C.; Reith, A.; Kaufman, J.; Epting, J.; et al. LandScan HD: A High-Resolution Gridded Ambient Population Methodology for the World. 09 April 2025, PREPRINT (Version 1) available at Research Square. 2025. [CrossRef]
- Doupe, P.; Bruzelius, E.; Faghmous, J.; Ruchman, S.G. Equitable development through deep learning: The case of sub-national population density estimation. In Proceedings of the 7th Annual Symposium on Computing for Development, Nairobi, Kenya; Article 6.
- Hu, W.; Patel, J.H.; Robert, Z.-A.; Novosad, P.; Asher, S.; Tang, Z.; Burke, M.; Lobell, D.; Ermon, S. Mapping Missing Population in Rural India: A Deep Learning Approach with Satellite Imagery. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, Honolulu, HI, USA; pp. 353–359.
- Neal, I.; Seth, S.; Watmough, G.; Diallo, M.S. Census-independent population estimation using representation learning. Scientific Reports 2022, 12, 5185. [CrossRef]
- Pezzulo, C.; Hornby, G.M.; Sorichetta, A.; Gaughan, A.E.; Linard, C.; Bird, T.J.; Kerr, D.; Lloyd, C.T.; Tatem, A.J. Sub-national mapping of population pyramids and dependency ratios in Africa and Asia. Scientific Data 2017, 4, 170089. [CrossRef]
- Alegana, V.A.; Atkinson, P.M.; Pezzulo, C.; Sorichetta, A.; Weiss, D.; Bird, T.; Erbach-Schoenberg, E.; Tatem, A.J. Fine resolution mapping of population age-structures for health and development applications. Journal of The Royal Society Interface 2015, 12, 20150073. [CrossRef]
- Nnanatu, C.C.; Chaudhuri, S.; Lazar, A.N.; Tatem, A.J. jollofR: A Bayesian statistical model-based approach for disaggregating small area population estimates by demographic characteristics. R package version 0.3.0. https://github.com/wpgp/jollofR/, 2025.
- UCLA-DRC Health Research and Training Program; Kinshasa School of Public Health. Kinshasa, Kongo Central and former Bandundu microcensus survey data (2017-18). 2018.
- Institut de la Statistique du Mali. Cartographie du RGPH5. 2020.
- Chamberlain, H.R.; Lazar, A.N.; Tatem, A.J. High-resolution estimates of social distancing feasibility, mapped for urban areas in sub-Saharan Africa. Scientific Data 2022, 9, 711. [CrossRef]
- Alegana, V.A.; Pezzulo, C.; Tatem, A.J.; Omar, B.; Christensen, A. Mapping out-of-school adolescents and youths in low- and middle-income countries. Humanities and Social Sciences Communications 2021, 8, 213. [CrossRef]
- Macharia, P.M.; Moturi, A.K.; Mumo, E.; Giorgi, E.; Okiro, E.A.; Snow, R.W.; Ray, N. Modelling geographic access and school catchment areas across public primary schools to support subnational planning in Kenya. Children's Geographies 2023, 21, 832-848. [CrossRef]
- Smith, A.; Bates, P.D.; Wing, O.; Sampson, C.; Quinn, N.; Neal, J. New estimates of flood exposure in developing countries using high-resolution population data. Nature Communications 2019, 10, 1814. [CrossRef]
- Hierink, F.; Rodrigues, N.; Muñiz, M.; Panciera, R.; Ray, N. Modelling geographical accessibility to support disaster response and rehabilitation of a healthcare system: an impact analysis of Cyclones Idai and Kenneth in Mozambique. BMJ Open 2020, 10, e039138. [CrossRef]
- Qader, S.H.; Lefebvre, V.; Tatem, A.J.; Pape, U.; Jochem, W.; Himelein, K.; Ninneman, A.; Wolburg, P.; Nunez-Chaim, G.; Bengtsson, L.; et al. Using gridded population and quadtree sampling units to support survey sample design in low-income settings. International Journal of Health Geographics 2020, 19, 10. [CrossRef]
- Cajka, J.; Amer, S.; Ridenhour, J.; Allpress, J. Geo-sampling in developing nations. International Journal of Social Research Methodology 2018, 21, 729-746. [CrossRef]
- Borkovska, O.; Pollard, D.; Hamainza, B.; Kooma, E.; Renn, S.; Schmidt, J.; Engin, H.; Heaton, M.; Miller, J.M.; Psychas, P.; et al. Developing High-Resolution Population and Settlement Data for Impactful Malaria Interventions in Zambia. Journal of Environmental and Public Health 2022, 2022, 2941013. [CrossRef]
- Gelman, A.; Vehtari, A.; Simpson, D.; Margossian, C.C.; Carpenter, B.; Yao, Y.; Kennedy, L.; Gabry, J.; Bürkner, P.-C.; Modrák, M. Bayesian workflow. arXiv preprint arXiv:2011.01808 2020. [CrossRef]
- Conn, P.B.; Johnson, D.S.; Williams, P.J.; Melin, S.R.; Hooten, M.B. A guide to Bayesian model checking for ecologists. Ecological Monographs 2018, 88, 526-542. [CrossRef]
- Chamberlain, H.R.; Dooley, C.A.; Tatem, A.J. Assessing the accuracy of census-independent small area modelled population datasets. in preparation.
- Breuer, J.H.P.; Friesen, J.; Taubenböck, H.; Wurm, M.; Pelz, P.F. The unseen population: Do we underestimate slum dwellers in cities of the Global South? Habitat International 2024, 148, 103056. [CrossRef]
- Thomson, D.R.; Gaughan, A.E.; Stevens, F.R.; Yetman, G.; Elias, P.; Chen, R. Evaluating the Accuracy of Gridded Population Estimates in Slums: A Case Study in Nigeria and Kenya. Urban Science 2021, 5, 48. [CrossRef]
- Thomson, D.R.; Stevens, F.R.; Chen, R.; Yetman, G.; Sorichetta, A.; Gaughan, A.E. Improving the accuracy of gridded population estimates in cities and slums to monitor SDG 11: Evidence from a simulation study in Namibia. Land Use Policy 2022, 123, 106392. [CrossRef]
- Davis, J.M.; Wilfahrt, M. Enumerator Experiences in Violent Research Environments. Comparative Political Studies 2024, 57, 675-709. [CrossRef]
- DESA. Guidelines on the use of electronic data collection technologies in population and housing censuses; United Nations, New York, January 2019: Department of Economic and Social Affairs Statistics Division, 2019.
- Hogan, H. Distrust in the Governments Brings Risk to the Census. Harvard Data Science Review 2020, 2. [CrossRef]
- Aguma, H.B.; Rukaari, M.; Nakamatte, R.; Achii, P.; Miti, J.T.; Muhumuza, S.; Nabukenya, M.; Opigo, J.; Lukwago, M. Mass distribution campaign of long-lasting insecticidal nets (LLINs) during the COVID-19 pandemic in Uganda: lessons learned. Malar J 2023, 22, 310. [CrossRef]
- Tatem, A.J.; Espey, J. Global population data is in crisis – here’s why that matters. In The Conversation, 2025; https://theconversation.com/global-population-data-is-in-crisis-heres-why-that-matters-251751.
- Chamberlain, H.R.; Darin, E.; Adewole, W.A.; Jochem, W.C.; Lazar, A.N.; Tatem, A.J. Building footprint data for countries in Africa: To what extent are existing data products comparable? Computers, Environment and Urban Systems 2024, 110, 102104. [CrossRef]
- Visée, C.; Morlighem, C.; Linard, C.; Faty, A.; Henry, S.; Dujardin, S. Addressing bias in national population density models: Focusing on rural Senegal. PLOS ONE 2024, 19, e0310809. [CrossRef]
- Owusu, M.; Kuffer, M.; Belgiu, M.; Grippa, T.; Lennert, M.; Georganos, S.; Vanhuysse, S. Towards user-driven earth observation-based slum mapping. Computers, Environment and Urban Systems 2021, 89, 101681. [CrossRef]
- UNHCR. Global Trends: Forced displacement in 2023; United Nations High Commissioner for Refugees: Copenhagen, Denmark, 2024.
- Dooley, C.A.; Jochem, W.C.; Sorichetta, A.; Lazar, A.N.; Tatem, A. Description of methods for South Sudan 2020 gridded population estimates from census projections adjusted for displacement, version 2.0. WorldPop, University of Southampton. 2021. [CrossRef]
- Quinn, J.A.; Nyhan, M.M.; Navarro, C.; Coluccia, D.; Bromley, L.; Luengo-Oroz, M. Humanitarian applications of machine learning with remote-sensing data: review and case study in refugee settlement mapping. Philos Trans A Math Phys Eng Sci 2018, 376. [CrossRef]
- Darin, E.; Dicko, A.H.; Galal, H.; Jimenez, R.M.; Park, H.; Tatem, A.J.; Qader, S. Mapping refugee populations at high resolution by unlocking humanitarian administrative data. Journal of International Humanitarian Action 2024, 9, 14. [CrossRef]
- Ruktanonchai, N.W.; Ruktanonchai, C.W.; Floyd, J.R.; Tatem, A.J. Using Google Location History data to quantify fine-scale human mobility. International Journal of Health Geographics 2018, 17, 28. [CrossRef]
- Sinclair, M.; Maadi, S.; Zhao, Q.; Hong, J.; Ghermandi, A.; Bailey, N. Assessing the socio-demographic representativeness of mobile phone application data. Applied Geography 2023, 158, 102997. [CrossRef]
- Leasure, D.R.; Kashyap, R.; Rampazzo, F.; Dooley, C.A.; Elbers, B.; Bondarenko, M.; Verhagen, M.; Frey, A.; Yan, J.; Akimova, E.T.; et al. Nowcasting Daily Population Displacement in Ukraine through Social Media Advertising Data. Population and Development Review 2023, 49, 231-254. [CrossRef]
- Chi, G.; Abel, G.J.; Johnston, D.; Giraudy, E.; Bailey, M. Measuring global migration flows using online data. Proceedings of the National Academy of Sciences 2025, 122, e2409418122. [CrossRef]
- Flowminder Foundation; Hosner, R.; Strain-Fajth, Z.; Lefebvre, V. Using survey data to correct for representation biases in mobility indicators derived from mobile operator data to produce high-frequency estimates of population and internal migration. In Proceedings of Netmob 2023. [CrossRef]
- Donegan, C.; Chun, Y.; Hughes, A.E. Bayesian estimation of spatial filters with Moran’s eigenvectors and hierarchical shrinkage priors. Spatial Statistics 2020, 38, 100450. [CrossRef]
- Mets, K.D.; Armenteras, D.; Dávalos, L.M. Spatial autocorrelation reduces model precision and predictive power in deforestation analyses. Ecosphere 2017, 8, e01824. [CrossRef]
- Openshaw, S. The modifiable areal unit problem. Concepts and techniques in modern geography 1984.
- Lee, S.A.; Economou, T.; Lowe, R. A Bayesian modelling framework to quantify multiple sources of spatial variation for disease mapping. Journal of The Royal Society Interface 2022, 19, 20220440. [CrossRef]
- Paige, J.; Fuglstad, G.-A.; Riebler, A.; Wakefield, J. Spatial aggregation with respect to a population distribution: Impact on inference. Spatial Statistics 2022, 52, 100714. [CrossRef]
- Dungan, J.L.; Perry, J.N.; Dale, M.R.T.; Legendre, P.; Citron-Pousty, S.; Fortin, M.-J.; Jakomulska, A.; Miriti, M.; Rosenberg, M.S. A balanced view of scale in spatial statistical analysis. Ecography 2002, 25, 626-640. [CrossRef]
- Atkinson, P.M.; Stein, A.; Jeganathan, C. Spatial sampling, data models, spatial scale and ontologies: Interpreting spatial statistics and machine learning applied to satellite optical remote sensing. Spatial Statistics 2022, 50, 100646. [CrossRef]
- Kimberley, M.O.; Watt, M.S.; Harrison, D. Characterising prediction error as a function of scale in spatial surfaces of tree productivity. New Zealand Journal of Forestry Science 2017, 47, 19. [CrossRef]
- Project on Government Oversight. Dollars and Demographics: How Census Data Shapes Federal Funding Distribution; POGO: Washington DC, 2023.
- Chatzky, A.; Cheatham, A. Why Does the Census Matter?; Backgrounder, Council on Foreign Relations: New York, 2021.
- Desmon, S. CCP Part of $236 Million Contract that Conducts Key Health Surveys Worldwide. Commentary, Johns Hopkins Center for Communication Programs: Baltimore, 2024.
- Krieger, T.; Meierrieks, D. Population size and the size of government. European Journal of Political Economy 2020, 61, 101837. [CrossRef]
- Tuholske, C.; Gaughan, A.E.; Sorichetta, A.; de Sherbinin, A.; Bucherie, A.; Hultquist, C.; Stevens, F.; Kruczkiewicz, A.; Huyck, C.; Yetman, G. Implications for Tracking SDG Indicator Metrics with Gridded Population Data. Sustainability 2021, 13, 7329. [CrossRef]
- Fischhoff, B.; Davis, A.L. Communicating scientific uncertainty. Proceedings of the National Academy of Sciences 2014, 111, 13664-13671, doi:10.1073/pnas.1317504111.
- Vanyoro, K.P.; Hawkins, K.; Greenall, M.; Parry, H.; Keeru, L. Local ownership of health policy and systems research in low-income and middle-income countries: a missing element in the uptake debate. BMJ Global Health 2019, 4, e001523. [CrossRef]
- MacFeely, S.; Barnat, N. Statistical capacity building for sustainable development: Developing the fundamental pillars necessary for modern national statistical systems. Statistical Journal of the IAOS 2017, 33, 895-909. [CrossRef]
- Knittel, B.; Coile, A.; Zou, A.; Saxena, S.; Brenzel, L.; Orobaton, N.; Bartel, D.; Williams, C.A.; Kambarami, R.; Tiwari, D.P.; et al. Critical barriers to sustainable capacity strengthening in global health: a systems perspective on development assistance. Gates Open Res 2022, 6, 116. [CrossRef]
- Harrell-Bond, B.; Voutira, E.; Leopold, M. Counting the Refugees: Gifts, Givers, Patrons and Clients. Journal of Refugee Studies 1992, 5, 205-225. [CrossRef]
- Mayemba, C.N.; Nkashama, D.J.K.; Tshimula, J.M.; Dialufuma, M.V.; Muabila, J.T.; Didier, M.M.; Kanda, H.; Galekwa, R.M.; Fita, H.D.; Mundele, S.; et al. A Short Survey of Human Mobility Prediction in Epidemic Modeling from Transformers to LLMs (ver 1, 25 April 2024). arXiv:2404.16921 2024. [CrossRef]
- Institut National de la Statistique et de la Démographie du Burkina Faso. Volume I: Evaluation de la qualité des données, état, structure et dynamique de la population; https://www.insd.bf/fr/resultats, 2019.
- Institut de la Statistique du Mali. Resultats Globaux du RGPH5. https://www.instat-mali.org/laravel-filemanager/files/shares/rgph/rapport-resultats-globaux-rgph5_rgph.pdf; 2023.
- PNG National Statistical Office. Population Estimates 2021. https://www.nso.gov.pg/statistics/population/.
- Ninrew, C. S. Sudan population is 12.4 million - govt estimates. https://www.eyeradio.org/s-sudan-population-is-12-4-million-govt-estimates/.
- UNFPA. Hybrid Census. Technical Brief; United Nations Population Fund (UNFPA). https://www.unfpa.org/resources/new-methodology-hybrid-census-generate-spatially-disaggregated-population-estimates, 2019.
- UNFPA. The Value of Modeled Population Estimates for Census Planning and Preparation. Technical Guidance note; United Nations Population Fund (UNFPA). https://www.unfpa.org/sites/default/files/resource-pdf/V2_Technical-Guidance-Note_Value_of_Modeled_Pop_Estimates_in_Census.pdf, 2020.
- WHO; UNICEF. Geo-Enabled Microplanning Handbook: A product of the WHO-UNICEF COVAX GIS Working Group; 2023.
- UNICEF. Reach the Unreached - Geospatial modelling mapping methods. https://github.com/unicef-drp/reach-the-unreached?tab=readme-ov-file, 2025.
- Darin, E.; Leasure, D.R.; Tatem, A.J. Statistical population modelling for census support, United Nations Population Fund (UNFPA), Leverhulme Centre for Demographic Science, University of Oxford, and WorldPop, University of Southampton. https://wpgp.github.io/bottom-up-tutorial/. 2023. [CrossRef]
- Gutierrez, A. ECLAC and UNFPA approach to model populations in Latin America and the Caribbean, IAOS-ISI, Mexico City. https://www.inegi.org.mx/eventos/2024/iaos-isi/doc/34.pdf; 2024.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).