Spatio-Temporal Interurban Regularities in the Global South

The reality of people’s lives has shifted from rural to urban areas, where an ever-increasing proportion of the world’s population lives. Providing infrastructure to serve these areas, especially in the Global South, is a key task of sustainable development. A deep understanding of the spatial arrangement and scales of these urban structures and their temporal evolution can help to develop innovative solutions to issues of energy, water, or transportation infrastructure. For this purpose, we study the temporal evolution of urban built-up structures (Global Artificial Impervious Area) and population distributions (Global Human Settlement Population) in four regions of the Global South (Argentina, India, Egypt, and Nigeria). We qualitatively analyze regularity through the pair correlation function and subsequently identify typical scales within the different interurban systems. In doing so, we find that the large settlement objects in particular arrange themselves in a regular way, so that typical scales exist in urban systems: settlement objects are usually located about 20 to 40 km apart from each other. This information can be used to develop sustainable infrastructure concepts, for example for passenger transport between settlements.


Introduction
Our world is currently standing at a crossroads: industrialization and globalization of our lives and economies have led to unprecedented challenges that humanity has to face [1]. Among these, climate change and steady population growth combined with increasing urbanization are probably the most prominent [2,3]. Countries in the Global South in particular are facing exponential population growth [4,5]. This leads to the emergence of urban areas in these countries, with potential positive effects on society as they propel economic growth [6]. In contrast, urbanization, if not managed in sustainable ways, can lead to the destruction of biodiversity [7], pollution [8], health problems and poverty. A main problem is that there is often no adequate supply infrastructure available for the rapidly growing population. Against this background, recent research has focused on the field of sustainable infrastructure, as infrastructural "decisions [...] made now [...] will lock-in patterns of development for future generations" [9]. In this context, infrastructure is understood not only as transportation infrastructure but also as communication, energy management and many more (see [10]). The main goal is to design a decision process for the renewal and creation of infrastructure that is not only beneficial to the economy but also helps create a future worth living.
Since infrastructure is based on the idea of connecting structures and providing the possibility to transport matter, energy or information, the identification of typical scales is crucial [11]. It can lead to the development of standardized concepts and solutions [12]. The rich data provided by earth observation, for example, can help identify typical scales of urban systems and develop sustainable solutions to infrastructure issues [13].
As many approaches are currently evolving to reach the goal of sustainable infrastructure, a major obstacle to overcome is the lack of predictability of urbanization processes [9]. Without a basic understanding of the transformation processes of a structure, sustainability cannot be achieved. In our understanding, the key to achieving predictability is to define and determine system parameters that are able to describe the nature of specific structures. At this point we can exploit a recurrently identifiable feature of urban mechanisms, namely regularity. Regularity can be found in various spatial and temporal processes, for instance congestion [14], travel [15] or settlement distribution patterns [16]. Furthermore, regularity is a core property and starting point in a variety of urban models based on cellular automata [17], agent-based models [18] or pattern formation mechanisms [19]. This important role is attributed to regularity because it is often seen as a strong indicator for the existence of detectable mechanisms in natural and sociological processes [20,21]. The detection of mechanisms is facilitated because regular systems require less information to be included in the model equations [22]. Only this makes it possible to reveal explicit system parameters.
In our work we focus on the question of whether regularity can be found in settlement structures by analyzing built-up and population density data. We aim to describe qualitatively the occurrence of regularity and its evolution over time by investigating spatio-temporal satellite imagery of selected regions in the Global South. The methodology is described in section 2. Following this, the results of the analysis are presented in section 3, where we undertake the task of determining system parameters. The results are discussed and critically commented on in section 4.

Materials and Methods
Before presenting our approach in detail, an overview of the workflow is given (see Figure 1). As mentioned, the analysis comprises the investigation of regularity in two differently formatted data sets: a binary data set of the built-up structure and a semi-continuous data set of population density, meaning a concentration of persons per area. This choice stems from the consideration of the dynamics of the respective structure. Building infrastructure is created on longer time scales and lasts for a long time [23,24]. Demographic changes occur on shorter time scales, and population densities can depict residential densification [24,25]. Therefore, the two data sets can provide different system parameters of the same subject. After preparing the data, we analyze the regularity of the point patterns. We then identify regularities and quantify typical scales of urban systems, and finally discuss their implications for infrastructure systems.

Data Selection
For the analysis of discrete and continuous settlement structures two independent data sets are used. Discrete data are provided by the Global Artificial Impervious Area (GAIA) data set [26] with yearly images between 1985 and 2018 and a spatial resolution of 30 m in east-west and north-south extension at the equator. As the data set is not complete for some regions, we only include images between 1994 and 2018. Population density data are provided by the Global Human Settlement Population (GHS-POP) data set [27] with four time points at 1975, 1990, 2000 and 2015 and a spatial resolution of 250 m in both directions at the equator.
For the investigation, four different regions of the Global South, partly intersecting with regions analyzed by [16], are chosen. These agriculturally dominated regions are located in India, Egypt, Nigeria and Argentina (see Figure 2; for coordinates see Table A1). Each data set is an excerpt of roughly 200 km by 200 km in east-west and north-south direction. The selected areas have been chosen as they represent different aspects of the economies summarized under the heterogeneous term of the Global South. These aspects include a variety of demographic dynamics, political systems and stages of economic development. Furthermore, they feature an apparent regularity.

Creation of Point Patterns
In order to determine the distribution of objects, it is common to reduce the information provided by images of the structures of interest to essential attributes. In our case, those attributes at every time step are as follows:

1. Position, i.e. the coordinates of each object in the plane
2. Size of the respective object, i.e. area or population

This can be done by abstracting objects to their mass center position and assigning the size of the object to its mass center. In the case of built-up structures this procedure is straightforward, as geometrical mass centers can be determined trivially [16]. In contrast, population density objects cannot be treated in the same manner: for concentrations, density centers have to be considered, and these do not correspond to geometrically determined mass centers. Furthermore, the attribute associated with each settlement object is its population, which corresponds not to the surface of the respective object but to its accumulated cluster population. In order to identify population clusters and their centers, we use a modified version of the Clustering by Fast Search and Find of Density Peaks algorithm proposed by [28]. The procedure used here is shown by way of example (see Figure 3):

1. We assume a population density distribution with three density peaks.
2. Each pixel or point $P_i$ has an assigned population density $\rho_{\mathrm{pop},i}$, with which the difference to the densities of all other points can be determined.
3. Through this, the cluster density $\rho_{\mathrm{cl},i}$ of each point can be calculated. Each cluster density is normalized with the maximal value $\max(\rho_{\mathrm{cl},i})$.
4. For the point with the highest cluster density the distance is set to one, $\delta_i = 1$. Now the cluster density and distance of each point can be mapped. Points resembling cluster centers have a higher cluster distance and can therefore be identified.
5. The sole identification via the presented approach is not sufficient, because points of higher cluster density can be falsely selected as centers and centers with low cluster density could be neglected. Therefore, we define a cluster criterion that penalizes points of high density and short distance and favors points of lower density and larger distance. Empirically, the value is set to $\gamma_{\mathrm{cl}} = 0.02 + \delta\rho_{\mathrm{cl}}$.
6. Finally, clusters are assigned to the identified cluster centers by calculating the cluster boundary $d_b$ and allocating all points within this boundary.
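The general idea of steps 1 to 6 can be illustrated with a minimal sketch of the standard density-peaks scheme. It is not the modified procedure used here: the Gaussian-weighted density, the plain threshold `delta_min` in place of the criterion $\gamma_{\mathrm{cl}}$, the nearest-center assignment in place of the boundary $d_b$, and all function and variable names are assumptions made for illustration only.

```python
import math

def density_peaks(points, weights, d_c):
    """Return normalised local densities rho and distances delta for all points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumption: Gaussian-weighted local density with cutoff scale d_c.
    rho = [sum(weights[j] * math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n))
           for i in range(n)]
    rho_max = max(rho)
    rho = [value / rho_max for value in rho]  # normalise with the maximal value
    d_max = max(max(row) for row in dist)
    delta = []
    for i in range(n):
        # Distance to the nearest point of higher density, scaled by d_max;
        # the densest point has no such neighbour and gets delta = 1.
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) / d_max if denser else 1.0)
    return rho, delta

def find_clusters(points, weights, d_c, delta_min):
    """Pick centres by a plain delta threshold and sum the weights per cluster."""
    rho, delta = density_peaks(points, weights, d_c)
    centers = [i for i in range(len(points)) if delta[i] >= delta_min]
    # Assumption: each point joins its nearest centre (no boundary d_b here).
    labels = [min(centers, key=lambda c: math.dist(points[i], points[c]))
              for i in range(len(points))]
    population = {c: sum(w for w, lab in zip(weights, labels) if lab == c)
                  for c in centers}
    return centers, labels, population

# Toy data: three density peaks with four low-weight neighbours each.
offsets = [(0, 0), (0.5, 0), (-0.5, 0), (0, 0.5), (0, -0.5)]
peaks = [((0, 0), 10), ((10, 0), 8), ((0, 10), 6)]
points = [(cx + dx, cy + dy) for (cx, cy), _ in peaks for dx, dy in offsets]
weights = [w if k == 0 else 1 for _, w in peaks for k in range(len(offsets))]
centers, labels, population = find_clusters(points, weights, d_c=1.0, delta_min=0.5)
print(centers, population)
```

In this toy example the three peak pixels are selected as centers, and each center carries the accumulated weight of its cluster, which corresponds to the cluster population attribute described above.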

System Parameters and Regularity
Regularity describes a uniform distribution of points in an area and is one of three possible distribution types that point processes can form [29]. The two other types are a clustered distribution, with points gathering around other points, and a random or Poisson distribution, with randomly distributed points (see Figure 4). With the point patterns now prepared, we can determine which point processes are dominant in empirical structures. In the field of urban studies such investigations commonly utilize the Average Nearest Neighbor (ANN) index by [30], which is able to quantitatively describe the regularity of a point pattern [16,31,32]. A major drawback of the ANN index is its dependency on the chosen excerpt size. As we intend to provide a holistic perspective on regularity and system parameters, we make use of the pair correlation function (PCF) typically employed in the field of ecology [33].
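The dependence on excerpt size can be made concrete with a minimal sketch of a nearest neighbor index. We assume the classical Clark-Evans form here, i.e. the observed mean nearest-neighbor distance divided by the value $1/(2\sqrt{\lambda})$ expected for a random pattern of intensity $\lambda$; whether [30] uses exactly this form is an assumption, and the function name `ann_index` and the toy grid are hypothetical.

```python
import math

def ann_index(points, area):
    """Ratio of observed to expected mean nearest-neighbour distance.

    Values above 1 point towards regularity, values below 1 towards clustering.
    Assumption: Clark-Evans form without edge correction.
    """
    n = len(points)
    nearest = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)]
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # mean NN distance under randomness
    return observed / expected

# The same 5 x 5 unit grid, evaluated against two different excerpt areas.
grid = [(x, y) for x in range(5) for y in range(5)]
tight = ann_index(grid, area=25.0)
wide = ann_index(grid, area=100.0)
print(tight, wide)
```

The identical point pattern yields a clearly regular index for the tight excerpt and an index of a random pattern for the wide one, which illustrates why an excerpt-independent measure is preferable.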

The PCF and its related statistical methods (K and L functions [29,33]) are used to qualitatively describe the distribution of plant populations [21] or termite mounds [34]. The function is a normalized, gradient form of Ripley's K-function [35] and is defined as $g(r) = \frac{1}{2\pi r}\,\frac{\mathrm{d}K(r)}{\mathrm{d}r}$. The PCF depicts the intensity of points appearing when a circular ring is spanned around a central point. This intensity is then compared to an estimated intensity of a random point pattern. If the actual point pattern is randomly distributed, then the PCF takes the value 1.
If the distribution is clustered, then this value lies above 1 at first and approaches 1 with larger radii of the circular ring. In the case of regularity, the function value starts below 1, then rises above 1 and afterwards approaches 1 as well (see Figure 5) [29]. In order to analyze a whole region, the observed radii are extended to the edges of each excerpt and several settlement objects are taken as starting points for the PCF. The resulting graphs are merged into one, representing the overall distribution features. One of the advantages of the PCF is that several system parameters, in this case spatial ones, can be determined directly from the function. Especially in the case of regularity, the function is able to provide the parameters shown in Figure 6 and explained in Table 1 [29,34].
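The ring-counting idea behind the PCF can be sketched as follows. This is a minimal illustration, not the estimator used in this work: it omits edge correction and kernel smoothing, and the function name `pair_correlation` and its parameters are hypothetical.

```python
import math

def pair_correlation(points, area, r_max, dr):
    """Binned PCF estimate: pair counts per ring relative to a random pattern.

    Assumption: no edge correction, so values are biased low at large radii.
    """
    n = len(points)
    intensity = n / area
    n_bins = int(r_max / dr)
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            k = int(math.dist(points[i], points[j]) / dr)
            if k < n_bins:
                counts[k] += 2  # each unordered pair counts for both points
    radii, g = [], []
    for k, count in enumerate(counts):
        ring_area = math.pi * (((k + 1) * dr) ** 2 - (k * dr) ** 2)
        radii.append((k + 0.5) * dr)  # bin centre
        # Expected number of ordered pairs in this ring for a random pattern.
        g.append(count / (n * intensity * ring_area))
    return radii, g

# A regular 10 x 10 unit grid: no pairs at short range, a peak near r = 1.
grid = [(x, y) for x in range(10) for y in range(10)]
radii, g = pair_correlation(grid, area=100.0, r_max=3.0, dr=0.5)
print(list(zip(radii, g)))
```

For the regular grid the estimate is zero in the first two bins and exceeds 1 in the bin that contains the lattice spacing, which is the signature of regularity described above.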

| Parameter | Interpretation |
| --- | --- |
| $r_0$ | minimum inter-point distance |
| $r_1$ | distance from typical point to near neighbors |
| $r_2$ | distance from typical point to regions with a small number of points beyond the nearest neighbors |
| $r_3$ | range of most frequent longer inter-point distance |
| $r_{\mathrm{corr}}$ | size of the observed point pattern |

As can be seen, information on all relevant spatial parameters is provided. A question left to address is how the actual point process can be determined from the PCF of a real structure, as Figure 6 shows an idealized function. Here, we use the idealized function as a benchmark from which we have determined four key criteria (see Figure 6 as well):

1. One of the function values before the most frequently occurring distance $r_1$ must be below 1. This rules out the possibility that the distribution is clustered, since in that case the intensity at small $r$ is immediately significantly above 1. However, this does not yet allow us to state whether the distribution is random or not.
2. The maximum function value must be significantly above 1, whereby randomness can be excluded, since the function values of a random pattern fluctuate around 1. This criterion alone cannot exclude a clustered distribution, because its maximum function value is also significantly above 1.
3. To exclude that a so-called clustered hardcore process [29] is present, it must be verified that $g(r > r_{\max})$, i.e. the function values after the maximum function value at $r_{\max}$, do not remain above 1. Such a hardcore process is characterized by a clustered distribution with points appearing only at a certain distance from the central points. In this case, criterion 1 would be satisfied and an apparent regularity would be erroneously assumed. Criterion 3 prevents this.
4. Finally, it must be evaluated whether the function values are statistically significant or lie in the range of absolute randomness, also called complete spatial randomness (CSR) [21,34]. To exclude the CSR case, a significance envelope is calculated. For this, a large number of random distributions of equal intensity (points per area) are generated and the 95th percentile is considered. All values that lie within this envelope can be considered not significant and therefore random.

The analysis of these criteria allows us to qualitatively determine the regularity of settlement structures at each time step provided by the satellite data. Our aim is not only to analyze the overall structure but also to gain an in-depth view of the spatial distribution of settlement objects. Therefore, we additionally investigate the dependence of regularity and system parameters on the minimal size of the considered objects. Consequently, at each time step a size threshold (meaning area or population) is applied below which objects are neglected. This threshold is increased successively, so that we reduce the number of objects included in the regularity analysis while increasing their average size (Figure 7). If this is done for all regions at all time steps, we can study which point process is present at which object size and how it evolves over time. This procedure can be compared to a wavelet analysis.
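The decision logic of the four criteria can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used here: the binned curve `g` and the pointwise CSR envelope `lower`/`upper` are assumed to be precomputed on the same radius bins, the function name `classify_pcf` is hypothetical, and labelling every pattern with a significant peak that fails criterion 1 or 3 as clustered is a simplification made for this illustration.

```python
def classify_pcf(g, lower, upper):
    """Classify a binned PCF curve as 'regular', 'clustered' or 'random'.

    g, lower and upper are equally long lists: the observed curve and the
    pointwise CSR envelope (criterion 4) for the same radius bins.
    """
    i_max = max(range(len(g)), key=lambda i: g[i])
    # Criterion 1: a significant value below 1 before the maximum.
    dips_before_peak = any(g[i] < 1 and g[i] < lower[i] for i in range(i_max))
    # Criterion 2: the maximum lies significantly above 1.
    peak_significant = g[i_max] > 1 and g[i_max] > upper[i_max]
    # Criterion 3: after the maximum the curve does not stay significantly above 1.
    returns_to_one = any(g[i] <= 1 or g[i] <= upper[i]
                         for i in range(i_max + 1, len(g)))
    if not peak_significant:
        return "random"
    if dips_before_peak and returns_to_one:
        return "regular"
    return "clustered"

# Hand-made example curves with a flat envelope of 0.6 to 1.4.
lower, upper = [0.6] * 6, [1.4] * 6
regular = classify_pcf([0.0, 0.2, 1.8, 1.1, 0.9, 1.0], lower, upper)
clustered = classify_pcf([3.0, 2.0, 1.5, 1.2, 1.0, 1.0], lower, upper)
random_like = classify_pcf([0.9, 1.1, 1.0, 0.95, 1.05, 1.0], lower, upper)
hardcore = classify_pcf([0.0, 2.5, 2.0, 1.8, 1.6, 1.5], lower, upper)
print(regular, clustered, random_like, hardcore)
```

The last curve mimics a clustered hardcore process: it satisfies criteria 1 and 2 but stays above the envelope after its maximum, so criterion 3 keeps it from being labelled regular.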

Results
In our investigation we concentrate on regularity and the relevant spatial system parameters with respect to the size or population of the involved objects over time. Firstly, we analyze regularity by creating contour diagrams for built-up and population density structures for all regions. Secondly, we examine how spatial parameters can be identified, focusing on $r_1$, the typical nearest neighbor distance, depending on object size and time.

Interurban Regularity
The analysis of the built-up settlement structures shows that urbanization is proceeding, as the overall maximal size of structures increases over time [3]. Furthermore, it reflects the aforementioned heterogeneity of the regions of the Global South, as the object size ranges over several orders of magnitude, with the biggest urban areas in India. If smaller urban structures are also considered, the areas are usually classified as clustered. Despite these differences, our investigation shows that regularity is ubiquitous in the selected regions (see Figure 8). Notably, regularity occurs in the form of bands that can be regarded as quasi-static, as they do not change significantly in size or over time (see Table 2).
A deviation can be found when comparing the number of bands: all regions have only one band, appearing at the 30 % line (30 % of the maximal occurring object area), except for India with three bands. Furthermore, the contour diagram of Nigeria does not show an uninterrupted regularity band; instead, the detected point pattern distribution changes from regularity to randomness over time.
Examination of the other occurring point distributions also reveals a difference between India and the other regions: if all objects are included in the pair correlation analysis, the patterns in India are regular. In contrast, in the other regions smaller objects tend to arrange around larger ones, displaying a clustered settlement distribution.
We now extend the analysis to population density structures, with the results presented in Figure 9. Inspection of the charts shows that the overall population of objects is stagnating or falling. Moreover, bands of different point distribution types can be found, which also behave quasi-statically. That said, further findings contrast with those for built-up structures: bands of regularity can only be found in the density structures of India. Apart from that, no connection between regularity and size can be found, neither at specific population thresholds nor as a correlation with the maximal object population. Despite this, the main conclusion is that regularity can be found in these structures as well. In our opinion, the differences in the results are linked to the small sample size with regard to spatial and temporal resolution, and therefore definite conclusions cannot be drawn.

System Parameters of Interurban Structures
In order to analyze the system parameters, we determine the distances between settlements that show the highest intensity in the PCF, depending on the applied lowest size threshold, for each region at every point in time. As already mentioned, this distance plays a significant role in regular point patterns but can also be used to describe non-regular patterns, although it is then less informative. For the investigation, the charts in Figure 10 and Figure 11 mark the distances $r_1$ that coincide with regularity. To provide a clear representation, the charts show an averaged distance and the detected point distribution type over all points in time with respect to the lower size threshold. Additionally, the averaged graph is accompanied by four parameter graphs at different points in time (cumulated time periods for built-up structures and all four time points for population density structures).
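The determination of the distance of highest PCF intensity as a function of the lower size threshold can be sketched as follows. This is a minimal sketch, not the procedure used to produce Figure 10 and Figure 11: the compact ring-count estimate `pcf_curve` omits edge correction, and all function names, the toy objects and the chosen thresholds are hypothetical.

```python
import math

def pcf_curve(points, area, r_max, dr):
    """Ring-count PCF estimate without edge correction (simplifying assumption)."""
    n = len(points)
    n_bins = int(r_max / dr)
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            k = int(math.dist(points[i], points[j]) / dr)
            if k < n_bins:
                counts[k] += 2
    norm = n * n / area  # number of points times intensity
    radii = [(k + 0.5) * dr for k in range(n_bins)]
    # The ring between k*dr and (k+1)*dr has area pi * dr^2 * (2k + 1).
    g = [counts[k] / (norm * math.pi * dr * dr * (2 * k + 1)) for k in range(n_bins)]
    return radii, g

def typical_distance_by_threshold(objects, thresholds, area, r_max, dr):
    """Map each size threshold to the radius at which the PCF is maximal."""
    result = {}
    for threshold in thresholds:
        kept = [(x, y) for x, y, size in objects if size >= threshold]
        if len(kept) < 2:
            continue  # no pair distances left to evaluate
        radii, g = pcf_curve(kept, area, r_max, dr)
        result[threshold] = radii[max(range(len(g)), key=lambda k: g[k])]
    return result

# Toy objects (x, y, size): large objects every 4 units, small ones in between.
objects = [(x, y, 100 if x % 4 == 0 and y % 4 == 0 else 1)
           for x in range(12) for y in range(12)]
r1 = typical_distance_by_threshold(objects, [0, 50], area=144.0, r_max=6.0, dr=0.5)
print(r1)
```

In the toy example the distance of highest intensity moves from the bin around the small spacing to the bin around the large spacing once the small objects are filtered out, which mirrors the increase of distance with rising threshold.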
Beginning with the built-up settlement structures in Figure 10, we first identify what one would expect intuitively: with increasing area threshold, the distance between settlements increases (the decrease of distance at higher thresholds is attributed to the decreasing number of objects and edge correction deviations). In three of the four regions (India, Egypt and Argentina) we are able to detect distances in connection with regularity, i.e. system parameters of these structures. Here, corresponding to the findings in the previous section, we investigate the parameters at 30 % of the maximal occurring object area, within an interval of ±5 % around this mark, and in the object area interval of $3 \cdot 10^5\,\mathrm{m}^2$ to $1 \cdot 10^6\,\mathrm{m}^2$. For the first range, we find that in the region of India two regular distances can be identified at 42 km and 44 km. For Egypt and Argentina multiple distances are found, between 20 km and 35 km for Egypt and between 20 km and 40 km for Argentina.
Regarding the second range, we find that for regions with regularity in this interval a second typical value lies at the 20 km mark. For India, we find an accumulation of points between 18 km and 22 km; for Argentina, three distances in this range are found, together with a stagnation of the average distance graph. Therefore, we derive that all settlements in a region that are larger than the mentioned threshold are arranged regularly at distances between 20 km and 40 km. Notably, the region of Nigeria does not correspond to these findings, as no average regularity is identified and the distances in the observed size threshold intervals range between 3 km and 8 km.
Furthermore, by analyzing the four temporal graphs we find that the distances between the settlements are shrinking. The graphs generally show a similar behavior, but the distances become successively smaller. For the region of India, for example, the peak distance is 51 km between 1994 and 2000 and 48 km between 2013 and 2018, which is an indicator of continuous urbanization (for all regions see Table 3). In contrast, the inspection of population density structures (see Figure 11) reveals that distances that coincide with regularity occur in all regions. The majority of these distances can be found in the calculated 5 % envelope around the 30 % line. Similarly to the contour diagram, the region of India additionally exhibits an accumulation of distances outside this interval, as regularity is found for a majority of lower thresholds. Moreover, the region of Nigeria shows regular distances within the interval, lying in a range of 1 km up to 5 km.
Despite this, when compared with built-up structures, the population density structures do not show a specific trend: the range in which regular distances appear is significantly larger, varying for example between 4 km and 50 km in the region of India (for all ranges see Table 4). Furthermore, the tendency towards smaller distances between settlements is not identifiable for population density clusters. These findings can be attributed to the small temporal sample size of the data, which makes the average distance results more sensitive to fluctuations of individual graphs.

Discussion
In this work we propose and carry out a novel approach to identify regularities and the associated system parameters in spatio-temporal data of interurban structures. We apply methods commonly used in plant ecology by utilizing the pair correlation function to determine regularity. Our investigation shows that regularity can be found in all investigated interurban, rurally dominated settlement structures of the Global South. Moreover, these findings are independent of the type of interurban structure but are more prominent in built-up structures. Additionally, we show that regularities are quasi-static despite the dynamic changes in population and degree of urbanization of the investigated regions. We further observe that regularity is strongly linked to the size of the observed objects and can be found when mostly objects larger than approximately 30 % of the maximal object size are included. In conclusion, we are able to identify parameters, in this case spatial ones, that describe these interurban structures. For built-up structures, objects arrange at a distance of approximately 20 km to 40 km from each other. Thus, when designing telecommunication, energy or educational infrastructure, as well as public transportation, public decision makers should take the presented scale into account, as [36] have shown. They state that "the level of infrastructure development in secondary cities was found to be somewhat related to the distance from the primary city" in Pakistan. The determined distance can therefore be used to propose guidelines on where to allocate funding for infrastructure projects to achieve sustainable development. Unfortunately, such guidelines can only be formulated on the basis of built-up structures and not of population density developments, as the limited number of time points does not allow comparable conclusions to be drawn.
Despite this, population density structures seem promising, as they exhibit regularity as well. Once improved data sets become available, the tool presented here allows such an analysis and can provide more insight into the scales of population growth in the Global South. Remarkably, our system parameter forms a counterpart to the intraurban scale already found by [12]. This indicates that urban development in the Global South takes place on specific spatial scales independent of region or cultural affiliation.
Nevertheless, we are aware that the sole detection of spatial regularity and spatial system parameters is not sufficient to describe the processes of urbanization holistically. The abstraction of interurban structures as point patterns neglects and reduces information and is therefore a decisive simplification. However, the finding of spatio-temporal regularities is important for the development of urbanization models, as it indicates that rather simple approaches could be used to describe these processes [22]. With this perspective on urbanization processes we are able to support the frequent use of regularity in the development of urban models.