Mapping built-up land with high accuracy using Fourier transformation and temporal correction

Long-term, high-accuracy mapping of built-up land dynamics is essential for understanding urbanization and its consequences for the environment. Despite advances in remote sensing and classification algorithms, built-up land mapping using early satellite imagery (e.g., from the 2000s and earlier) remains prone to uncertainty. We mapped the extent of built-up land in Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 4 December 2020 © 2020 by the author(s). Distributed under a Creative Commons CC BY license. doi:10.20944/preprints202012.0105.v1


Introduction
Economic development and population growth have led to drastic changes of the Earth's terrestrial surface, not least through the expansion of built-up lands (Elmore et al., 2012), with urbanization continuing to accelerate in developing countries (United Nations, 2014). Built-up land is defined as land cover comprising more than 50% human-made structures such as roads, buildings, and agricultural and industrial facilities (Schneider and Mertes, 2014). Built-up land plays an essential role in material and energy cycling between humans and the environment, such as water and carbon cycling Hou et al., 2020;Wang et al., 2018), pollution (Shrivastava et al., 2019;Yue et al., 2020), agricultural production (Brown, 1995), biodiversity conservation (Filazzola et al., 2019), ecosystem services Ye et al., 2018) and climate change, both at micro and macro levels (Lamb et al., 2019).
Accurately tracking the dynamics of built-up land is essential to linking human activities with ecological, environmental, and climatic impacts and supporting policy and planning for sustainable development (Vannier et al., 2019). Tracking built-up land dynamics with high accuracy over time is a significant challenge because mapping for earlier years (e.g., the 1990s) is often less accurate than for more recent times (e.g., 2010 onwards). For example, Landsat Operational Land Imager (OLI, with data available from 2013) showed higher classification accuracy than Landsat Thematic Mapper (TM, with data available from 1984 to 2011) when mapping urban areas in East Attica, Greece (Poursanidis et al., 2015).
Inconsistencies can also arise when mapping through time, because bare soil has a similar spectral character to built-up land and is commonly misclassified (Gong et al., 2020;Li et al., 2015;Li and Gong, 2016). In addition, random noise such as cloud and cloud shadows can also lead to inconsistencies in the built-up land mapping (Foga et al., 2017). Unfavorable atmospheric conditions such as cloud cover can compromise satellite observation, hence built-up land mapping may require additional images from a broader temporal range . Increasing classification accuracy in earlier years, removing inconsistent classification, and using an appropriate time frame is therefore crucial for mapping built-up land with high accuracy through time.
Open-data policies combined with advances in computation facilities and innovative algorithms have enabled built-up land to be mapped at high resolution across larger extents, at greater frequency, and over longer periods than before . Two strategies are typically used to increase mapping accuracy and reduce inconsistencies: 1) integrating multisource data and 2) using temporal consistency correction. For example, the Visible Infrared Imaging Radiometer Suite (VIIRS) nighttime light (NTL) has been used as a binary mask to exclude non-urban land (Gong et al., 2020;He et al., 2019;Liu et al., 2019). Sentinel-1 Synthetic Aperture Radar (SAR) data have been merged with Landsat data to increase classification accuracy (Gong et al., 2020;Zhang et al., 2020). The tendency of built-up land to not revert to nature or agriculture (i.e., its irreversibility) has also been used to correct inconsistent multi-temporal classifications (Li et al., 2015) and produce stable and reliable control points . However, the accuracy of built-up land mapping in earlier years is typically lower than that based on more recent remote-sensing data (e.g., after 2013), possibly due to the spectral confusion of early data (Poursanidis et al., 2015).
Spectral features and vegetation indices have been used to map built-up land, but temporal features such as land surface phenology have typically been overlooked (Jönsson et al., 2018). Generally, temporal features are derived from indices such as the normalized difference vegetation index (NDVI) using a smoothing method (Wang et al., 2017), such as a logistic model (Elmore et al., 2012), a Savitzky-Golay filter (Chen et al., 2004), quadratic functions (Beurs and Henebry, 2004), and Discrete Fourier Transform .
Although temporal features have been coupled with change-detection methods to determine when other land types were converted to built-up land , they have been infrequently adopted as mapping predictors . Because temporal features capture relatively stable greenness patterns following interannual plant growth cycles, we hypothesize that they could also mitigate the spectral confusion of earlier satellite data and increase the accuracy of built-up land mapping.
This study aims to accurately track built-up land dynamics at three-year intervals from 1990 to 2019 using dense-stack Landsat data and temporal correction. The North China Plain region was chosen as the study area because of the fierce competition between urbanization and farmland over land resources (Jin et al., 2019). First, we used the Discrete Fourier Transformation to derive temporal features based on dense-stack indices of Landsat data (Odenweller and Johnson, 1984;Song et al., 2016). Second, we tested the performance of different predictors by adding those sequentially to the classification. A temporal correction algorithm was then deployed to remove inconsistent pixel classifications. Finally, we conducted a cross-product comparison to assess our results against other datasets (Stehman and Foody, 2019). Long-term, high-resolution, and high-frequency built-up land maps with high accuracy can be used as reliable inputs to understand regional urban development and link social-economic change to environmental impacts.

Methods
The approach taken in this study is summarized in Figure 1. Due to its high computational performance and vast historical satellite imagery, Google Earth Engine was used to process all remotely sensed data and map built-up land (Gorelick et al., 2017). Control points were visually checked using Landsat images from 1990-1992 and Google Earth highdefinition images (from GeoEye, WorldView, SPOT, and the Pleiades) from 2014 and 2019.
We randomly held out 25% of the control points as validation samples. Cloud-free images, indices images, temporal features, terrain, and meteorology data were sequentially added to a Random Forest (RF) classifier. A temporal correction algorithm was then applied to remove inconsistent classifications. Lastly, a cross-product comparison was carried out using the holdout samples.

Study area
Five middle and eastern provinces of China corresponding to the North China Plain region were selected as the study area ( Figure 2). The area spanned 780,000 km 2 and five provinces (i.e., Henan, Hebei, Shandong, Anhui, and Jiangsu) and two municipalities (Beijing and Tianjin). The study area is one of China's fastest developing regions, with the urban 7 population rate (excluding the two metropoles) tripling from ~20% in 1990 to ~60% in 2018 (National Bureau of Statistics of China, 2019b). The North China Plain holds a strategic position in China in terms of economic development and food security (Song and Deng, 2015), generating ~37% of the gross domestic product and ~35% of China's grain production in 2019 (National Bureau of Statistics of China, 2019aChina, , 2019b. The tension between rapid economic development and food production in the study area therefore calls for accurate quantification of built-up land dynamics to support policy formulation and decision making .

Data and input predictors
We used five types of remotely sensed data as predictors to map built-up areas (   and terrain data were taken from the Shuttle Radar Topography Mission. The spectral predictors were cloud-free images produced from Landsat and Sentinel-2A, the detailed data information can be seen in the supplementary information 1.1 (SI-1.1).
Cloud-free Landsat images were produced using the ee.Algorithms.Landsat.simpleComposite module in the Google Earth Engine platform. For each pixel in the collection of Landsat images, this module computed a cloud score (0-100) and derived the median value from pixels with a cloud score <10 to create a cloud-free image. For the cloud-free image of Sentinel 2 MSI, the Quality Assessment band, indicating whether the pixel was covered by cloud and cirrus, was used to remove cloudy pixels. After removing cloudy pixels, the median value of the remaining pixels was mosaicked to create a cloud-free image.
NDVI, EVI, and NDBI were selected as indices predictors because NDVI and EVI are robust for delineating land covers (Li et al., 2015), and NDBI suits the purpose of built-up mapping . We calculated these indices as follows: where NIR refers to the near-infrared band, R refers to the red band, B refers to the blue band, and SWIR1 refers to the first shortwave infrared band.
The Discrete Fourier Transformation approximates a series of discrete values by summing up a linear function and several pairs of sinuate functions. The fitting formulation was as follows: where is the time difference in year fractions compared to 1970, is the pixel value at time t, is the number of sinuate function pairs, 0 and 1 are the coefficients of the linear function, αk and θk are the sinuate coefficients, is the frequency, and is the error between the actual observation and the fitted value.
In practice, n and data volume should be determined before fitting. We chose n = 3 and used three years of normalized indices data for the transformation to minimize the sum of mean-squared error and avoid overfitting (SI-1.2). Meanwhile, we selected ω = 1 based on the life cycle frequency of most of the vegetation in the study area . The Discrete Fourier Transformation was applied to fit the normalized indices to generate temporal predictors ( Figure 3). Eight coefficients were generated per index, and a total of 24 coefficient bands were created as temporal predictors.

Control point collection
Control points were collected with visual interpretation (SI-2). Built-up and non-built-up control points were collected separately. Raw built-up points were taken from the National Settlements Database of China (http://www.resdc.cn/). Raw non-built-up points were generated via stratified sampling using NDVI. Following visual interpretation, a total of 8,000 control points were collected, with an equal number of built-up and non-built-up points (i.e., 4,000) to reduce bias caused by uneven sample distribution (Stehman and Foody, 2019). A sensitivity test showed that using ~50% of control samples was sufficient to achieve highaccuracy mapping (SI-3.1). Therefore, ~75% of control samples were used as the training sample for mapping (among them, 70% were used in classification and 30% to calculate accuracy), while the remaining ~25% were held out as validation samples for the cross-product comparison.

Classifying built-up land
Random Forest (RF) is an ensemble learning algorithm that was selected for classification due to its flexibility in capturing non-linear patterns between independent and dependent variables (Calderón-Loor et al.). The tree-based structure of RF is also efficient in classifying high-dimensional data . To grow trees for the RF, we set the split nodes as the square root of the number of input predictors, the bag fraction to 50%, and the minimum leaf nodes to 1. The number of trees in the RF was set to 100 because no more improvements could be achieved by further increasing tree number (SI-3.1). In the mapping process, built-up land pixels were allocated a value of 1 and non-built-up land pixels a value of 0.
Predictors were sequentially added to the classification in the following order: spectrum, indices, Fourier, terrain, and meteorology predictors. Traditional classification typically uses spectral and indices predictors, so we first mapped built-up land using those. The Fourier predictors were then incorporated to test classification improvement with temporal features.
Lastly, terrain and meteorology data were added to reduce interference by climate and terrain conditions.
All five types of predictors were used in mapping. We classified built-up land 10 times, each time using a different sample of random control points. In each classification, 70% of the training samples were randomly selected to train the RF classifier, and the remaining 30% were up extent were selected to produce a mask to update the original classification according to a "majority vote" rule (Li et al., 2015). In this study, we set n = 2 by balancing the amount of data used as a mask and resulting improvements in accuracy (SI-4). This temporal correction process was iterated eight times to remove inconsistencies.

Cross-product comparison
We compared our data to other datasets targeting a similar land cover (such as impervious surfaces and human settlements) or including an urban/built-up land layer (Table   2). Built-up area and overall accuracy were used for the comparison. The validation sample (25% of all control samples) was used to compute overall accuracy because it was not used in the original classification of this study, thereby ensuring a reliable comparison (Stehman and Foody, 2019).

Built-up land mapping using different predictors
The performance of different predictors is shown in Figure 5.  Built-up land mapping in 1990-1992 was selected to explore the spatial performance of spectral, indices, and Fourier predictors ( Figure 6). Region 1 (row 1, Figure 6   accuracy (Figure 7b). The Scan-Line Corrector failure of Landsat ETM+ produced inconsistent strips, which were accounted for by the temporal correction (regions A and C). Greenhouses (hazy and light gray patches in region B) were correctly removed by the temporal correction.

Accuracy improvements via temporal correction
Finally, the temporal correction also improved classification quality in hilly and barren areas (region D).

Spatial-temporal change of built-up land
We mosaicked all temporally corrected classifications into one image and used a warmcool color scheme to represent time from 1990 to 2019 (Figure 8). Cities in flat regions (e.g., Baoding, Shangqiu, and Changzhou) expanded radially outward. Cities near rivers (e.g., Xuyi and Xinyang) expanded against those, and cities in mountainous areas (e.g., Fengning) expanded along valleys.

Cross-product comparison
For the spatial comparison, we selected four cities that were evenly distributed across the study area and with datasets with three or more epochs (Figure 10). The low-resolution datasets (ESACCI, Global Urban Expansion, and MCD12Q1) missed many built-up lands in smaller villages and towns. The Global Urban Dynamics used high-resolution images of Landsat as input data, but the binary mask generated from the VIIRS NTL missed out-of-boundary builtup land. However, our data and the GAIA product captured built-up land in large cities and smaller villages and towns. We further compared built-up areas and overall accuracy between our study and other datasets (Figure 11a). Our study computed the second largest built-up area throughout the study period, similar to other high-resolution datasets (i.e., GAIA, GHSL, and GlobeLand30). In the 1990s, our data were in agreement with GAIA and GHSL, while the ESACCI, Global Urban Dynamics, and Global Urban Expansion estimates were significantly lower than ours. GAIA, ESACCI, and our data showed rapid increases in built-up area, while GHSL, MODIS, Global Urban Dynamics, and the Global Urban Expansion showed linear increasing trends.
Due to their similar spatial resolution and land-cover definition, GAIA, Global Impervious Surface, and GlobeLand30 were selected for accuracy comparisons. The accuracy of our study was higher than GHSL and Global Impervious Surface by 10%, and 10-19% higher than GAIA, especially in earlier years. Our accuracy was consistently >94% across years, while that of the Global Impervious Surface and GlobeLand30 were ~85%, and GAIA's accuracy ranged from 75% in 1990 to 84% in 2017.

Discussion
Landsat has a long and continuous archive, which offers a unique opportunity for global and regional land observation (Deng and Zhu, 2020). However, lower quality data in earlier years can lead to poor performance in built-up land mapping and higher uncertainty in earlier points of time series (Gong et al., 2020;Poursanidis et al., 2015) . In this study, we used coefficients from a Discrete Fourier Transformation as temporal predictors and derived an 8% accuracy gain compared to using spectral and indices predictors only. Employing the temporal variables increased mapping accuracy for earlier years, even when using lower quality data (Landsat 5 TM or Landsat 7 ETM+), rivaling the mapping accuracy of more recent, higherquality sensor data (Landsat 8 OLI and Sentinel MSI data). Bare land and farmland were commonly misclassified as built-up land using spectral and indices predictors only (Gong et al., 2019;Liu et al., 2018), but these errors were largely removed following the inclusion of temporal predictors. Our results captured finer-scale built-up features, such as buildings in small villages and towns, rather than solely focusing on large cities. As a result of higher accuracy, our study computed higher estimates of built-up areas than other datasets, except for GlobeLand30.
The effectiveness of Fourier predictors in delineating built-up lands is possibly because features captured in dense time stacks of remotely sensed data are less affected by random noise (e.g., cloud, cloud shadow, and seasonal changes in land surface) than snapshot spectral or indices images. Crop phenology and farming rotations lead to a regular greenness pattern in cultivated sites over the annual growing cycle, which is distinct from that of built-up lands . Temporal predictors were sensitive to this distinction between built-up land and farmland. Similarly, it is difficult for spectral predictors and normalized indices to separate fallow lands from built-up lands, leading to uncertainties in mapping. Incorporating temporal predictors reduces this confusion and substantially increases the accuracy of built-up land mapping.
In addition to incorporating temporal predictors, we also implemented temporal correction to account for the general feature of irreversibility in built-up lands (i.e., once an area is converted to urban land, it tends to remain as urban land use (Li et al., 2015;;. We implemented the rule that built-up land in earlier years was unlikely to occur beyond the extent of built-up land in later years, thereby developing a temporal correction algorithm to remove incorrect pixel classifications. For example, our algorithm removed the misclassified strips caused by the ETM+ Scan-Line Corrector failure. Our method can be deployed on the Google Earth Engine platform and is more straightforward than other temporal correction algorithms. For example, Li et al. (2015) combined a majority vote rule and temporal reasoning to construct a spatial-temporal check algorithm, which required a complicated process to combine transition probabilities and neighborhood characteristics. In this study, we used built-up land maps of future years to iteratively mask out erroneous classified pixels in previous years, which was more straightforward to implement.
As a result of using temporal predictors and temporal correction, we achieved consistently high accuracy in built-up land mapping over a time series spanning 30 years. High uncertainty in built-up land mapping in the 1990s and 2000s has been reported in previous studies (Gong et al., 2020), and GAIA and GlobeLand30 showed lower performance in these years ( Figure 11). Overall accuracy in this study was high and consistent across years (>94%).
The high quality of the data in this study addresses some of the challenges in understanding built-up land development and reduces uncertainties involved in quantifying land-use change induced environmental impacts (Dong et al., 2015). Built-up land occupies only a small portion of the global terrestrial surface but hosts more than half of the world's population . Indeed, 70% of global anthropogenic greenhouse gas emissions in 2016 and 80% of local natural habitat loss in 2018 have been linked to the development of built-up lands (Hopkins et al., 2016;Ke et al., 2018). Therefore, an accurate understanding of the dynamics of built-up land over time is critical to addressing the social and environmental challenges that threaten a sustainable future in rapidly urbanizing areas, including our study area.
Our consistent, high-accuracy data product could be used in urban policy and planning.
For example, urban growth models based on cellular automata use historical data to project future scenarios, but errors in historical data can propagate throughout the projection, reducing confidence in the results (Clarke and Johnson, 2020). This study provides reliable historical data that enables built-up area development to be projected with high confidence.
Environmental change can also be quantified more precisely using accurate built-up land maps.
For example, the Integrated Valuation of Ecosystem Services and Trade-offs (InVEST) model uses land-use maps as a proxy to calculate carbon sequestration, water yield, crop production, and habitat quality (Tallis et al., 2011). High-quality built-up land mapping data can provide more reliable input variables to calculate the anthropogenic impacts of urbanization.
Spatially explicit policies and planning are essential for supporting sustainable development, and one requirement for formulating such policies is access to high-quality data.
The study area is unique in China for its strategic position, rapid urbanization, and essential agricultural productivity (Song and Deng, 2015). To boost the economy of the study area, the Chinese government has announced a series of development plans, such as the Beijing-Tianjin-Hebei Urban Agglomeration development plan (Fang et al., 2019)

and the Central Plains City
Group development plan . These plans include mega-infrastructure projects (e.g., high-speed railways and long-distance expressways) to enhance economic flow among cities . In parallel, to safeguard food security, the Chinese government has also enacted strict farmland protection regulations (e.g., the Basic Farmland Protection Regulations) that prohibit farmland from being converted to built-up lands . Therefore, an accurate understanding of built-up land dynamics is critical to formulating effective development plans that balance rapid urbanization with increased demand for food production (Zhong et al., 2020). Accurate historical built-up data can be used to project future economic development, as well as derive opportunity costs for future urban expansion (e.g., in terms of reduced food security). As a result, urbanization, food security, and sustainability can be coordinated under one framework, promoting the formulation of spatially explicit policies and regulations.
Our study suffers from several limitations and uncertainties. We derived temporal features from the Discrete Fourier Transformation based on three years of indices images.
Hence the exact date of built-up land development cannot be determined at a finer resolution than three years. Another uncertainty was introduced by the temporal correction methodology, which assumes that built-up land in 1990-1992 remained unchanged during the study period.
Small areas of built-up land could have been converted to other land types over time (Fu et al., 2019). However, such conversions typically only comprise a small portion of the total built-up land area (Gong et al., 2020).

Conclusion
In this study, we incorporated temporal features based on a dense time stack of Landsat imagery and a temporal correction method to map the spatial extent of built-up land in the North China Plain (1990-2019). Incorporating temporal predictors increased overall accuracy by 8% compared to using spectral and indices predictors alone. The temporal correction successfully removed incorrectly classified pixels and increased overall accuracy in all periods to a consistently high level (>94%). All provinces and cities in the study region tripled their built-up area over the last three decades, revealing fierce competition between urban and agricultural land uses. Consistent, high-accuracy and long time-series of built-up land data can be used to understand recent patterns of rapid urbanization, quantify impacts for food security and the environment, model future land-use change, and inform policy and planning for managing future urbanization and sustainable development.

Funding
This research was funded by a Deakin University Postgraduate Research Scholarship.

Declaration of Competing Interest
None