A comparative analysis of cities and towns in the Seoul Metropolitan Region: Integrating landscape metrics and census data

Urban form is associated with both socio-economic and urban physical properties. This study explores the differences among urban forms in the Seoul Metropolitan Region by comparing census-based socio-economic variables with landscape metrics computed from remotely sensed data. To accomplish this, factor analysis and multidimensional scaling were applied to the selected variables and metrics. When all of the measures are considered together, four types of cities and towns emerge: 1) exurban-fragmented high growth, 2) exurban-fragmented low growth, 3) extensive compact urban core and 4) sub-urban compact high growth. The results indicate that fusing knowledge of the physical urban layout with knowledge of socio-economic characteristics is beneficial for a better understanding of urban spatial patterns. However, challenges remain in delineating each urbanized area and in selecting indicators for comparing urban form across cities and towns.


Introduction
As urbanization intensifies, more people will live on less land at higher density, creating social problems. The urban environment is among the most dynamic systems on Earth. Several decades of population explosion and accelerating urban growth have had profound environmental and socio-economic impacts in developing and developed countries alike (de Sherbinin et al., 2002; Longley, 2002). The processes of urban change are associated with both internal factors, such as simple population increase, and external factors, such as institutional regulations and globalization. During periods of urban growth these factors interact with each other. Of critical importance is linking urban patterns to their driving socio-economic and urban physical forces.
The goal of this study was to explore differences among urban forms by comparing socio-economic variables with landscape metrics derived from remote sensing for the 31 secondary cities and towns in the Seoul Metropolitan Region, South Korea. The study grouped the cities and towns comparatively using factor analysis and multidimensional scaling. These methods were applied to three variable sets: 1) census variables, 2) landscape metrics and 3) the combined census variables and landscape metrics.

Integration of remote sensing and census in urban studies
Abstracting urban change for comparison across cities and scales has a long tradition in the field of geography. Historically, geographers have examined how and why areas or spaces are the same or different. Urban geographers seek to understand and identify regular patterns of urban development based on environmental, demographic, socio-economic or political trends. The goal is to identify the laws which it is believed govern the observed spatial arrangements. These originally were expressed in the concentric zone theory by Burgess (1924), the sector theory by Hoyt (1939), the multiple nuclei theory described by Harris and Ullman (1945) and in von Thunen's bid-rent theory (1826). Geographers tested hypotheses derived from those models, analyzing census tract data with multivariate statistical methods (Johnston, Gregory, Pratt and Watts, 2000). While census data provide a good statistical view of urban pattern based on enumeration units, the actual spatial patterns of urban infrastructure are increasingly being captured by remotely sensed data.
With the advent of high resolution satellite imagery and more advanced image processing and GIS technologies, remote sensing has begun to play a substantive role in measuring and monitoring urban patterns. Remote sensing techniques are being extended and developed for urban applications in a range of ways. One common approach is to analyze the spatial pattern of the land use and land cover categories obtained from image classification. In addition, landscape metrics have been used for urban applications with remotely sensed imagery (Feng et al., 2015; Hasse, 2003; Herold et al., 2001; Jat et al., 2008; Ji et al., 2006; Liu and Yang, 2015; Schneider et al., 2008; Schwarz, 2010; Siedentop and Fina, 2010; Siedentop and Fina, 2012; Sudhira et al., 2004; Sun et al., 2013; Torrens et al., 2000). Landscape metrics are scene, class and patch-based statistical descriptors of the spatial forms that make up the landscape. Research has shown that they can be descriptive of land use and other features of urban form. Although progress has been made in using landscape metrics to measure urban form through the development of urban sprawl indicators, integrating landscape metrics with demographic and socio-economic measures remains largely unexplored. Only a few studies (e.g. Benza et al., 2016; Toit and Cilliers, 2011; Weeks, Larson and Rashed, 2003) have used landscape metrics and census-based demographic variables in urban pattern analysis.

Study area
The Seoul Metropolitan Region (SMR) has a population of 25.6 million (2016) and is ranked as the fourth largest metropolitan area in the world. It occupies 11.7% of Korea's territory, and has played a significant role in Korea's urban development. Since the economic development and the consequent rapid urbanization of the 1960s, the development and formation of secondary cities in the SMR have been centered on the development of Seoul itself. The SMR contains Seoul city's administrative district, Incheon city's administrative district and Gyeonggi Province (Figure 1). The gross area is about 11,704 km².
The subject of this research is 21 cities (Shis) and 10 towns (Guns) of Gyeonggi Province. After the peak of Seoul city's population growth had passed and its housing capacity had reached its limit, another phase of urban growth began in the SMR after a turning point in the early 1990s. Through the 1990s, the government continued urban renewal and housing redevelopment projects that focused on replacing outmoded houses mainly with new apartment units (Kim, 2014). These projects reinforced the urban core toward a more compact development in Seoul. On the other hand, they contributed to the overcrowding of Seoul and encouraged people to move to sub-urban areas. The population of Seoul has decreased since the 1990s, but the SMR is growing faster than any other region in Korea. In particular, the construction of five new towns in the late 1980s and early 1990s and the large-scale housing development projects of the 1990s accelerated the growth of sub-urban areas. Because population growth rates differ with distance to Seoul within the SMR, each city or town has a different landscape of housing types. More populated regions are more apartment-dominated and less populated regions are more house-dominated. Apartment-dominated areas are generally closer to Seoul.

Statistical techniques to characterize urban growth
We are interested in testing statistically derived ideas about urban growth patterns and a methodology for integrating them with landscape metrics computed from Landsat imagery. We classified the cities with census data and landscape metrics by using the multivariate methods of factor analysis and multidimensional scaling. The general framework of the statistical methods is presented in Figure 2. We analyzed variables that have been used to profile Korean cities (Cho, 2015; Cho and Yim, 2001; Kim, 2014; Kim and Sakong, 2006; Park et al., 2009; Sakong, 2004). Using factor analysis, we determined which variables are suitable for analysis and which common factors have driven the urban form. Then, using multidimensional scaling, we categorized the cities into groups.

Variables for Socioeconomic Analysis
We analyzed 22 variables that fall under five categories: population, social, economic, spatial and institutional characteristics. Table 1 lists the variables under each of these five categories. The data were collected for 31 municipalities in Gyeonggi Province, Korea for the year 1999. The main data sources were government publications: the Population and Housing Census Reports by the National Statistical Office, and the Statistical Yearbooks by local governments.

Variables for landscape metrics analysis
For landscape metrics analysis, a cloud-free Landsat TM scene of the Seoul Metropolitan Region from 2000 was used. Reference data used in this study included: (1) digital topographic maps derived from the Korean National Geographic Information Institute; (2) land cover maps of the SMR for 1987-1999 generated by the Ministry of Environment of Korea at scales of 1:5000 and 1:25000; (3) digital thematic maps derived from the Korean National Geographic Information Institute. To capture the complex dimensions of urban patterns, fifteen landscape metrics were computed from the land use map showing patches of similarly classified land: class area (CA), percentage of landscape (PLAND), number of patches (NP), patch density (PD), largest-patch index (LPI), total number of edges (TE), edge density (ED), landscape-shape index (LSI), area-weighted mean patch size (AREA_AM), mean patch area (AREA_MN), mean patch fractal dimension (FRAC_MN), perimeter-to-area mean fractal dimension (PARA_MN), perimeter-to-area fractal dimension (PAFRAC), mean Euclidean nearest-neighbor distance (ENN_MN), and contagion (CONTAG). The descriptions and units of these measures are listed in Table 2.
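As an illustration of how class-level metrics of this kind are derived from a classified raster, the following minimal sketch computes CA, PLAND, NP, PD, LPI and AREA_MN for a single class. It is not the software used in this study. The function names, the boolean `mask` input, the 30 m default cell size (chosen to match Landsat TM) and the 8-neighbour patch rule are all assumptions made for the example.

```python
import numpy as np

def label_patches(mask):
    """Label connected patches of True cells (8-neighbour rule, an assumption)."""
    rows, cols = mask.shape
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and labels[r, c] == 0:
                current += 1
                labels[r, c] = current
                stack = [(r, c)]
                # Flood fill: visit every cell connected to the seed cell.
                while stack:
                    y, x = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = current
                                stack.append((ny, nx))
    return labels, current

def class_metrics(mask, cell_size_m=30.0):
    """Class-level metrics for one land cover class (hypothetical helper)."""
    cell_area = cell_size_m ** 2                 # m^2 per cell
    total_area = mask.size * cell_area           # landscape area in m^2
    labels, n_patches = label_patches(mask)
    patch_cells = np.bincount(labels.ravel())[1:]  # drop background label 0
    patch_areas = patch_cells * cell_area
    class_area = patch_areas.sum()
    return {
        "CA": class_area / 10_000,                    # hectares
        "PLAND": class_area / total_area * 100,       # percent
        "NP": n_patches,                              # count
        "PD": n_patches / total_area * 10_000 * 100,  # patches per 100 ha
        "LPI": patch_areas.max() / total_area * 100 if n_patches else 0.0,
        "AREA_MN": patch_areas.mean() / 10_000 if n_patches else 0.0,
    }
```

Calling `class_metrics(urban_mask)` on a boolean array in which True marks urban cells returns a dictionary keyed by metric abbreviation. Edge and shape metrics such as ED and LSI would need additional perimeter bookkeeping that is omitted here.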

Factor Analysis
Factor analysis is used to discover the underlying dimensions of a set of interrelated variables. It groups metric variables (interval or ratio scaled) into factors, where a factor is an underlying quality found to be characteristic of the original variables. The first factor explains the largest share of the variance in the data, and each successive factor explains less. Of the available factor extraction methods, principal components analysis was used in this study. We used the eigenvalue associated with each factor as an indicator of its substantive importance, and extracted the factors whose eigenvalues were greater than 1. Each factor can be viewed as an independent, but correlated, aspect of the urban form.
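The extraction rule described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the study: it performs a principal components extraction on the correlation matrix and keeps components with eigenvalues greater than 1, without any rotation. The function name `principal_component_factors` and the input layout (rows are municipalities, columns are variables) are assumptions.

```python
import numpy as np

def principal_component_factors(data):
    """Principal-component extraction keeping eigenvalues > 1 (no rotation)."""
    X = np.asarray(data, dtype=float)
    # Standardize each variable so the analysis uses the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                         # eigenvalue-greater-than-1 rule
    # Loadings: correlations between each variable and each retained factor.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    # Percent of total variance explained by each retained factor.
    explained = eigvals[keep] / eigvals.sum() * 100
    # Standardized factor scores for each case (unit variance per factor).
    scores = Z @ eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return eigvals, loadings, explained, scores
```

The `explained` values correspond to the "percent of total variance" figures reported for each factor, and `scores` gives one coordinate per retained factor for each municipality.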

Multidimensional scaling
Using two factors from the factor analysis, Multi-Dimensional Scaling (MDS) was conducted to provide a visual representation of the pattern of proximities (i.e., similarities or distances) among cities and counties in the SMR. The data were measured with an "objective" similarity measure such as total population and distance to Seoul. The data represent the degree of similarity of pairs of objects. In this study, these objects were the 31 municipalities in the SMR. The points are arranged in a space so that the distances between pairs of points have the strongest possible relation to the similarities among the pairs of objects. That is, two similar towns are represented by two points that are close together, and two dissimilar towns are represented by two points that are far apart. The space is usually a two- or three-dimensional Euclidean space (Young, 1984).
Table 2 (excerpt): Landscape metrics, with description, units and range
CA: the sum of the areas (m²) of all patches of the corresponding patch type, divided by 10,000 (to convert to hectares). Units: hectares. Range: CA > 0, without limit.
PLAND: the sum of the areas (m²) of all patches of the corresponding patch type, divided by total landscape area (m²), multiplied by 100 (to convert to a percentage). Units: percent. Range: 0 < PLAND ≤ 100.
NP: the number of patches of the corresponding patch type (class). Units: none. Range: NP ≥ 1, without limit.
PD: the number of patches of the corresponding patch type divided by total landscape area (m²), multiplied by 10,000 and 100 (to convert to 100 hectares). Note: total landscape area (A) includes any internal background present.
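The idea of arranging points so that their distances reproduce the dissimilarities can be illustrated with classical (Torgerson) metric MDS. This is only one of several MDS algorithms and is not necessarily the one used in this study. The function name `classical_mds` and the assumption that the input is a matrix of factor scores are illustrative.

```python
import numpy as np

def classical_mds(points, n_dims=2):
    """Classical (Torgerson) MDS on Euclidean distances between rows of `points`."""
    X = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix between cases.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    n = D.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_dims]   # largest eigenvalues first
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negatives
    coords = eigvecs[:, order] * np.sqrt(lam)
    return D, coords
```

When the input has only two columns, as with two factor scores per municipality, a two-dimensional Euclidean configuration can reproduce the pairwise distances exactly, up to rotation and reflection. The signs and orientation of the axes are therefore arbitrary.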

Census data analysis
As a first step, we applied factor analysis to the 22 census variables and 21 subject areas. We removed 10 variables that contributed little to the explanation of total variance and were highly correlated with other variables. The results of the KMO and Bartlett's Test (Table 3; Sig. = .000) confirmed that the sampling adequacy was satisfactory (greater than 0.5) and that the relationship among variables was strong, so the data are valid for factor analysis. The final two factors extracted account for 40.1 percent and 39.9 percent of the total variance, respectively. Factor 1 explained house type, cultural infrastructure, distance to Seoul and financial independency rate, and factor 2 accounted for population, urbanization, and industry (Table 4). The distances between each case and every other case were generated by MDS. The Euclidean distance model constructed a configuration using the two major dimensions of a standardized version of a distance matrix. This configuration captured almost all the variation in the census variables, having a goodness of fit statistic of 100%. We plotted the distance model for each of the 31 community areas to give an idea of these configurations under various metrics.
This resulted in four groups of cities and counties (Figure 3). The cities and counties in the first group (the top right of Figure 3) have medium-sized populations and urban areas, and are located further from Seoul, at an average distance of 57 km. There is little conserved natural area or land development regulation, so more than 35% of the land is developed for manufacturing industry and housing. These communities have a relatively high financial independency rate of 60.6%.
The second group of cities is clustered at the top left of Figure 3. This cluster has low (less than 5%) urbanization, low population, low financial independency (40%) and few employees in all industry sectors, and it is further from Seoul and dominated by houses rather than apartments. The third group, at the bottom right, has a long history of urbanization, slow urbanization, high population, large urban area and a high financial independency rate (91.2%), and is relatively more apartment-dominated and closer to Seoul.
In this group, the commuting-in-town rate is relatively low at 59% and more than 40% of people work in Seoul. The last cluster, at the bottom left, shows rapid urbanization, high population and a relatively high financial independency rate (69.8%). These communities are apartment-dominated and close to Seoul. The commuting-in-town rate is very low at 44.7%. More than half of the population commutes to Seoul for work and school.
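The two adequacy checks reported before factor extraction can be sketched with their standard textbook formulas. This is a minimal illustration under the assumption that the input is a cases-by-variables matrix. The function names are hypothetical, and the p-value of Bartlett's statistic is not computed here because it would require a chi-square distribution function from an additional library.

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's test of sphericity: chi-square statistic and degrees of freedom."""
    X = np.asarray(data, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # log|R| is <= 0 for a correlation matrix; it is 0 only if R is the identity.
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    X = np.asarray(data, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    # Partial correlations from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(R.shape[0], dtype=bool)        # off-diagonal entries only
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)
```

A KMO value above 0.5 is the threshold referred to in the text. A large Bartlett statistic relative to its degrees of freedom indicates that the correlation matrix differs from an identity matrix, that is, that the variables are related strongly enough to factor.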

Landscape Metrics
We applied factor analysis to the 15 variables (landscape metrics) and 31 subject areas. We removed 8 variables that contributed little to the explanation of total variance or were highly correlated with other variables. The results (Table 5) from the KMO and Bartlett's Test confirmed that the sampling adequacy was satisfactory (greater than 0.5) and that the relationship among the variables was strong. The data were therefore valid for factor analysis. This factor model specification was characterized by two common factors that accounted for 51.5 percent and 33.1 percent of total variance, respectively (Table 6). Factor 1 collects four items: LPI, PLAND, Mean Patch Size and LSI. Factor 2 collects three items: CONTAG, Edge Density and Mean Euclidean Distance. The Euclidean distance model constructed a configuration using the two major dimensions of a standardized version of a distance matrix, generated by MDS. This configuration captures almost all the variation in the landscape metrics, having a goodness of fit statistic of 100%. We plotted the distance model for each of the 31 community areas to give an idea of these configurations under various metrics. The result revealed four groups of cities and towns (Figure 4). The cities in the first group (the top right of Figure 4) have the lowest urban percentage (4.9% on average) and the highest fragmentation among the four groups. These communities show a small LPI, a small mean patch size and a large LSI. The second group of cities is clustered at the top left of Figure 4, with the second lowest urban percentage (9.3% on average) and relatively high fragmentation with high LSI and mean patch size. The third group, at the bottom right, has the highest percentage of urban area (45%) and the lowest fragmentation, with large LPI and mean patch size and low LSI.
The last cluster at the bottom left shows a relatively high urban percentage (30.9%) with a relatively low fragmentation status.
There are four main differences between the grouping in this landscape metric analysis and that in the census data analysis. First, cluster 1 and cluster 2 from the census study changed their x-y coordinates in the dimensional distribution. The first group of cities from the census study exhibited larger population size and more developed urban areas, while the second group's cities have the smallest (less than 5%) urbanized area. The more urbanized cities are expanding in a less dispersed fashion. These groups were in different locations in the dimensional distribution of cities using landscape metrics. Second, Goyang belonged to cluster 3 in the census analysis but is now assigned to cluster 4, separated from the other cities. The third cluster from the census study shows a long history of urbanization, slow urbanization, high population, large urban area and a high financial independency rate (91.2%). Goyang city has a smaller urban area and a relatively shorter urbanization history than the other cities in that cluster. In the metrics analysis, Goyang moved to the other group because of its urban size. The third change is Osan city. Osan city belonged to cluster 1 (i.e. the smallest urban area) in the census study and is assigned to cluster 4 in the landscape metrics study. This city has a small population size but a larger developed urban area and a less fragmented landscape, with high LPI, low LSI and larger mean patch size. The last difference is Uijeongbu city. This city belonged to cluster 4 in the census study, the group of cities characterized by fast growth, medium-sized population and apartment domination. In the metrics study, it was classified into cluster 3, which exhibited the least fragmented landscape. Uijeongbu has been growing in a less dispersed pattern than the other cities with the same population size and urban area.
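Both the census and the landscape metric analyses begin by removing variables that are highly correlated with others. One simple way to automate that part of the screening is sketched below. The greedy keep-first rule, the 0.9 threshold and the function name are assumptions for illustration only, since the study does not state its cut-off. The second criterion (low contribution to total variance) is not covered.

```python
import numpy as np

def drop_highly_correlated(data, names, threshold=0.9):
    """Greedy screening: keep a variable only if |r| < threshold with all kept ones."""
    X = np.asarray(data, dtype=float)
    R = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        # Compare candidate j against every variable already retained.
        if all(R[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]
```

Because the rule keeps the first variable of each correlated pair, the result depends on column order. In practice the analyst would decide which member of a redundant pair is more interpretable.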

Integration of Census and Metrics Analysis
As a final step, we applied factor analysis to 19 variables and 21 subject areas. The results (Table 7) from the KMO and Bartlett's Test confirmed that the sampling adequacy was satisfactory (greater than 0.5) and that the relationship among variables was strong. The data are therefore valid for factor analysis. Two final factors were extracted that account for 38.6 percent and 35.8 percent of the total variance, respectively (Table 8). Factor 1 includes house type, cultural infrastructure, distance to Seoul, financial independency rate, LSI, Mean Euclidean Distance, and Edge Density. Factor 2 accounts for population, urbanization, industry, LPI, PLAND, and Area-weighted Mean Patch Size. The Euclidean distance model constructed a configuration using the two major dimensions of a standardized version of a distance matrix, generated by MDS. This configuration captures almost all the variation in the census variables and landscape metrics, with a goodness of fit statistic of 100%. We plotted the distance model for each of the 31 community areas to give an idea of these configurations under various metrics. As a result, four groups of cities and towns were again found (Figure 5), in a dimensional distribution that resembles that of the census study almost exactly. The cities in the first group (the top right of Figure 5) are the same as those in the first group from the census study. The other groups of cities also have the same membership as in the census analysis. There is one exception in this distribution: Osan city was classified in the same group as in the metrics study. The reason is that Osan exhibits an urban landscape pattern (fragmentation and urban percentage) similar to that of the other cities in the cluster, even though its demographic, social, economic and spatial factors differ.

Summary of the four urban forms in the SMR
In this study, an integrated approach combining remote sensing based landscape metrics and census variables was used to identify four different urban forms within the Seoul Metropolitan Region. The study showed that statistical testing of integrated landscape metrics and census data provides a better understanding of urban spatial pattern by fusing information about the physical urban layout with information about socio-economic characteristics. Figures 3 and 4 show the dimensional distributions of cities by census data and landscape metrics, respectively. When all the measures are considered together, four types of cities and towns emerge in the Seoul Metropolitan Region (Figure 5), derived from the integrated census and landscape metrics. The descriptive statistics of the four types are summarized in Table 9. Type 1 is labeled "Exurban-fragmented high growth cities/towns". This type is characterized by a high fragmentation rate (it has 1551 urban patches compared to 753 on average), small urban land, a relatively large population and total number of employees, average financial independence and longer distances to Seoul. Type 2 is described as "Exurban-fragmented low growth cities/towns". Type 2 is similar to Type 1 except for its lower total population, number of employees and financial independence. All the cities and towns in Type 2 are located further from central Seoul (Figure 6). The average percentage of urban land in this type is 6%, so land remote from the center remains rural. Type 3 is characterized as "Extensive compact urban core", with high total population, number of employees and financial independence, relatively high urban land, a low fragmentation rate and closeness to Seoul. Type 4 is described as "Sub-urban compact high growth cities". The cities in Type 4 have relatively small populations and numbers of employees. However, they have above-average financial independence, 25% urban land and a low fragmentation rate, and are closer to Seoul.

Discussion
The discussion addresses the variable selection, the comparison between cities and towns, labeling issues and the relationship among urban form, development level and the distance of the towns to Seoul.

Variable selection
This study presented a statistical selection procedure for quantitative indicators of urban form. Based on a literature review (section 1 and 2.2), all the major indicators from both the census and landscape metrics were used for the statistical analysis. The variables in this original set were appropriate for measuring urban form. We then selected the relevant variables after running the statistical analyses. The final variables used to quantify urban form are not statistically redundant with one another and cover different aspects of urban form.

Comparing cities and towns
The subject of this research was 21 cities (Shis) and 10 towns (Guns) of Gyeonggi Province. Urban form does not conform to these administrative units of cities and towns in Korea. Towns (Guns) have large distances between them and are more comparable with counties in the U.S. in terms of their size and administrative system.
Comparison of urban land across cities and towns may introduce bias due to their different boundaries. However, some towns (Guns) gained city status after the law on requirements for cityhood was loosened in 1995. Only three towns (Guns) remain as towns: Gapyeong, Yangpyeong and Yeoncheon. Given such complex and dynamic change in the definition of a city in Korea, differences in measurement are inevitable across units of varying size.
Figure 6: Map of the four types of cities and towns in SMR, Korea

Labeling
Socioeconomic conditions, urban development level and the fragmentation rate vary in cities and towns in the Seoul Metropolitan Region. Despite this, we were able to identify four types of urban forms, which we labeled "Exurban-fragmented high growth (Type 1)", "Exurban-fragmented low growth (Type 2)", "Extensive compact urban core (Type 3)", and "Sub-urban compact high growth (Type 4)". Most cities and towns in the outer areas of the SMR (Type 1 and Type 2) have experienced more fragmentation compared to the cities and towns closer to the city of Seoul (Type 3 and Type 4). Cities and towns of Type 1 show relatively high numbers of employees and financial independence, hence the term "Exurban-fragmented high growth" is suitable for this type. All the cities in Type 3 and Type 4 are located closer to Seoul and show less fragmented urban patterns. The differences between type 3 and type 4 are their urbanization history, size of urban area, total population and total number of employees.

Urban form, development level and distance to Seoul
The correlation between the fragmentation level of the landscape and the development level confirms the large difference in economic conditions (Table 9). The cities at the mature development stage (Type 3 and Type 4) show less fragmented landscapes than the exurban cities (Type 1 and Type 2). The increase of individual urban patches and the expansion into open spaces continued toward the later stages of urban growth (Type 3 and Type 4). On the other hand, the metrics (urban land percent and fragmentation measures) for Types 1 and 2 indicated a small urban core at the initial stage of urban growth. In our study, the distance from each city or town to Seoul has a direct impact on its development. If we consider the Seoul Metropolitan Region as one organic city, Seoul grows from its core, fills in the gaps between the core and the closer urban centers, and expands outwards to exurban centers. These 31 cities and towns act as organic parts of the metropolitan growth. Seoul is still growing outwards, resulting in agricultural land loss and urbanization in rural areas. The development state of each city and its distance to Seoul are interconnected with its urban form.

Conclusion
This study aimed to compare urban form across 31 cities and towns in the Seoul Metropolitan Region by integrating census data and landscape metrics. Using factor analysis, we determined which variables are suitable for analysis and which common factors have driven the urban pattern. Then, we categorized the cities into groups using multidimensional scaling. When all the measures are considered together, four types of cities and towns emerge in the Seoul Metropolitan Region: 1) exurban-fragmented high growth, 2) exurban-fragmented low growth, 3) extensive compact urban core and 4) sub-urban compact high growth. An important finding from the four types is that cities closer to Seoul are more compact and denser than those further from Seoul. This can be explained by the Seoul Metropolitan Region functioning as one organic city. There are a few exceptions (Goyang city, Osan city and Uijeongbu city) among the classifications by census, landscape metrics and combined variables, due to the strength of their fragmentation relative to their socio-economic conditions. The results indicate that fusing knowledge of the physical urban layout with knowledge of socio-economic characteristics may be beneficial for a better understanding of urban spatial pattern.
Future research on these patterns may benefit from several refinements. First, the urban form of Korean cities and towns should be compared across different ways of delineating urbanized areas by remote sensing. Differences in measuring what is "urban" can have considerable impacts on the resultant urban form. This challenge applies to any attempt to measure urban form. It is possible that using more disaggregated spatial units of analysis would resolve some of the definitional discrepancies. Second, extending the analysis to other metropolitan regions of similar size and central-city influence may help us to understand how global cities grow and affect the urban form of the surrounding cities and towns over time.