An Analysis of Time-Varying Crowding on Subway Platforms using AFC Data in Seoul Metropolitan Subway Network

Management of crowding at subway platforms is essential to improving services, preventing train delays and ensuring passenger safety. Establishing effective measures to mitigate platform crowding requires accurate estimation of actual crowding levels. At present, there are temporal and spatial constraints, since subway platform crowding is assessed only at certain locations, every 1~2 years, and by manual counting. By contrast, smart card data is considered real-time big data that is generated 24 hours a day and is thus deemed appropriate basic data for estimating crowding. This study proposes the use of smart card data in creating a model that dynamically estimates crowding. It first defines crowding as demand, which can be translated into passengers dynamically moving along a subway network. In line with this, our model also identifies the travel trajectory of individual passengers, and is able to calculate passenger flow, which concentrates and disperses at the platform, every minute. Lastly, the level of platform crowding is estimated in a way that considers the effective waiting area of each platform structure.


Introduction
Subway platforms have a concentrated flow of passengers coming to wait for the train or moving through the station. Maintaining an adequate level of crowding on subway platforms not only serves as the basis for managing train operations, but is also important in preparing for accidents. The Seoul Metropolitan Government (SMG) currently focuses only on expanding and improving physical facilities, such as passages, staircases, and escalators within stations. However, since this approach has limited cost effectiveness, ways to distribute demand during peak hours are needed. With the early-hour public transport discount recently introduced by the SMG serving as a leading example, there is a call for a wider range of such policies that will support demand management.
Crowding on platforms needs to be analyzed at the Seoul Metropolitan Subway Network level, as crowding at a specific station platform is organically connected to all stations on the network. Previous estimations of numbers are less accurate since they rely on calculations (i.e. counting by hand), performed every year or two on a specific location at a specific time. Using such inaccurate estimates as the basis for transport policy will likely result in serious problems.
Using the latest AFC (Automated Fare Collection) and ICT (Information Communication Technology) based smart card data may enable more accurate estimation of hourly platform crowding. The smart transit cards used across the Seoul metropolitan area convey data on approximately 9 million passengers per day. This is equivalent to roughly 40 million items of weekday raw data and 20 million items of paired data on passenger flow, which can be translated into nearly 15 million individual trip chains. The Seoul Metropolitan Area Subway Network comprises 23 lines operated by 3 public enterprises and 7 private operators. Various and extensive data on passenger flow is generated every hour through terminals installed at the entrances and exits of 628 stations. The advantage of this daily flow of data is that it can be used as a low-cost alternative source of basic data on platform crowding on an hourly, daily, monthly and even yearly basis. In this regard, this study used smart card data to dynamically predict platform crowding throughout the Seoul Metropolitan Area Subway Network. The word "dynamically" refers to a dynamic flow model, which means prediction of the demand that concentrates and disperses by the minute by identifying the hourly trajectory of all subway passengers. Platform demand was further divided into pedestrian trips for in-station transfers, boarding and alighting from platforms on the same line, and transfers between lines to estimate platform crowding at individual stations.
The platform crowding in this report is based on Guidelines [1~2]. Since these Guidelines propose that adequate passenger facilities be designed towards more efficient use of funding and in consideration of socio-economic aspects, they were used to analyze platform crowding and level of service (LOS). This report suggests how to apply the dynamic crowding model to platform crowding in multiple ways. This report is organized as follows. Chapter 2 reviews previous analysis on public transport crowding and identifies the differences between itself and those studies. Chapter 3 specifies the dynamic crowding model and presents analysis methodologies. Chapter 4 reviews the results of analysis on subway lines and platform crowding produced by a smart card data-based model and Chapter 5 presents a conclusion and discussion on future use of the analysis model.

Review of Previous Studies on Subway Crowding
Mitigating traffic congestion at peak times is an unresolved problem and a long overdue task in transport studies. Theories or methodologies related to crowding analysis are widely applied to transport facilities, modes of transport, and passengers. While there have been plenty of studies on road traffic congestion, research on subway crowding has been relatively scarce until recently, mainly because analysis and validation of subway passenger flow relied on onsite surveys or travel patterns by household, which were very expensive and based on less reliable data. In contrast, smart cards provide a raw data record of subway passenger flow, which is easily verifiable and highly reliable. With the widespread use of smart cards since the mid-2000s, plenty of smart card-based studies on subway crowding have appeared.
Analysis of subway crowding is largely divided into on-board crowding and platform crowding. One study [3] looked into on-board crowding, estimated crowding on Seoul Subway's busiest line, the outer circle of Line 2 (Sadang to Samseong), and proposed a demand management policy based on analysis of passenger OD. To help spread demand between passengers traveling further than Samseong Station and those getting off before Samseong, the study proposed mixed operation of existing circular lines and one-way lines to Samseong Station. However, cases covered in the study are limited to crowding in certain sections of the line, and introducing one-way trains is likely to cause passenger inconvenience by increasing travel time and wait time for riders on circular lines. That research proposed a solution to mitigate crowding on trains, but it did not look at mitigating crowding in waiting areas. Another study [4] set out methodologies for estimating passenger crowding costs and set up a route preference analysis system. Based on relevant data ranging from demand, train location, and standing passenger density to the chances of getting a seat, the study predicted changes in on-board crowding. Nevertheless, it also fell short of reflecting platform crowding within stations.
As can be seen from these studies, research on on-board crowding did not factor in waiting area crowding. Thus, a study that looks at platform crowding is needed. One study [5] developed an algorithm that estimates hourly crowding at subway platforms. Based on station information, such as station structure and transferability, individual boarding and alighting, OD records and transfer records, the algorithm calculated the accumulated number of people flowing into each station by train direction while estimating the number of people on the platform by looking into train departure times at each station. For the platform waiting area, the actual size of the waiting area in the station was applied, whereas platform crowding by train direction was calculated by comparing the estimated number of passengers on the platform with the actual passenger capacity of the platform. However, the crowding analysis is confined to a single station and included alighting and boarding passengers only, failing to reflect the number of people waiting to transfer. Another study [6] used AFC data to assign demand derived from estimating alighting stations at the station entrance. Crowding was predicted every 30 minutes in real time. However, this research, too, did not consider transferring passengers.
Reviewing the previous studies [3~7], it can be concluded that there have been none that estimate platform crowding, factoring in transfer passengers by minute from the entire body of smart card data. Given that the subway interval at peak hours is 2 to 3 minutes, platform crowding needs to be analyzed on a smaller scale, for instance, on a minute-by-minute basis. In particular, to determine dynamic flow at passenger facilities, analysis should be minute-by-minute. There is also a need for research on predicting crowding on trains and platforms using a route choice algorithm for transferring passengers.
This study stands out from the existing literature for the following reasons. First, by identifying the movement trajectory of all subway passengers on an hourly basis, it predicts dynamic crowding on a minute-by-minute basis. Second, it estimates platform crowding at individual stations based on platform demand categorized into in-station transfer, boarding/alighting and transferring between lines. Third, while previous analyses were confined to modes of transport or transport facilities, this report concurrently analyzes modes of transport and passenger crowding.
Notes: 1 Number of on-board passengers / train space. 2 Number of passengers waiting at platform / practical waiting area at platform. 3 Waiting area at platform / number of passengers waiting at platform.

Platform types and categorization of passenger flow
A platform is a space where passengers board and alight from a train. Platforms can basically be divided into side platforms and island platforms, and can be subcategorized into double-island platforms (2-sided, 4-tracked), double-sided platforms, 2-sided and 3-tracked platforms, etc. Side platforms are located directly across from each other with two tracks, heading up and down, between them. Passengers are divided into those traveling up and those traveling down as they move down to the platform from waiting areas. Island platforms are located between tracks, which enables passengers to access trains heading up and down as they head down to the platform from waiting areas [8].
There are three categories of passenger flow depending on platform types: boarding/alighting through stations on the same line, transferring between lines, and in-station transfer. For passengers boarding and alighting through stations on the same line, station entrance and exit IDs are identical to the actual traveling route, and this type of passenger makes up the largest share among the three categories. For transfers between lines, a passenger who enters at a station serving only one line moves through a transfer station to reach the destination. When a move is made for a transfer, alighting occurs at a platform on the first line while boarding occurs at a platform on the transferred line. The pedestrian trip is measured at stations of both lines. In-station transfer is similar in that it occurs at a transfer station, but line IDs at station entrance and exit are not identical to the IDs of the boarding and alighting lines. In this case, a passenger traverses not only the platform of the line through which they entered, but also the platform where they actually board or alight, although there may be some differences depending on platform structure.
For instance, at Gangnam Station, where Line 2 and the Shinbundang Line intersect, a passenger taking the Shinbundang Line swipes his/her card at a Line 2 terminal and walks via the Line 2 platform and the Shinbundang Line platform. If a passenger has tapped their card in at Gangnam Station on Line 2, estimation should be made for both the Line 2 platform and the Shinbundang Line platform, and this should be reflected in platform crowding. The distinction among trip types is made to accurately analyze the factors leading to platform crowding: whereas boarding and alighting are the source of overcrowding when large numbers of passengers travel on the same line, with line transfers or in-station transfers the transferring passengers themselves become the source of crowding.
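The three flow categories can be illustrated with a minimal Python sketch. The `Trip` record, its field names, and the labels returned by `flow_types` are hypothetical names introduced here for illustration; the study does not specify its data structures, and the rule that a trip may fall into more than one category is an assumption.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Trip:
    """Hypothetical trip record (field names are illustrative assumptions)."""
    entry_gate_line: str            # line whose gate terminal recorded the tap-in
    exit_gate_line: str             # line whose gate terminal recorded the tap-out
    ridden_lines: Tuple[str, ...]   # lines actually ridden, in order (from a route model)


def flow_types(trip: Trip) -> Set[str]:
    """Return the passenger-flow categories that apply to one trip."""
    types = set()
    # More than one ridden line means a transfer between lines on the way.
    if len(trip.ridden_lines) > 1:
        types.add("line_transfer")
    # Gate line differs from the line actually boarded or alighted:
    # the passenger walks through another line's platform inside the station.
    if (trip.entry_gate_line != trip.ridden_lines[0]
            or trip.exit_gate_line != trip.ridden_lines[-1]):
        types.add("in_station_transfer")
    # Otherwise the trip is plain boarding/alighting on the same line.
    if not types:
        types.add("same_line")
    return types


# Gangnam-style case: tap in at a Line 2 gate, ride only the Shinbundang Line.
gangnam_case = Trip("Line 2", "Shinbundang", ("Shinbundang",))
print(flow_types(gangnam_case))
```

In this sketch the Gangnam-style trip is labelled as an in-station transfer, so its walking demand would be counted on both the Line 2 and the Shinbundang Line platforms.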

Dynamic passenger flow model at subway platform
A passenger flow assignment model was set up under the assumption that passengers choose the quickest route from boarding station to alighting station. Individual passengers take the quickest route, and given the subway crowding objective, the function can be set as in Eq. (1).

Estimation of concentration and dispersion of dynamic demand at platform
To estimate dynamic crowding at a subway platform, the concentration and dispersion of passengers at the platform must be simulated. Here, every individual passenger takes part in the process of concentration and dispersion at the transfer station while traveling by the same shortest route from the departure station to the arrival station. The route selection algorithm, in which every passenger takes the shortest path, is shown in Eq.
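As an illustration of the quickest-route assumption, the following Python sketch applies Dijkstra's algorithm to a toy network. This is a standard shortest-path technique chosen for illustration, not necessarily the study's own algorithm; the function name `quickest_route`, the node labels, and the link times are all hypothetical. Transfers are modelled as extra links between platforms of the same station, which is also an assumption.

```python
import heapq


def quickest_route(links, origin, destination):
    """Dijkstra's algorithm on a dict: node -> list of (neighbor, minutes)."""
    best = {origin: 0.0}          # best known travel time to each node
    previous = {}                 # back-pointers for path reconstruction
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue              # stale queue entry
        for neighbor, minutes in links.get(node, []):
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[destination]


# Toy network (hypothetical): B_L1 -> B_L2 is a 3-minute transfer walk.
toy_links = {
    "A": [("B_L1", 2.0)],
    "B_L1": [("D", 7.0), ("B_L2", 3.0)],
    "B_L2": [("D", 2.0)],
}
route, minutes = quickest_route(toy_links, "A", "D")
print(route, minutes)
```

Here the route via the transfer (2 + 3 + 2 = 7 minutes) beats the direct ride (2 + 7 = 9 minutes), so the transferring passenger would be counted on both platforms of station B.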

Estimating dynamic crowding at subway platform
Predicting dynamic crowding at a subway platform is a three-step process. First, minute-by-minute traffic volume is estimated based on smart card data, and the number of passengers is calculated in consideration of dispatch intervals. Second, pedestrian volume within a station is estimated taking into account passenger flow types. Third, crowding is predicted by dividing the size of the actual waiting space at individual stations by the number of passengers.
Since boarding and alighting time is stored in smart card data, travel and waiting time can be estimated based on passenger flow pattern and dispatch intervals. It was assumed that entry at the gate to time of boarding takes 3 minutes while alighting to the gate takes 1 minute. The distance between transfer stations was translated into an average traveling time to calculate the time taken between transfer stations. For instance, if a passenger taps his or her smart card in at 12:00 at the gate, he or she would arrive at the platform at 12:03 and it would be assumed that the passenger would board at 12:06, given 3 minutes of waiting. For alighting, it is assumed that the passenger taps out at the gate 1 minute later. Time taken for transfers is estimated through a route selection model, based on transfer distance data and the average walking speed (1.2 m/s). The number of passengers waiting or moving across the platform is then calculated on a minute-by-minute basis, considering passenger flow type. Lastly, dynamic crowding at the actual platform is estimated as in Eq. (4), considering the size of the practical waiting area, which reflects the passenger flow pattern at the platform. The adjusting variable in Eq. (4) is used to reflect platform structures that differ by station and platform. In practice, the variable should be based on definitive research data on all platform types and structures, but due to a lack of relevant data, it could not be factored into this study. Since subway platforms are places where passengers converge to board/alight or transfer, detailed studies on waiting areas are needed to more accurately estimate crowding.
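The timing assumptions can be sketched in Python as follows. The constants follow the worked 12:00 example in the text (3 minutes from gate to platform, 3 minutes of waiting) and the stated walking speed of 1.2 m/s; the function names are hypothetical, and using a fixed waiting time rather than actual dispatch intervals is a simplification made here, not the study's method.

```python
from collections import Counter

ACCESS_MIN = 3             # gate to platform, as in the text's 12:00 example
WAIT_MIN = 3               # waiting time in the same example (a simplification)
WALK_SPEED_M_PER_S = 1.2   # average walking speed stated in the text


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def transfer_walk_minutes(distance_m):
    """Walking time for a transfer of the given distance, in minutes."""
    return distance_m / WALK_SPEED_M_PER_S / 60.0


def platform_minutes(tap_in):
    """Minutes during which a passenger is assumed to be on the platform."""
    arrive = to_minutes(tap_in) + ACCESS_MIN
    board = arrive + WAIT_MIN
    return range(arrive, board)


def occupancy(tap_ins):
    """Count waiting passengers per minute from a list of tap-in times."""
    counts = Counter()
    for tap_in in tap_ins:
        counts.update(platform_minutes(tap_in))
    return counts


waiting = occupancy(["12:00", "12:01"])
print(waiting[to_minutes("12:04")])  # both passengers overlap in this minute
```

Summing such per-minute counts over all passengers whose route uses a given platform yields the minute-by-minute demand that is then compared with the practical waiting area.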
The space on a platform can be described as follows [8]:
• Inaccessible space refers to areas a passenger cannot access due to a structure within the platform;
• Non-preferred boarding space refers to areas far from train doors or less frequently used by passengers;
• A waiting area where passengers stay to board the train is defined as "a practical waiting area."
The standards for estimating subway crowding include the Korea Highway Capacity Manual [1] and the Urban Railway Station and Transfer and Convenience Facilities Design guidelines [2]. The Korea Highway Capacity Manual [1] is deemed an adequate indicator of platform LOS because it is based on the dimensions of the average Korean body. National standards suggest that the required space and platform LOS at peak times be designed at 0.8 ㎡/person and Level D, respectively. The occupancy space is larger than 0.4 ㎡/person, but smaller than 0.6 ㎡/person, while the average waiting space is 0.5 ㎡/person. Level D is a service design criterion used to estimate crowding of an actual station. For instance, if the accessible platform area of Yeoksam Station on the inner-circle line is 783.5 ㎡, as in the previous study, and the crowded space actually used by passengers is assumed to be 235 ㎡, the recommended maximum number of passengers on the platform would be roughly 470 (235 ㎡ ÷ 0.5 ㎡/person), factoring in an average waiting space of 0.5 ㎡/person. If 470 or more passengers flow into the platform at any given time during peak times, the station could be categorized as needing improvements as it would exceed the planned capacity.
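The Yeoksam Station arithmetic can be reproduced with a short Python sketch. The function names are hypothetical, and only the figures given in the text (235 ㎡ of practical waiting area, 0.5 ㎡/person) are used; no adjusting variable for platform structure is applied.

```python
import math


def max_passengers(practical_waiting_area_m2, space_per_person_m2=0.5):
    """Recommended maximum number of waiting passengers for a platform."""
    return math.floor(practical_waiting_area_m2 / space_per_person_m2)


def exceeds_capacity(passengers, practical_waiting_area_m2, space_per_person_m2=0.5):
    """True when the platform holds the recommended maximum or more."""
    limit = max_passengers(practical_waiting_area_m2, space_per_person_m2)
    return passengers >= limit


yeoksam_limit = max_passengers(235.0)
print(yeoksam_limit)                    # recommended maximum for 235 m2
print(exceeds_capacity(500, 235.0))     # a hypothetical peak-minute count
```

Applied to each minute of the estimated platform demand, such a check would flag the minutes in which a platform reaches its planned capacity.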

Data
The transit card data used in this analysis was one-day passenger flow data (Oct. 17th, 2016) from the Seoul metropolitan area. Trip chains totaled 9,089,620. One-time transit passes that had no boarding/alighting time were excluded, leaving 8,947,636 (98.44%) available for use as the basis for tracking transit routes.
A physical network was established around the Seoul metropolitan area to analyze passenger flow, and included Seoul, Gyeonggi-do and Incheon as of October 2016. First, subway stations were flagged as nodes, of which there were 674 with transit card terminal number and subway station name included. Links were identical to the subway line map, with 1,332 links in total including the virtual links of Line 9 (express). Data on the transfer cost for each direction was constructed with transfers taking place in 93 stations (including express stations on Line 9). For the interval between trains, data on light rail and the Gyeonggang Line was also adopted.

Validation of model accuracy
To validate the optimum route choice model for this analysis, we used smart card data on transfers to private subway lines. This data is collected to track passenger flow and settle fares, and because transfer tap information is also collected when transferring to private subway lines, it can also be used to verify passenger routes.
This data on transfers to private subway lines is arranged in the order of user ID (unique card ID), transaction, transfer gate station ID, previously boarded station ID, boarding time at the previously boarded station, and the time at which the passenger passed through the transfer gate. With this smart card data, it is possible to compare the results of adopting the optimum passenger route choice algorithm with passenger routes from the data on transfers to private subway lines and therefore verify accuracy of the model. Analysis found that 79.82% of passenger routes from the data on transfers to private subway lines were successfully detected by the first optimum route, and 19.48% for the remaining similar routes (99.3% in total). This means that in most cases, routes coming from the optimum route choice algorithm coincide with actual passenger routes, meaning the model successfully mimics reality.
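The comparison between modelled routes and observed transfer taps can be sketched in Python as below. The matching rule used here (the observed transfer gate station appears somewhere on the modelled route) is an assumption, since the exact criterion is not stated; the function name and the toy records are hypothetical.

```python
def route_match_shares(observed_transfer, first_route, similar_routes):
    """Share (%) of trips matched by the first route and by similar routes.

    observed_transfer: card_id -> transfer gate station seen in the data
    first_route:       card_id -> list of stations on the first optimum route
    similar_routes:    card_id -> list of alternative routes (lists of stations)
    """
    total = len(observed_transfer)
    if total == 0:
        return 0.0, 0.0
    first_hits = 0
    similar_hits = 0
    for card_id, station in observed_transfer.items():
        if station in first_route.get(card_id, ()):
            first_hits += 1
        elif any(station in route for route in similar_routes.get(card_id, ())):
            similar_hits += 1
    return 100.0 * first_hits / total, 100.0 * similar_hits / total


# Toy records (hypothetical card IDs and station codes).
observed = {"c1": "X", "c2": "Y", "c3": "Z", "c4": "W"}
first = {"c1": ["A", "X", "B"], "c2": ["A", "Y"], "c3": ["A", "Q"], "c4": ["A", "Q"]}
similar = {"c3": [["A", "Z", "B"]], "c4": [["A", "R"]]}
first_share, similar_share = route_match_shares(observed, first, similar)
print(first_share, similar_share)
```

With the study's reported shares, the two figures would correspond to 79.82% and 19.48%, which sum to the stated 99.3%.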

Gangnam Station
Gangnam Station is the fifth-most crowded station in the metropolitan area, and was the most crowded non-transfer station before October 2011, when it became a transfer station upon the opening of the Sinbundang Line. With that opening, passenger traffic increased by roughly 50%, from 200,000/day in 2010 to 305,000/day by October 2016. This was due to the natural increase in passengers wanting to use Gangnam Station and the growing demand from Sinbundang Line users to transfer there to reach the Gangnam downtown area. Analysis revealed that 73% of this demand was for boarding/alighting at the station and only 24% was for transfer. Therefore, it seems appropriate to pick Gangnam Station for single-station analysis.
Analysis of passenger traffic at the inner circle platform of Gangnam Station found that it was relatively more crowded during the evening peak rush hour (18:00~19:30) than during the morning peak rush hour (08:00~09:00). The platform is crowded with passengers who use it to go home to Gwacheon, Anyang and the Gangbuk area, to western Seoul districts such as Gwanak-gu, Guro-gu and Yeongdeungpo-gu, and to Bucheon and Incheon. It is also crowded at around 22:00 (late-night peak hour) with passengers going home after dinner or extracurricular classes. Passenger traffic at the outer circle platform was also analyzed, which revealed that it was most crowded during the morning peak rush hour (08:00~09:00), followed by the late-night peak rush hour (at around 22:00), during which passengers use the platform to go home after dinner or extracurricular classes. This high passenger traffic at Gangnam Station is thought to be due to its location, as it lies in the Gangnam downtown area, which is one of the three largest downtown areas in Seoul with its concentration of offices, commercial buildings and private education institutions.
The practical waiting areas for both the inner and outer circle platforms at Gangnam Station are bigger than the average at other stations, but are still severely crowded as there are so many people using them. The highest passenger traffic at Gangnam Station was observed at 08:32 during the morning peak rush hour on the outer circle platform, when the service level was LOS E. The evening rush hour traffic is lower than the morning rush hour traffic, because passenger flow is more dispersed during the evening.

Express Bus Terminal Station
Express Bus Terminal Station is where lines 3, 7, and 9 converge and therefore many passengers transfer at this station. It is also used by many looking to take an express bus (in-station transfer: in this example, the Line 3 entrance is used to transfer to Lines 7 or 9 or to the express bus terminal). Looking at overall passenger flow, Line 3 has the highest, followed by Lines 7 and 9, which have similar flows to each other. Lines 3 and 7 have the highest platform crowding at the station during the morning rush hour, with Line 9 showing the highest traffic during the morning rush hour (in the direction of Sports Complex Station) and evening rush hour (in the direction of Gaehwa Station), both of which are similar in traffic. For transfer patterns between the three lines, passenger flow between Lines 3 and 7 is similar to the flow between Lines 3 and 9, with both peaking at 600/h during the morning rush hour. Transfer traffic between Lines 7 and 9 is rather low, as the long walk to transfer discourages many from using it, and both lines run in the Gangnam area, meaning there is little need to transfer between them.
Regarding the highest platform crowding on each line, Lines 3 and 7 were found to be at a decent level, with Line 3 showing LOS C (in both directions) and Line 7 showing LOS A (in the direction of Bupyeong-gu Office Station) and LOS B (in the direction of Jangam). However, Line 9 showed LOS D, meaning that it needs long-term attention.

Crowding level at key subway stations
As the structure and passenger flow at each station differ, different strategies to alleviate crowding are needed that are suited to each station. Factors that can affect platform crowding include passenger traffic, practical waiting area on the platform, and transfer distance. Given this, the passenger flow and structural characteristics of each station were taken into account for this study when assessing platform crowding. LOS (level of service) analysis was conducted on the basis of maximum passenger flow, which was calculated by the number of passengers per minute during the morning peak rush hour (8:00~9:00).
The highest passenger flow during the morning peak rush hour at Seoul Station (northbound) was 1,142/min, while for City Hall Station (northbound), the highest passenger flow during the morning peak rush hour was 838/min. The service level on the northbound platform at Seoul Station was LOS E. Jongno 3-ga, Dongmyo and Cheongnyangni stations had somewhat lower crowding levels, which only spiked briefly during the morning peak rush hour.
An analysis of the platform service level on Line 2 stations during their most congested hours revealed that Dongdaemun History & Culture Park Station had the lowest LOS (both inner and outer circle), which was LOS E. The outer circle line from Gangnam Station to Yeoksam Station, where businesses and leisure facilities are concentrated, was also rated LOS E, as were Jamsil Station, where transfer flow is high, Sindorim Station, where passenger flow is the highest, and Guro Digital Complex Station, where businesses are concentrated. Dongdaemun History & Culture Park Station on Line 3 was also rated LOS E, as it is one of the most crowded stations in the metropolitan subway network with a peak passenger flow reaching 1,528/min.

Conclusion and Future Studies
Platform crowding and line crowding on the metropolitan subway are very important indicators for traffic policy. Alleviating this crowding needs to be viewed from the perspective of reducing the related costs and enhancing passenger safety. In this analysis, efforts were made to go beyond the restrictions of previous studies that were mostly dependent on visual measurements and single station crowding and build on AFC-based, ICT-related big data (smart card data) to construct a model.
Our model was constructed to help choose the optimal analogous route by dynamically tracking passenger flow and dividing it at subway platforms into transfers between lines, in-station transfers that are indicated by entrance line and departure line, and boarding and alighting on the same line. To assess platform crowding, it was suggested herein to calculate passenger concentration in the practical waiting area of the station platform.
Case studies were conducted to reenact the dynamic passenger flow per minute and identify passenger flow along the entire metropolitan subway network, while ways to understand passenger movements at each station platform were also offered. In addition, the research demonstrated that results of a multifaceted nature on subway traffic could be obtained, encompassing such measures as passenger movement at each station, passenger movement on each subway line, passenger flow at transfer stations, and dynamic use of station platforms by passengers.
The unique characteristic of this study is that it involved construction of a model that can serve as a basis for various uses in terms of platform crowding analysis, operation, and design. Here are the suggested benefits to utilizing our dynamic platform crowding model. First, long-term analysis of the changes in crowding is possible as there are no time or space restrictions in estimation of that crowding. It could also be used to assess crowding by region or subway line. The second benefit is on the operations side. Providing real-time or estimated traffic information to passengers allows them to avoid traveling during the most crowded hours or, alternatively, change their route away from the crowding, which can serve as an early step to mitigating crowding. Such information can also be used to adjust train schedules or put more trains into operation at the right times. The third benefit is on the design side. As real-time traffic information is generated, it can be referred to in designing station platforms or changing their design. It could also be used in designing emergency shelters or evacuation routes. Future studies may include those that take various factors such as train schedules, real-time information, changing demand, and weather into account for more accuracy in estimating platform crowding.