Ride-hailing Impacts on Transit Ridership: Chicago Case

Existing literature on the relationship between ride-hailing (RH) and transit services is limited to empirical studies that lack real-time spatial contexts. To fill this gap, we took a novel real-time geospatial analysis approach. With source data on ride-hailing trips in Chicago, Illinois, we computed real-time transit-equivalent trips for all 7,949,902 ride-hailing trips in June 2019; our sample is far larger than the samples studied in existing literature. An existing Multinomial Nested Logit Model was used to determine the probability of a ride-hailer selecting a transit alternative to serve the specific O-D pair, P(Transit|CTA)¹. We find that 31% of ride-hailing trips are replaceable, whereas 61% of trips are not replaceable. The remaining 8% lie within a buffer zone. We measured the robustness of this probability using a parametric sensitivity analysis and performed a two-tailed t-test. Our results indicate that, of the four sensitivity parameters, the probability was most sensitive to the total travel time of a transit trip. The main contribution of our research is our thorough approach and fine-tuned series of real-time spatiotemporal analyses that investigate the replaceability of ride-hailing trips by public transit. The results and discussion are intended to provide perspective derived from real trips, and we anticipate that this paper will demonstrate the research benefits associated with the recording and release of ride-hailing data.

¹ This value defines the replaceability of the trip, where a value ranging from 0 to 0.45 is considered not replaceable (NR), and a value ranging from 0.55 to 1.0 is considered replaceable (R).

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 30 September 2020 | doi:10.20944/preprints202009.0753.v1
© 2020 by the author(s). Distributed under a Creative Commons CC BY license.
Breuer, Du, and Rakha


The following is a list of terminology relevant to this paper, with the corresponding contextual definitions.

First- and last-mile (FLM): the first and last legs of a transit trip, which connect the individual from their origin to the first transit stop and/or from the last transit stop to their destination.

In-vehicle travel time (IVTT): the portion of the total travel time that accounts for all time spent traveling inside the transit vehicle(s). In our analysis, we may refer to this as the "transit time".

Not-replaced (NR) trip/group: the group containing all transit-equivalent trips with P(Transit|CTA) ≤ 0.45. A transit-equivalent trip that has a probability in the range [0, 0.45] is deemed not viable for the individual and, ultimately, does not compete with the RH trip service.

Out-of-vehicle travel time (OVTT): the portion of the total travel time spent outside of the vehicle(s), i.e., access, egress, wait, and transfer walk time.

Pooled Trip, Ride-hailing: ride-hailing trips that combine two or more trips, such that passengers 'share' the ride. In some scenarios, all passengers meet at a specified location and are dropped off at a shared location. In other scenarios, passengers are picked up at their desired locations and then dropped off in the most efficient order.

Replaced (R) group: the group containing all transit-equivalent trips with P(Transit|CTA) ≥ 0.55. A transit-equivalent trip that has a probability in the range [0.55, 1.0] is considered a viable mode of service for the specific O-D pair.

Ride-hailing (RH): the act of servicing a trip via a transportation network company (TNC). Users must have an account with the respective TNC and have the app downloaded onto their smartphone. These trips are ordered using the TNC's app and require the user to input their destination, whereas the origin is automatically determined using the smartphone's internal GIS software. TNC trip fare pricing is dynamic and dependent on the surrounding demand. Ride-hailing trips can be pooled or single passenger; refer to their definitions.

[...] itself Lyft, to exclusively operate as a ride-hailing service [2].
Over the next two years, competition heightened as Uber expanded to 60 cities across six continents, and Lyft announced its plan to expand to 24 more cities, totaling coverage of over 60 cities [2,3]. As of January 2019, nearly a decade later, 36% of US adults have used or currently use ride-hailing services [4].

Ride-hailing is best defined as an on-demand, app-based, and real-time service that provides customers with door-to-door transportation for a single trip [5]. Through the company's smartphone app, a customer enters a specific pick-up and drop-off location (O-D pair). On the backend, the RH company's algorithm calculates an appropriate route and trip fare. It then selects the optimal driver to service the trip and notifies the customer of the estimated pick-up time and vehicle/driver details.

Coincidentally, when the ride-hailing market began rapidly gaining traction through geographic expansion and increased acceptance in 2014, average public transit ridership in the United States began its decline. In the early twenty-first century, transit ridership in the United States experienced two periods of growth followed by decline (Figure 1).

On average, from 2003 to 2008, public transit ridership in the United States increased by 2.58% each year. Following the 2008 economic recession, ridership levels took a downturn until [...]

² Within the existing literature, ride-hailing is more commonly known as "ridesharing", but because it entails 'hailing' a ride that is not necessarily shared, "ride-hailing" is the most appropriate term.

From 2008 to 2018, the population of eligible riders increased by 5.63%. Yet the annual transit trips per eligible rider decreased by 11.6%, as shown in Figure 2. Moreover, the transit ridership per eligible rider decreased by 8.6% from 2014 to 2018. Transit ridership therefore does not reflect the steady growth of the US population.

Historically, declines in transit ridership can be a result of macroeconomic, geographic, and demographic changes in a region. The first period of ridership decline in the 21st century started in 2008 and was evidently a ramification of the economic recession. Yet the cause(s) of the most recent decline are not as discernible. Further, this period exhibited a larger decline in magnitude and has spanned 5 years, as opposed to 2 years. So, what could have caused a more crippling effect on transit ridership than the economic recession? Was the second decline in transit ridership a result of an alternative mode, ride-hailing?

This paper approaches this question and addresses the literature gap by identifying the replaceability of transit by RH services through non-empirical methods. In our study we use ride-hailing trip source data from the City of Chicago containing over 8,000,000 trips; hence, the contributions of our research are expanded by the use of a relatively large dataset. We use spatial analyses and methodologies to deliver a real-time transit-equivalent route. We then input trip characteristics into our selected utility model, from which we calculate the corresponding probability that an original ride-hailer would select transit over RH. We perform two analyses: the first explores the replaceability of all RH trips as a whole, and the second is a parametric sensitivity analysis of P(Transit|CTA) with respect to four parameters.

It is important to review the Terminology section prior to reading this section, as much existing literature includes new terminology.

The current body of research on ride-hailing is limited by its novelty and the lack of publicly available ride-hailing trip data. Given that ride-hailing services were first introduced to the market in 2010, external research on their utility and impact is relatively untouched. Moreover, ride-hailing services are private; consequently, trip-specific data is withheld and unavailable for public research use. While there is no existing literature that definitively states how ride-hailing services impact public transit ridership, many studies posit a correlation between the two, and if ride-hailing is a contributor, it is likely not acting alone.

This absence of trip data has led researchers to obtain empirical data through stated preference (SP) and revealed preference (RP) surveys [8][9][10][11]. Some studies conducted intercept surveys at points of interest [12], and one conducted in-person interviews [13]. Yet to our knowledge, there exists no research on the relationship between ride-hailing and public transit that uses source data. Consequently, these empirical methods confine the spatial and temporal ranges, limiting the application and testing the integrity of the findings. Ultimately, this has led to conflicting arguments that have yet to be resolved. In the following literature review, we identify recurring themes and findings regarding the impact of ride-hailing services on public transit ridership. We also highlight the methods used to obtain data. Lastly, we determine gaps in the literature and how we address these gaps in our study.

It is important to note that the relationship between ride-hailing and public transit encompasses one-to-many relationships.
Bus and rail (light and heavy) both fall under 'public transit', although each mode services trips of differing purposes, rider demographics, and LOS metrics. Accordingly, most literature analyzes each modality separately.

In general, the current literature seeks to explore the impact of ride-hailing on VMT and vehicle emissions, its relative safety, and its effect on mode selection. Yet the last of the three concerns is the least explored. Contreras and Paz presented three questions, one of which illustrates this concern: "have RHC's [ride-hailing companies] had a negative or positive effect on transit ridership and/or revenue?" [14]. Answering this question requires both empirical and source-data-based research.

As stated previously, the lack of source-data-based research has led to conflicting arguments. Considering that "public transit" encompasses many transit modes, positions tend to be unique per mode (bus, rail). The first position argues that the perceived gains of ride-hailing services attract riders and thereby substitute for transit. This is based on the significant difference in gains, and the marginal difference in cost, between public transit and ride-hailing. Thus, the cost differential is perceived to be worth the gain of ride-hailing, which thereby replaces public transit. Accordingly, these critics posit that ride-hailing services contribute to the recent decline in public transit ridership. The second and opposing position argues that ride-hailing complements and reinforces the use of public transit by servicing the first- and/or last-mile (FLM) arrangement, and therefore induces revenue.

Most studies have explored mode choice behavior towards ride-hailing through observation-based research methods, such as SP, RP, and intercept surveys [9,10,15]. According to Clewlow and Mishra, ride- [...] alternative of interest.
In brief, users chose ride-hailing over the bus because it was faster, and over rail because it was faster, easier to pay for, and had less wait time [12].

Henao and Marshall worked as Uber drivers to obtain observational data in real time. Verbal and recorded interviews were used during ride-hailing trips. Of the 311 passengers interviewed, only 5.5% of riders were using the ride-hailing service to get to or from a transit station [13]. This implies that 94.5% of ride-hailing trips do not service the FLM arrangement. However, the small sample size limits the range of application, although the binary response option of the question minimizes bias.

Nelson and Sadowsky used difference-in-differences (DID) modeling, comparing transit ridership and operational metrics before and after the entry of ride-hailing service(s). They concluded that transit ridership increased following the entry of the first ride-hailing company, then decreased once the second company entered the regional market.

While these methods are useful and highly qualitative, they assume an ideal condition in which respondents are not biased. Hence, the results are vulnerable to many biases. The first, hypothetical bias, is the propensity of humans to view survey questions hypothetically to an extent that skews response validity. Second, strategic bias is the tendency for a respondent to evaluate their hypothetical behavior such that it favors the response with greater perceived value. Lastly, framing bias is how the phrasing and wording of a question influence its interpretation.

The overwhelming use of surveys and interviews serves as an opportunity to deploy a more quantitative study that focuses on individual trips and their corresponding LOS attributes. Until we can collectively agree upon the effect of ride-hailing, designers, planners, and politicians cannot make sound decisions.
We hope to contribute to the field by pioneering new methods and approaches to analyze the impact of RH on public transit ridership. Our publication of source-data-based research will not only result in greater clarity and insight but will also illuminate gray areas with more intensity. From this, empirical studies should be refined to focus on investigating these ambiguous regions and identifying their sources.

Historical data on the services present during our study period is unavailable at this time. However, as of January 2020, three personal-car ride-hailing services operate in the City of Chicago: Uber, Lyft, and Via [23].

TNP (Transportation Network Providers) - Trips Dataset
This dataset served as our primary source data for ride-hailing trips and was obtained from the City of Chicago's online data portal. The dataset contains 129 million unique TNC trips that span from November 2018 to the present day and is aggregated by month [24]. We chose to study only one month, June 2019, because it does not contain any nationally recognized holidays that could hinder the representativeness of the results. Our selection from this dataset contains all TNC trips that have a trip start time on or after June 1, 2019 12:00:00 AM and before July 1, 2019 12:00:00 AM. The dataset contains 23 fields per trip, including a unique identifier, trip start and end time, pick-up and drop-off longitude and latitude coordinates, pick-up and drop-off census tract ID, trip fare, and whether the ride was authorized as "shared" through the respective TNC app.
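
The selection window described above is half-open: it includes June 1, 2019 12:00:00 AM and excludes July 1, 2019 12:00:00 AM. A minimal sketch of that filter follows, assuming each trip is a dictionary with a hypothetical `trip_start` field holding a parsed timestamp; the field name and the sample records are illustrative and are not taken from the Chicago dataset.

```python
from datetime import datetime

# Half-open study window: [June 1, 2019 00:00:00, July 1, 2019 00:00:00).
WINDOW_START = datetime(2019, 6, 1)
WINDOW_END = datetime(2019, 7, 1)


def in_study_window(trip_start: datetime) -> bool:
    """True when the trip starts on or after June 1 and before July 1, 2019."""
    return WINDOW_START <= trip_start < WINDOW_END


def select_trips(trips):
    """Keep trips whose (hypothetical) 'trip_start' field falls in the window."""
    return [t for t in trips if in_study_window(t["trip_start"])]


# Illustrative records only; IDs and timestamps are made up.
sample = [
    {"trip_id": "a", "trip_start": datetime(2019, 5, 31, 23, 59, 59)},
    {"trip_id": "b", "trip_start": datetime(2019, 6, 1, 0, 0, 0)},
    {"trip_id": "c", "trip_start": datetime(2019, 6, 30, 23, 59, 59)},
    {"trip_id": "d", "trip_start": datetime(2019, 7, 1, 0, 0, 0)},
]
selected_ids = [t["trip_id"] for t in select_trips(sample)]
```

Using a half-open interval avoids double counting a trip that starts exactly at midnight on a month boundary if adjacent months are processed separately.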

Public Transit Data (General Transit Feed Specification (GTFS) Dataset)
Performing a public transit network analysis in ArcGIS requires the GTFS dataset corresponding to the area of interest. GTFS datasets are publicly available from OpenMobilityData, which contains archived real-time and fixed components of transit agencies' schedules [25]. The dataset holds the corresponding schedules, routes, stops, and transfers for the time period. In this paper, this dataset is integrated into ArcGIS such that the Network Analysis program can identify the corresponding transit route under spatial and temporal conditions.

Figure 3 presents a preliminary analysis of the dataset in terms of trip passenger count and day type: weekday (M-F) or weekend (Saturday and Sunday). Over the month of June 2019, there were 8,136,461 ride-hailing trips serviced in the City of Chicago. This figure demonstrates that single-occupancy ride-hailing trips have significantly greater demand than pooled trips. When compared to driving alone, this modality has a higher contribution towards congestion because its utility is comparably low due to deadheading mileage.

For weekday trips, two demand peaks exist at hours 8 and 16, whereas for weekend trips peak periods are not as distinct. On weekends, people tend to participate in social and leisure activities that are not confined to the traditional 9-5 work schedule. Hence, travel demand is more evenly distributed throughout daylight hours.
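
The hourly demand profile by day type described above can be tabulated with a simple count per (day type, start hour) pair. The sketch below is illustrative only: it assumes trip start times are already parsed into `datetime` objects, and the sample timestamps are made up rather than drawn from the dataset.

```python
from collections import Counter
from datetime import datetime


def day_type(ts: datetime) -> str:
    """Weekday is Monday-Friday; weekend is Saturday and Sunday."""
    return "weekday" if ts.weekday() < 5 else "weekend"


def hourly_demand(start_times):
    """Count trips per (day type, start hour)."""
    return Counter((day_type(ts), ts.hour) for ts in start_times)


# Illustrative start times only.
starts = [
    datetime(2019, 6, 3, 8, 15),   # Monday
    datetime(2019, 6, 3, 8, 40),   # Monday
    datetime(2019, 6, 4, 16, 5),   # Tuesday
    datetime(2019, 6, 8, 13, 30),  # Saturday
]
demand = hourly_demand(starts)
```

Peaks such as the weekday hours 8 and 16 would appear as the largest counts among the `("weekday", hour)` keys.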

Data processing was completed in three steps, with the ultimate output being the probability of a rider choosing public transit. This probability is derived from a multinomial nested logit (MNL) model based on Chicago's travel behaviors in 2015 [11]. The model is described, and its relevance introduced, later in this section.

As a brief overview, the first two steps were performed in ArcGIS using two separate tools: (1) Route Analyst and (2) Spatial Join. These two steps are novel in that we combine GTFS data and source data to compute the time-conscious transit-equivalent route. The outputs of these two steps, per RH trip, were a transit-equivalent trip and the number of transfers required to complete the trip. For the third and final step, we used an existing utility model to compute P(Transit|CTA), the probability of choosing to service a trip using CTA.

Considering the scope of our research, we chose to conduct a thorough literature review to find an existing utility model with obtainable input values that was derived from a sample with similar demographics and travel behaviors. We compared our methodology and dataset against these models to determine which existing utility model was most suitable. We selected the multinomial nested logit (MNL) model developed by Javanmardi et al. [11]. Their model was developed from a revealed preference survey using Google Maps API and RTA's Goroo TripPlanner trip data. The purpose of their MNL model is to measure the varying mode choice behaviors regarding alternative transportation, with increased accuracy from RP surveys. This model was developed from trips executed within the city limits of Chicago. Given that the model shares the study region of this paper, it is the most appropriate option in that it will capture the travel behavior with a greater degree of accuracy.
Lastly, the study year (2015) of their research is appropriate in that ride-hailing was introduced to Chicago before that time; thus, their model should capture any evolution of mode choice behavior and preferences towards or against alternative transportation.

The model's formulae are represented by Equations 1-6. Given that the model was used for a range of modes, we modified its subscripts to align with the variables [...]

While the derivation of this model exhibited strong similarities to our study characteristics, it did contain several caveats. Our dataset did not satisfy all required inputs of the model; therefore, we either generalized, estimated, or calculated these parameters. In doing so, assumptions were made.

Household Income (HHI): given the privatization of the ride-hailing dataset, we were not granted access to socioeconomic characteristics of the individual ride-hailer. However, we computed the HHI for a rider using a dataset containing the average HHI per census tract. We defined the HHI to equal the average HHI of the origin if the trip was executed on a weekday between 5:00 AM and 12:00 PM, or of the destination if the trip was executed on a weekday between 12:00 PM and 7:00 PM. For all trips outside these periods, the HHI was defined as the average of the origin and destination HHI.

Census Tracts of Origins and Destinations: the ride-hailing dataset contained two fields for the origin and destination tract values, although a group of trips had null origin or destination tract values but did have the geographic coordinates of the origin and destination. To account for this, we estimated the corresponding tract IDs via a minimum distance program in Matlab. First, we obtained the geographic boundaries of the census tracts from the Chicago Data Portal. We imported this SHP file into ArcGIS and performed geometry calculations to calculate each tract's centroid and to output the corresponding latitudes and longitudes. This output [...]

Destination within CBD at Rush Hour (destCBD): due to the anonymity of the dataset, we did not have individual details on trip characteristics such as the trip purpose. We assumed the AM and PM peak periods to occur between 6:00:00 and 9:00:00 AM and between 4:00:00 and 7:00:00 PM, respectively. For a trip to test "true" (destCBD = 1), we first determined whether the destination census tract was a member of the CBD census tracts array. If true, the trip start hour was then tested against the two peak periods; otherwise, destCBD = 0.

Trip Purpose: Work (wrktrp): like destCBD, we assumed that a trip was a "to-or-from work" commute (wrktrp = 1) if the start time fell between 5:00:00 and 19:00:00 on a weekday.
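
The HHI, destCBD, and wrktrp rules above are simple time-of-day conditions, and a short sketch may help make them concrete. It is illustrative only: the function names, the tract identifiers, and the income values are hypothetical, and the treatment of boundary times (lower bound inclusive, upper bound exclusive, tested on the start hour) is an assumption, since the text does not specify it.

```python
from datetime import datetime


def is_weekday(ts: datetime) -> bool:
    """Monday (0) through Friday (4)."""
    return ts.weekday() < 5


def assign_hhi(ts: datetime, origin_hhi: float, dest_hhi: float) -> float:
    """Tract-average HHI assigned to the rider by time of day."""
    if is_weekday(ts) and 5 <= ts.hour < 12:
        return origin_hhi               # weekday, 5:00 AM to 12:00 PM
    if is_weekday(ts) and 12 <= ts.hour < 19:
        return dest_hhi                 # weekday, 12:00 PM to 7:00 PM
    return (origin_hhi + dest_hhi) / 2  # all other trips


def dest_cbd(ts: datetime, dest_tract: str, cbd_tracts: set) -> int:
    """1 if the destination tract is in the CBD and the trip starts in a peak period."""
    in_peak = 6 <= ts.hour < 9 or 16 <= ts.hour < 19
    return 1 if dest_tract in cbd_tracts and in_peak else 0


def wrktrp(ts: datetime) -> int:
    """1 if the trip starts between 5:00 and 19:00 on a weekday."""
    return 1 if is_weekday(ts) and 5 <= ts.hour < 19 else 0


# Hypothetical tract identifiers and incomes, for illustration only.
cbd = {"tract_A"}
monday_am = datetime(2019, 6, 3, 7, 30)
saturday_am = datetime(2019, 6, 8, 7, 30)
monday_night = datetime(2019, 6, 3, 21, 0)
```

For example, `assign_hhi(monday_am, 50000, 90000)` returns the origin value, while the same call with `saturday_am` returns the average of the two.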

Replaceability of a Transit Trip
Ultimately, to determine whether a ride-hailing trip "replaced" its transit-equivalent trip, we compute the probability of using CTA. The magnitude of this probability indicates the viability of public transit serving a specified trip and depends on how favorable the trip's LOS attributes and trip-specific characteristics are to the rider. We chose to classify a transit-equivalent trip by its replaceability, categorized into two groups: replaced (R) trips and not-replaced (NR) trips. We initially assumed the threshold value distinguishing a trip being "replaced" (R) or "not replaced" (NR) to be 0.5, where all trips with P(Transit|CTA) < 0.5 were classified as not replaced and all trips with P(Transit|CTA) ≥ 0.5 were classified as replaced. However, following the first sensitivity analysis trial, we determined that trips with P(Transit|CTA) close to 0.5 switch between the R and NR groups. These trips are fuzzy and are not reliable indicators of true mode-choice behavior. Thus, we chose to implement a buffer, where trips with 0.45 < P(Transit|CTA) < 0.55 are removed and excluded from the summary statistics. This modification is represented by the conditional statement below.

For an individual ride-hailing trip, T,

\[
\text{Group}(T) =
\begin{cases}
\text{R}, & P(\text{Transit}\mid\text{CTA})_T \ge 0.55 \\
\text{NR}, & P(\text{Transit}\mid\text{CTA})_T \le 0.45 \\
\text{buffer (excluded)}, & 0.45 < P(\text{Transit}\mid\text{CTA})_T < 0.55
\end{cases}
\]
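
As a concrete illustration of the buffer rule, the following sketch classifies a trip from its P(Transit|CTA) value using the 0.45 and 0.55 thresholds stated above. The function name and the sample probabilities are hypothetical.

```python
from collections import Counter


def classify_trip(p_transit: float) -> str:
    """Return 'R', 'NR', or 'buffer' for a trip's P(Transit|CTA)."""
    if not 0.0 <= p_transit <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_transit >= 0.55:
        return "R"       # transit is a viable alternative
    if p_transit <= 0.45:
        return "NR"      # transit does not compete with the RH trip
    return "buffer"      # excluded from the summary statistics


# Illustrative probabilities only.
probabilities = [0.10, 0.45, 0.50, 0.55, 0.90]
group_counts = Counter(classify_trip(p) for p in probabilities)
```

Trips labeled `buffer` would then be dropped before computing group sizes and mean probabilities.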

Sensitivity of P(Transit|CTA)
We then chose to conduct a parametric sensitivity analysis of P(Transit|CTA) with respect to the following decision variables:

1. Transit stops per census tract (StopsinTract)
2. Household income (HHI)
3. Total travel time (TTT)
4. Walk time (WT)

P(Transit|CTA) was recalculated under 20 sensitivity conditions for each decision variable. Per variable, its observed value was adjusted in increments of 5%, ranging from -50% to +50%. Given that each variable was tested independently, there were a total of 80 trials. For each sensitivity condition, the algorithm was executed with the adjusted parameter, and a new P(Transit|CTA) was output for all trips and averaged per group (R/NR).

Assuming that both groups share the same standard deviation, we can estimate σ by calculating the pooled standard deviation, $s_p$, with the equation below. The pooled standard deviation for the observed and sensitivity condition data sets, for group R or NR, is:

\[
s_p = \sqrt{\frac{(n_{obs} - 1)\, s_{obs}^2 + (n_{sens} - 1)\, s_{sens}^2}{n_{obs} + n_{sens} - 2}} \tag{7}
\]

Where,
$n_{obs}$ = the number of trips in the observed group
$s_{obs}$ = standard deviation of the observed group
$n_{sens}$ = number of trips in the sensitivity group
$s_{sens}$ = standard deviation of the sensitivity group

It should be noted that $n_{obs}$ and $s_{obs}$ are fixed values under all sensitivity conditions.

To measure the level of influence and the statistical relationship between each decision variable and P(Transit|CTA), we performed a two-tailed pooled t-test. Considering there is no overlap between the observed and sensitivity condition data, the two-tailed test was most suitable. We performed the t-test per variable, sensitivity condition, and trip group (R and NR). The relationship between the t-statistic and the critical value indicates whether we reject or fail to reject the null hypothesis stated below:

\[
H_0: \mu_{obs} = \mu_{sens} \qquad H_a: \mu_{obs} \neq \mu_{sens}
\]

If the magnitude of the t-statistic is greater than the critical value (1.365), then we reject the null hypothesis and refer to the alternative hypothesis. The alternative hypothesis opposes the null by concluding that there is a statistically significant difference between the observed and the sensitivity condition data. That is, the influence of the decision variable on P(Transit|CTA) is expected to have an effect on the whole population, similar to the effect of the sensitivity condition.

The following equation was used to compute the t-statistic per variable and group for each sensitivity condition (Equation 8):

\[
t = \frac{\bar{P}_{obs} - \bar{P}_{sens}}{s_p \sqrt{\dfrac{1}{n_{obs}} + \dfrac{1}{n_{sens}}}} \tag{8}
\]

Where $\bar{P}_{obs}$ and $\bar{P}_{sens}$ are the mean P(Transit|CTA) of the observed and sensitivity groups, and the sensitivity sample consists of the trips of group g (R or NR), decision variable var, and percent-change condition %∆.
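
To illustrate the pooled t-test described above, the sketch below enumerates the 20 percent-change conditions per variable and computes a standard pooled two-sample t-statistic. It is a minimal example under stated assumptions: the function names are hypothetical, the probabilities are made up, and the formulas are the textbook pooled standard deviation and pooled t-statistic rather than a reproduction of the authors' code.

```python
import math
from statistics import mean, stdev

# Percent-change conditions: -50% to +50% in 5% steps, excluding the observed (0%) case.
CONDITIONS = [pct for pct in range(-50, 55, 5) if pct != 0]
VARIABLES = ["StopsinTract", "HHI", "TTT", "WT"]


def pooled_std(obs, sens):
    """Pooled sample standard deviation of the observed and sensitivity groups."""
    n_o, n_s = len(obs), len(sens)
    var_o, var_s = stdev(obs) ** 2, stdev(sens) ** 2
    return math.sqrt(((n_o - 1) * var_o + (n_s - 1) * var_s) / (n_o + n_s - 2))


def pooled_t(obs, sens):
    """Pooled two-sample t-statistic for the difference in group means."""
    s_p = pooled_std(obs, sens)
    return (mean(obs) - mean(sens)) / (s_p * math.sqrt(1 / len(obs) + 1 / len(sens)))


# Made-up probabilities for one group under one sensitivity condition.
observed = [0.60, 0.70, 0.80]
adjusted = [0.70, 0.80, 0.90]
t_stat = pooled_t(observed, adjusted)

# Each variable is tested independently under every condition.
n_trials = len(CONDITIONS) * len(VARIABLES)
```

In a two-tailed test, the absolute value of `t_stat` would be compared against the critical value for each variable, condition, and group.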

The following section introduces the results from the replaceability and sensitivity analyses for the four sensitivity variables: TTT, WT, HHI, and SiT.

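The P(Transit|CTA) estimation is a function of the abovementioned procedures and their respective outputs. To reduce the paper's length, we summarize each group in terms of size and P(Transit|CTA). For ease of recall, the group for a trip, T, is categorized by the following conditional:

\[
\text{Group}(T) =
\begin{cases}
\text{R}, & P(\text{Transit}\mid\text{CTA})_T \ge 0.55 \\
\text{NR}, & P(\text{Transit}\mid\text{CTA})_T \le 0.45 \\
\text{buffer (excluded)}, & 0.45 < P(\text{Transit}\mid\text{CTA})_T < 0.55
\end{cases}
\]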

Of the 7,949,902 trips output from the route analysis, approximately 8% (646,808 trips) had a probability lying within the buffer range of 0.45 < P < 0.55. Moving forward, we summarize findings in terms of the R and NR groups only.

Results from our sensitivity analyses were extensive and are summarized here to meet the submission guidelines of this paper.

For the aforementioned sensitivity parameters, t-tests were conducted at the 95% confidence level. Under each sensitivity condition, all variables exhibited t-values greater than the critical value. This implies that, at the 5% significance level, the adjustment of each of the four parameters has a statistically meaningful impact on P(Transit|CTA).

For each trip, the variable of interest was adjusted, P(Transit|CTA) was recalculated, and the trip was recategorized based on its magnitude. Thus, it is important to consider that for each percent-change condition, the samples for the R and NR trip groups will vary in size. Moreover, the means of both groups will fluctuate with the sensitivity condition.

We introduce stacked bar charts per sensitivity variable that depict the overall weighted mean P and trip count per sensitivity condition. Equation 10 is used to compute the total weighted mean P(Transit|CTA). The data labels (percentages) within each bar correspond to the percentage of total trips (nT) that each group contains. Per the legend, the blue portion of the stacked bar corresponds to the replaced (R) trips, whereas the orange portion corresponds to the not replaced (NR) trips. It should be clarified that these percentages are independent of the portion heights of the stacked bars.

In summary, the first three variables (TTT, WT, and HHI) exhibit an inverse correlation with P(Transit|CTA), whereas the remaining variable, SiT, has a positive relationship with P(Transit|CTA).

The first two variables are associated with travel time: WT and TTT. P(Transit|CTA) is more sensitive to adjustments in TTT than to adjustments in WT. The range of the weighted probability for TTT is approximately 0.20, as opposed to 0.10 for WT.
The relatively heightened sensitivity for TTT can be explained by TTT's formula and the nesting of WT in its value. Recall that the TTT is the sum of the IVTT, wait time, and walk time. That is, a 50% decrease in TTT includes a 50% decrease in IVTT and wait time in addition to a decrease in walk time, whereas a 50% decrease in WT does not include a reduction in IVTT and wait time. When the new TTT is calculated for a WT adjustment, it uses the observed wait time and IVTT but changes the WT. While these differences exist, they are relatively small in comparison to their distribution patterns and weighted probability values. In Figure 6 and Figure 7, the volumetric share between R and NR trips is indistinguishable.

When analyzing the WT, it is important to consider that a trip's replaceability is contingent on the person's capacity for physical exertion. For example, an elderly, disabled person may need to traverse two blocks to get to the grocery store. ArcGIS' Route Analysis program will likely output that walking is most efficient, although given the user's condition, walking is not an option. Additionally, the replaceability is influenced by the user's safety, which is dependent upon the perception of the route's surrounding physical environment(s). These conditions produce a bias to act more conservatively such that hazardous events are mitigated. All of these conditions, concerns, and exceptions cannot be explicitly accounted for in our model.

The next variable, average HHI, has less impact on P(Transit|CTA), per Figure 8. In existing literature, it was determined that ride-hailers exhibit demographic characteristics that are at variance with the average American. Ride-hailers were found to be more educated and to belong to a higher income class. Therefore, the use of the average HHI may undervalue that of the average ride-hailer, and the accuracy of these results could be challenged.
Nonetheless, the trend of HHI's effect on P(Transit|CTA) is transferable.

In contrast to the prior variables, the number of transit stops per census tract (SiT) exhibited a positive correlation with P(Transit|CTA), as illustrated in Figure 9. The number of transit stops in a network has many implications for operations and ridership. An increase in transit stops implies an increase in route LOS. As the distance between consecutive stops decreases, the average access and egress distance decreases. With this increase in accessibility, the volume of serviceable patrons increases.

However, there are caveats with the more extreme positive sensitivity conditions that are not captured in the utility model used. The addition of transit stops to existing routes must be optimized to account for added lost time. Delay can be incurred at every transit stop during the approach, boarding and dwelling, and exit. Boarding and dwell times can quickly accumulate during peak period hours when there are large platoons of entering riders and there is discontinuity in payment forms. Additionally, this is a consequence of increased ridership. The second source is called the 're-entry' delay; this is the time required for the driver to merge into oncoming traffic. For every additional transit stop, one re-entry delay is incurred per cycle. The summation of these delays per stop and per cycle can adversely affect the travel time between stops and the TTT of each rider. In summary, the addition of transit stops increases accessibility and, consequently, utility. However, designing additions to increase ridership must strategically consider the implications for existing travel times and LOS attributes.
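
Two calculations discussed in this section can be illustrated with a short sketch: a trip-count-weighted mean of the group probabilities, and the difference between adjusting TTT as a whole and adjusting only its walk-time component. Both functions are hypothetical. In particular, weighting by the R and NR trip counts and dividing by their sum is an assumption about how the weighted mean is formed, and all input values are made up.

```python
def weighted_mean_p(n_r: int, mean_r: float, n_nr: int, mean_nr: float) -> float:
    """Mean P(Transit|CTA) of the R and NR groups, weighted by trip count (assumed form)."""
    return (n_r * mean_r + n_nr * mean_nr) / (n_r + n_nr)


def adjusted_ttt(ivtt: float, wait: float, walk: float, pct_change: float, variable: str) -> float:
    """Total travel time after a percent change applied to TTT or to walk time (WT) only."""
    factor = 1 + pct_change / 100
    if variable == "TTT":
        # Scaling TTT scales IVTT, wait time, and walk time together.
        return (ivtt + wait + walk) * factor
    if variable == "WT":
        # Scaling WT keeps the observed IVTT and wait time unchanged.
        return ivtt + wait + walk * factor
    raise ValueError("variable must be 'TTT' or 'WT'")


# Made-up trip: 20 min in vehicle, 5 min waiting, 10 min walking (35 min total).
ttt_case = adjusted_ttt(20, 5, 10, -50, "TTT")
wt_case = adjusted_ttt(20, 5, 10, -50, "WT")
overall = weighted_mean_p(1, 0.8, 3, 0.2)
```

With these inputs, a 50% decrease in TTT yields 17.5 minutes, while a 50% decrease in WT yields 30 minutes, which is consistent with the larger sensitivity reported for TTT.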

The impact of ride-hailing services on the recent decline in public transit ridership has not been widely explored. The current body of research is constrained to empirical studies that vary in methodology and analyze relatively small samples. To our knowledge, no studies explore the research question using a massive dataset that contains individual trips over an extended period of time. Furthermore, our approach to exploring the research question is resourceful and novel. We define the replaceability of an RH trip through a series of spatiotemporal and mathematical analyses. First, the real-time transit-equivalent trip is computed using the GTFS-integrated ArcGIS Route Analysis. Then, the probability of choosing transit over all other alternatives is used to identify potentially viable transit-equivalent trips.

Our findings indicate that 31% of ride-hailing trips were executed where the transit alternative exhibited a competitive utility with respect to travel times, fare/expenses, and workload. Consequently, over the month of June, the total revenue lost from trips replaced by ride-hailing is estimated to be $6,114,450 (see footnote 4). If we assume that the percentage of replaced trips and the trip counts for each month can be represented by June 2019, then the total loss in fare revenue in Chicago over one year would be approximately 73 million dollars (12 × $6,114,450 ≈ $73.4 million). Further, the ramifications of the demand transfer to ride-hailing services are not fully represented by the loss in revenue. As such, public transit agencies should employ strategies to increase transit utility such that a significant portion of this estimate can be recovered.

Publicly available ride-hailing trip data will likely maintain its anonymity by recording origins and destinations as their census tract centroids.
Given that the precision is unlikely to increase, studies that are macroscopic and encompass all attribute types (temporal, spatial, monetary) should be executed. However, our methodologies and approach are feasible only for regions that mandate the submission of all ride-hailing trips and the corresponding spatial and temporal parameters of each origin and destination. Recording and releasing these data will enable institutions to publish research that provides a greater understanding of how ride-hailing impacts the transportation network and economy in varying geographic settings.

Moving forward, future research should focus on mode-choice behavior to thoroughly understand the factors favoring and disfavoring the use of transit. Replaced trips have comparable transit alternatives; hence, there exist preferences and attitudes, separate from LOS attributes, that favor ride-hailing. Regarding NR trips, transit agencies should turn inward and evaluate services, or the lack thereof, in the corresponding origin and destination zones.

This effort was funded by the Urban Mobility and Equity Center (UMEC).