Parameter Estimation of the Macroscopic Fundamental Diagram: a Maximum Likelihood Approach

This paper extends the Stochastic Method of Cuts (SMoC) to approximate of the Macroscopic Fundamental Diagram (MFD) of urban networks and uses Maximum Likelihood Estimation (MLE) method to estimate the model parameters based on empirical data from a corridor and 30 cities around the world. For the corridor case, the estimated values are in good agreement with the measured values of the parameters. For the network datasets, the results indicate that the method yields satisfactory parameter estimates and graphical ﬁts for roughly 50% of the studied networks, where estimations fall within the expected range of the parameter values. The satisfactory estimates are mostly for the datasets which (i) cover a relatively wider range of densities and (ii) the average ﬂow values at diﬀerent densities are approximately normally distributed similar to the probability density function of the SMoC. The estimated parameter values are compared to the real or expected values and any discrepancies and their potential causes are discussed in depth to identify the challenges in the MFD estimation both analytically and empirically. In particular, we ﬁnd that the most important issues needing further investigation are: (i) the distribution of loop detectors within the links, (ii) the distribution of loop detectors across the network, and (iii) the treatment of unsignalized intersections and their impact on the block length.


Background
The MFD estimation methods in the literature can in general be classified into 3 main 19 categories: (i) empirical, (ii) simulation-based and (iii) analytical studies. Although the 20 empirical method is the most reliable and accurate way to estimate the MFD, the required 21 data is not available for many cities and even when available it is subject to significant 22 errors and bias. The exorbitant dependency of empirical approach on the availability and 23 accuracy of network-wide data makes it an impractical option for estimation of MFD in 24 many cities. While simulation seems to be a promising tool to overcome the dependency of 25 empirical approach on data, calibration of detailed micro-simulation models for large-scale 26 urban networks is a cumbersome and prohibitive task. 27 The aforementioned drawbacks of the empirical and simulation-based MFD estimation 28 methods highlight the significance of the analytical approach to determine the network 29 MFD using network topology and control characteristics and find out the impact of each   Nevertheless, the MoC is proposed for homogeneous corridors with equal block lengths 16 and signal settings, which makes it complicated to be scaled up for the network MFD  This method results in normally-distributed stochastic cuts, q s (k); therefore, the yielding 32 upper-bound MFD found using this model will also be stochastic, whose CDF is formulated 33 in the study. The authors found that the shape of the MFD mainly depends on two dimen-  Castrillón and Laval (2018) revisits the MoC to include the impact of buses on the MFD 38 of homogeneous corridors. They found that presence of buses has two major impacts: (i) 39 "moving bottleneck effect" due to lower speed of buses compared to other vehicles and (ii) 40 reduction in capacity due to the "short block" effect. Each of these effects result in an 1 additional cut in the MoC providing a tighter upper bound for the corridor MFD.   The model assumes that all links follow a triangular FD with capacity Q and jam density 3 K. The FD has been normalized by expressing the flow q and density k in terms of Q and K, 4 respectively. Laval and Chilukuri (2016) shows that by applying a shear transformation to 5 the FD one can reach a symmetric FD that can help to simplify the analytical computations. 6 Laval and Castrillón (2015) articulates that if such transformation is applied to the link 7 FDs in a network, the resulting MFD of the network will also have a symmetric shape. This 8 density transformation is a mapping (q, k) → (q, k ) that keeps the same flow but transforms 9 the density to the transformed density k via the following equation: where the transformed density domain is k ∈ [−1/2, 1/2]. This transformation is illustrated 11 graphically in Fig. 1a. One of the main benefits of this transformation would be that as a 12 result of the transformation the empirical data fed to the MLE method will be parametric 13 itself, which will make finding the PDF of the given dataset more tractable providing the 14 method more degrees of freedom to seek for a better fit.

15
After calculating the cumulative distribution function (CDF) of cuts for each strategy, 16 s, the CDF of the corridor MFD, F q(k ) (q), is given by: where Ω = {s 1 , s 1 , s 2 , s 2 } denotes the set of different forward-moving (denoted as ·) and 18 backward-moving (denoted as ·) strategies, F s,k (q) is the CDF of cut s, s 0 is the stationary 19 cut and B is the minimum number of stationary cuts. It has been shown that the cuts can 20 analytically be determined by the mean block length, µ , the mean green, µ g , and the mean 21 red, µ r , times and their respective standard deviations δ , δ g and δ r , which for the sake of 22 further simplification of the problem are all assumed to be equal δ = δ g = δ r = δ. The authors further claim that the shape of the MFD is mainly influenced by two di-1 mensionless parameters: (i) the mean block length to critical length ratio λ = µ µ * , and (ii) 2 the mean red to green signal time ratio ρ = µr µg . The critical block length is the minimum 3 block length to avoid the short block effect; see Fig. 1b. As seen in this figure, the critical 4 block length can be computed as: where k c is the critical density. If the dimensionless forms of flow and density are used in 6 the equation above, one will have: Knowing the values of the required parameters for a corridor or network, one can com-8 pute the cut CDFs and eventually determine the corresponding CDF of the MFD. However, 9 gathering the data required to find the model parameters is not a straightforward task. In 10 many cities the data is not readily available due to the lack of measurement equipment and 11 even when it is available, it is prone to different biases and measurement errors. Computing 12 the parameters becomes more challenging if the method is applied for network MFD esti-13 mation due to the higher complexity and heterogeneity of the network topology and control 14 settings. We will delve into these challenges in more detail later in the paper. Here, we 15 will present a method to statistically estimate the parameters by applying the maximum  is not true for the signal timing parameters and especially for the mean red to green signal 30 time ratio, ρ, which is one of the most influential parameters impacting the shape of the data is available for all approaches of all intersection across the network, assuming that most 36 signals have only two phases, the sum of green times will almost be equal to the sum of red 37 times (neglecting the lost times) at the approach, intersection and network level, therefore 1 resulting in ρ ≈ 1.
2 Nevertheless, in many cities there are intersections with protected left turn phases or 3 more than two phases, which will make the mean red time higher than the mean green time 4 and ultimately ρ >1. To gain a better understanding of how the presence of traffic signals 5 with any general number of phases, n i , impacts the value of ρ, let us first focus on a single 6 phase of the intersection. By definition, one can find the mean red to green signal time ratio 7 for a phase, ρ j , as: where g j is the average green time for phase j and g k represents the average green time for 9 the rest of the phases, k.

10
The value of ρ in the model is in fact an indicator of the available intersection capacity.

11
Higher ρ values mean that more time is assigned to the red phase compared to the green 12 time and less intersection capacity is being utilized, while lower ρ values would mean higher 13 utilization of the cycle time and intersection capacity. Therefore, the rational way to compute 14 the average ρ i for an intersection, i, is to take a weighted average of the values for all phases in 15 the signal timing with respect to their respective demand or flow. Without loss of generality, 16 one can assume that the signal timing is efficiently programmed to be proportionate to the 17 demand values at each phase. Hence, the weight of each phase, w j , can be found as: (weighting factor of each phase) where f j is the demand or flow of each phase j and C is the total cycle time for the 19 intersection. Now, the mean red to green time ratio for intersection i, ρ i , can be computed 20 as: Having obtained a simple expression for ρ i , the mean red to green signal time ratio for 22 the entire network with N intersections can be calculated as: wheren is the average number of signal phases of the intersections in the network. Now 24 that we have found how all the model parameters can be determined to approximate the 25 network MFDs, we can proceed to estimation of model parameters using empirical data. Once the probability distribution of a dataset is known, we can estimate the population 2 parameters using the distribution fitting methods such as the method of moments or the 3 maximum likelihood estimation (MLE) method. In this study, we will use the MLE due 4 to its large-sample or asymptotic properties to estimate the analytical MFD given by the Let the probability density function (PDF) of the MFD, which can be derived as the 10 derivative with respect to q of the CDF given in Eq.
(3), be denoted as f (x|Θ). The 11 likelihood function for a given dataset X is the joint density of n independent and identically 12 distributed (i.i.d.) observations, which is the product of their individual densities: where Θ is the unknown parameter vector. Considering that some of the model parameters 14 presented in Table 1 are dependent and can be derived using the other parameters, we choose 15 Θ = {Q, K, θ, µ , µ g , ρ, δ} as the independent parameter vector. It is worth noting that the 16 likelihood function is a function of the parameters conditioned on the data whereas the joint 17 density is a function of the data conditioned on the parameters. In the MLE method, we aim 18 to find the parameter values that maximize the likelihood function for the observed data.

19
However, for the sake of more convenience in the computations, it is preferred to maximize 20 the natural logarithm of the likelihood function, l:  Knowing the distribution of the estimators will enable us to perform hypothesis testings for 33 the estimated values. MFDs for the networks in the dataset and then apply the proposed estimation method to 10 approximate the analytical MFD and estimate the network topology and control parameters.

11
However, the provided raw data needs to be cleaned and processed to produce valid empirical 12 MFDs, which will be elucidated in the ensuing subsection for the sake of reproducibility of 13 the research. with valid data for more than 80% of the time intervals.

40
• Moreover, we only keep the intervals which have valid data from more than 80% of 41 the loop detectors selected in the previous step.

42
After cleaning and processing the raw data as explained in the steps above, we obtain 1 the average flow versus average occupancy MFDs for 30 cities as shown in Fig. 2. One of 2 the most conspicuous issues observed in these empirical MFDs is that the resulting diagrams 3 only cover a very limited range of occupancy (density), resulting in a lack of the capacity 4 state and the descending congested branch of the MFD for most cases. This deficiency in the 5 empirical data might impact the performance of the proposed parameter estimation method 6 and its results, which will be discussed further in the next section.  these constraints for all model parameters are given in Table 2, which have been selected 25 in a way to provide a large feasible space for the parameters so that the optimizer can find 26 optimum parameters inside the parameter space. In addition to the parameter constraints 27 given in Table 2, another constraint have also been imposed to keep the free-flow speed, u, 28 less than 60 km/h.

29
Although the CMLE problem will ensure that the estimated values are in plausible   space. In such case, the first derivative of the log-likelihood function with respect to the 1 parameter on the edge of the parameter space is not necessarily equal to zero, therefore the 2 approximate multivariate normal distribution for the parameters will be invalid. Therefore, 3 we should be careful about the selection of the boundaries of the parameter space to prevent 4 this problem from happening.

5
However, as it will be discussed in the next section, hitting the boundaries in the present 6 problem might be inevitable due to high number of parameters needed to be estimated, the

14
In this section, we will estimate the analytical MFD parameters using the proposed 15 MLE method for two different sets of empirical MFD data. The first set is the data from a  Table 2 has been utilized to find the optimum parameter value     Presence of buses is known to lower the capacity due to their lower speed and frequent stops, the discrepancy. Taking all into account, we conclude that the SMoC can capture the shape 10 of the corridor MFD satisfactorily provided that the parameters are carefully estimated.

11
Our proposed method is a step in that direction.      The estimated ρ values are all between 1 and 2, indicating that the average number 8 of signal phases for the studied networks is between 2 and 3 based on the formulation 9 presented in Section 3. Unfortunately, we do not have any information on the signal timing 10 configurations for the studied network, but the estimation results sound reasonable based 11 on what we can expect of the real-life configurations in urban network with many signals 12 having more than 2 phases.

13
Out of the estimated parameters, we can only obtain information on the link lengths 14 inside the analysis area of each network to compute the average block lengths, µ , using the 15 provided data. Fig. 5 illustrates the 95% confidence intervals for the average block length 16 estimations along with the measured average block length using the available data for the 5 17 networks presented in Fig. 4. Although the graphical fits look up to scratch, the measured in the analytical model. All of these issues will be discussed in further detail in the next 1 section.

2
The next subsection will present another set of estimation results assuming that the 3 values for the average block length and jam density parameters are already known for all 4 networks and the method only will estimate the 5 remaining model parameters to fit the 5 analytical MFD to the empirical data.  Among the SMoC model parameters, the only parameter that we can readily compute using 10 the available data is the actual average block length, . For each network, can been 11 computed as the average length of unique links with valid loop detectors. 12 The jam density can be defined as the traffic density when all the vehicles are jammed 13 on the road bumper-to-bumper. Assuming that the average vehicle length across different 14 urban networks does not vary significantly, the jam density can be considered as a universal 15 constant that does not vary from a link to another link or a network to another network. 16 Considering that average vehicle length is about 5 meters and allowing extra spacing about 17 1 meter between the vehicles, the universal jam density would approximately be 160 vehicles 18 per kilometer.  Figure 6: Estimated analytical MFDs and cuts using only 5 model parameters for the same networks as Fig.  4 displayed in the normalized flow units with respect to the estimated link capacity, Q, and the transformed and normalized density, k , as discussed in subsection 3.1. Note that the yellow markers show the empirical MFD points and the blue markers represent the hypothetical congested branch mirrored with respect to the line k = 0.5. The solid black line is the 50% CDF and the shaded area shows the 16% and 84% CDF limits for the corresponding density values.   Table 5, where all the newly-estimated parameter values are statistically significant.

27
The lack of a bending trend in the empirical MFD will prevent the proposed method 28 from accurately locating the MFD cuts and estimating their associated parameter values.

29
Therefore, the significance of many parameters in the model will drop and in order to max-   Zurich network is illustrated in Fig. 10. This figure shows a section of the Zurich network 12 where there are six unsignalized intersections with stop signs on the minor approaches along 13 a link between two signalized intersections. 14 We can safely assume that the incoming vehicles from the minor approaches, which 15 have to stop behind the stop signs, can enter the link only if there is sufficient headway 16 in the major link stream. Hence, they will not affect the flow of the vehicles that were for several datasets. 25 We have found that to extend the SMoC to estimate the analytical network MFDs, we parameters for networks are also discussed in this paper. However, reaching clear methods 12 to tackle these issues needs further consideration using additional empirical data, which are