The Concept of Comprehensive Data Analysis from Ultra-Wideband Subsystem for Smart City Positioning Purposes

In this article, the authors present a comprehensive analysis of movement data from a positioning system based on ultra-wideband (UWB) technology. For this purpose, a test was carried out in which a car equipped with cruise control travelled a given path at speeds from 10 km/h to 60 km/h. The obtained motion data (position information) were processed with a series of filters, from fundamental filters with a variable window (median, moving average, Savitzky-Golay) to more complex ones such as the Wiener and Kalman filters. As a result, the authors propose a form of data analysis and filtration that depends on the speed of the moving object. In addition, the maximum accuracy that can be obtained for a given traffic model was determined. The research shows that a UWB-based system can be used to position objects in smart city urban applications, in Industry 4.0 applications, and for positioning autonomous vehicles both in urban traffic and on highways, for example to maintain the cohesion of vehicle convoys.


Introduction
Determining the position of objects is now a very important function of many industrial and urban applications. Many technologies are used for this task, and they can be divided in several ways, for example into Indoor Positioning Systems (IPS) and Outdoor Positioning Systems (OPS) [1]. Some technologies are closely tied to a specific application, for example the global positioning system (GPS) in OPS applications [2][3][4] or positioning using Radio-frequency Identification (RFID) tags in IPS applications [5,6]. Other technologies, because of their accuracy, play an assisting role (e.g. Microelectromechanical systems (MEMS) [7], Wi-Fi [8][9][10] or GSM [11] as GPS assist systems). Finally, there are attempts to use various other systems for localization, such as Bluetooth [12][13][14][15], Zigbee [16,17], infrared [18] and vision systems [19][20][21]. Most of the available systems are implemented in IPS, and their use outside closed spaces faces many limitations, such as operating range (Bluetooth, Zigbee, infrared), stability (MEMS) or required technical facilities (vision systems).
A growing society needs precise location information, especially in heavily urbanized spaces, where the GPS signal is disturbed by buildings. Industry, in turn, seeks to maximize profits through the optimization and automation of production processes. Meeting these requirements calls for a single, universal system that provides a precise position in both IPS and OPS applications. Ultra-wideband (UWB) technology can be the answer to this demand: although it is already well known and described for IPS applications [1,22,23], it is still waiting to be explored in outdoor applications.
Although UWB technology is the subject of many cross-cutting works [1], none of them describes how the technology performs for objects moving at high speed, so the authors decided to carry out research to assess whether using it in such applications is sensible. An additional motivation was the fact that, at the current level of development of this technology, automotive companies are already researching the optimal placement of UWB antennas on cars [24], which in the future is to be applied to hands-free opening of vehicles. Manufacturers of UWB systems perform speed-dependent accuracy tests themselves, but they focus mainly on IPS applications. An example is a leading producer of UWB modules and its tests of DWM1000 modules, which showed an accuracy of 10 cm at speeds up to 5 m/s (18 km/h) [25]. It was therefore intriguing whether this technology can be applied to the positioning of moving vehicles, and what performance can be expected.
Accordingly, the authors carried out research to determine whether UWB technology can be used to position vehicles moving at regular urban speeds, and which data processing path should be chosen so that the positioning accuracy is as high as possible. The work was intended to answer whether UWB is a technology that meets the positioning demands of areas such as smart city, autonomous vehicles and Industry 4.0.

Data analysis process
The tag position was determined in several steps of data acquisition and signal processing. At the hardware level there are several ways to obtain a distance from the travel time and propagation velocity of the electromagnetic wave [23]. In the presented system, TDoA (time difference of arrival) was used (see Section 2.2). Figure 1 presents a data flow diagram showing the successive steps in the analysis of the data obtained from the hardware. The data frame received from the hardware included not only the distances to particular nodes but also a timestamp, a quality factor and a received signal strength indicator. Therefore, in the first step the distance information had to be extracted for further analysis.
Figure 1. Data flow in the system, from hardware to accuracy calculation.
The data buffer contained the currently analysed distances from the ranging system (a set of four distances, one per anchor). In practice, it was a FIFO (first in, first out) queue whose size equalled the window width used. Raw distances can be burdened with an error that depends on the distance between nodes. This type of error was reduced (block B) using a distance-dependent function (the shape of the function and its arguments were adjusted in the background phase). The raw data should also be analysed with respect to random system error. Outliers, i.e. data distant from the other observations in the window, were detected (block C) using one of: a multiple of the standard deviation; the maximum difference between values in the window allowed by the velocity of the object; or a combination of both. Detected outliers can be corrected using the mean or the median value (block D). After these prefiltration steps the position of the object was calculated (block E). The main data filtration (block F), using mean, median, Savitzky-Golay, Kalman or Wiener filters, was performed on the object position in each axis separately.
Statistical measures, such as RMSE (root mean square error), were used to check whether the filtration process improved the position (block G).

Ranging algorithm and device
The distance between particular nodes was calculated from the travel time of a radio signal and the propagation velocity of the electromagnetic wave. The communication process is presented in Figure 2. The obtained wave propagation time was converted to the distance between the nodes, i.e. the tag and a particular anchor. The time of flight can be calculated using equation (1). Once the time of flight was known, the distance between the tag and a particular anchor could be calculated using equation (2), $d = c \cdot t$,
where $d$ is the distance between the nodes, $c$ is the speed of electromagnetic wave propagation, and $t$ is the wave propagation time between the nodes.
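The conversion from propagation time to distance can be illustrated with a minimal sketch. The function name is hypothetical, and the propagation speed is assumed here to be the vacuum speed of light; the value used by the actual device firmware is not stated in the text.

```python
# Assumption: propagation speed taken as the speed of light in vacuum.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def tof_to_distance_cm(time_of_flight_s):
    """Convert a one-way time of flight in seconds to a distance in centimetres."""
    distance_m = SPEED_OF_LIGHT_M_S * time_of_flight_s
    return distance_m * 100.0


if __name__ == "__main__":
    # A one-way flight time of 10 ns corresponds to roughly 3 m.
    print(tof_to_distance_cm(10e-9))
```

The example also shows why timing precision matters: an error of 1 ns in the measured time shifts the distance by about 30 cm.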
The test stand consisted of a UWB tag, a movable device that processes signals and calculates distances, and four anchors, which are stationary reference points, as presented in Figure 3. All of these devices were based on DWM1000 modules manufactured by DecaWave, compatible with the IEEE 802.15.4-2011 standard. According to the manufacturer's declaration, the system was designed for building real-time (RT) indoor positioning systems (IPS). It makes it possible to localize objects with 10 cm accuracy at a maximum moving speed of up to 5 m/s (in IPS conditions). This technology also provides high-speed data transmission of up to 6.8 Mb/s [25]. The system (the localized object) returned distances in centimetres (using TDoA) between the tag and the successive anchors.
The four anchors defined the coordinate system, and the entire test stand is shown in Figure 4. The reference points (anchors) were placed on the ground, on a square with a side of 5 m, and a car moved from right to left through the centre of this square; the track of the object is marked in Figure 4 with a dashed arrow. The experiment was performed for speeds ranging from 2.78 m/s (10 km/h) up to 16.67 m/s (60 km/h) in steps of 2.78 m/s (10 km/h). The movement was carried out in a straight line, but it must be noted that the vehicle was driven by a human driver, and the human factor influenced the exact trajectory. The UWB node (the tag) was placed in the passenger-side glove compartment, and two people sat in the front seats during the run. Electronic devices in the car, such as the radio, chargers and mobile phones, were turned off.

Data interpretation
As a result of data acquisition in the different test scenarios, six measurement series (for different speeds) were recorded; see Table 1. Each record included a quality indicator for all anchors (unitless, with values between 0, the worst quality, and 1, the best quality).

Position determination
The system (the tag, via the serial COM port) returned distances in centimetres between the tag and the successive anchors (as shown in the datagram). The trilateration algorithm was applied to convert the distance data to a position in the created coordinate system.
The principle of operation follows from fundamental geometry, and the main idea is depicted in Figure 5 [26]. Three reference points (anchors) were selected from all those available, for example anchors 1, 2 and 3. The positions of the anchors were known in three dimensions, $(x_i, y_i, z_i)$ for $i = 1, 2, 3$, as were the distances $d_1$, $d_2$ and $d_3$ from the tag to each anchor. Equation (3) can be rearranged as (4) or written in matrix representation (5).
In our case the three anchors did not lie on a straight line, so $\operatorname{rank}(A) = 3$ and $\dim(\operatorname{Kern}(A)) = 1$. The general solution of (6) is given by equation (8),
where $t$ is a real parameter, $x_p$ is a particular solution of equation (8) and $x_h$ is a solution of the homogeneous system of equations (9), i.e. a basis of $\operatorname{Kern}(A)$.
The vectors $x_p$ and $x_h$ can be determined using the Gaussian elimination method. The particular solution can also be obtained using the pseudoinverse of the matrix $A$. To determine the parameter $t$, proceed as in (10).
After applying the constraint on the solution vector, (12) follows, and thus (13).
This is a quadratic equation of the form $a t^2 + b t + c = 0$, with the solution given by (14).
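For orientation, the algebraic steps referred to above can be written out as follows. This is a reconstruction under the assumption that the derivation follows the usual algebraic formulation of trilateration, which is consistent with the rank and kernel statements in the text. The symbols (anchor coordinates $x_i, y_i, z_i$, distances $d_i$, the right-hand side $\mathbf{r}$, the vectors $\mathbf{x}_p$, $\mathbf{x}_h$ and the coefficients $a$, $b$, $c$) are chosen for illustration and may differ from the original notation and equation numbering.

```latex
\begin{align*}
% Sphere equations for the three anchors, i = 1, 2, 3
(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 &= d_i^2 \\
% Expanded form: linear in (x^2 + y^2 + z^2), x, y, z
(x^2 + y^2 + z^2) - 2 x_i x - 2 y_i y - 2 z_i z &= d_i^2 - x_i^2 - y_i^2 - z_i^2 \\
% Matrix form with the unknown vector \mathbf{x} = (x_0, x_1, x_2, x_3)^T
A \mathbf{x} = \mathbf{r}, \quad
A = \begin{pmatrix}
1 & -2x_1 & -2y_1 & -2z_1 \\
1 & -2x_2 & -2y_2 & -2z_2 \\
1 & -2x_3 & -2y_3 & -2z_3
\end{pmatrix}, \quad
\mathbf{x} &= \begin{pmatrix} x^2 + y^2 + z^2 \\ x \\ y \\ z \end{pmatrix}, \quad
r_i = d_i^2 - x_i^2 - y_i^2 - z_i^2 \\
% General solution: particular solution plus a multiple of the kernel basis
\mathbf{x} &= \mathbf{x}_p + t\,\mathbf{x}_h \\
% Constraint linking the first component to the other three
x_0 &= x_1^2 + x_2^2 + x_3^2 \\
% Substituting the general solution into the constraint
x_{p0} + t\,x_{h0} &= \sum_{j=1}^{3} \left( x_{pj} + t\,x_{hj} \right)^2 \\
% Resulting quadratic equation in t
a t^2 + b t + c &= 0, \quad
a = \sum_{j=1}^{3} x_{hj}^2, \quad
b = 2 \sum_{j=1}^{3} x_{pj} x_{hj} - x_{h0}, \quad
c = \sum_{j=1}^{3} x_{pj}^2 - x_{p0} \\
t_{1,2} &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \qquad
\mathbf{x}_{1,2} = \mathbf{x}_p + t_{1,2}\,\mathbf{x}_h
\end{align*}
```

The two roots correspond to the two intersection points of three spheres, which is why a fourth anchor or a known range of coordinates is needed to select a single position.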
The solutions of the equation system (8) are given by (15). When positioning with 3 anchors, the position of the tag can be represented by either of the two solutions, depending on the expected range of positions along the x, y and z axes. The applied multilateration algorithm does not require any additional range selection, because the additional reference point makes it possible to identify the point in three-dimensional space. However, in further considerations the three-dimensional position was projected onto a two-dimensional top view of the road on which the object moved (by omitting the height data), due to the nature of the problem and because the height information provides no relevant data for traffic along a line.
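As a complementary illustration, the sketch below estimates a two-dimensional position from four anchor distances. It uses a linearised least-squares formulation, not the kernel-based algebraic solution described above, and it ignores the height of the tag, in line with the two-dimensional projection. The anchor coordinates and the function name are hypothetical.

```python
import math


def multilaterate_2d(anchors, distances):
    """Estimate (x, y) from anchor positions and measured distances.

    Each circle equation is subtracted from the first one, which gives a
    linear system in x and y that is solved in the least-squares sense
    through the 2x2 normal equations.
    """
    (x1, y1), d1 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x1), 2.0 * (yi - y1)))
        rhs.append(d1 ** 2 - di ** 2 + xi ** 2 - x1 ** 2 + yi ** 2 - y1 ** 2)

    # Entries of the normal equations (A^T A) p = A^T r.
    s_aa = sum(a * a for a, _ in rows)
    s_ab = sum(a * b for a, b in rows)
    s_bb = sum(b * b for _, b in rows)
    t_a = sum(a * r for (a, _), r in zip(rows, rhs))
    t_b = sum(b * r for (_, b), r in zip(rows, rhs))

    det = s_aa * s_bb - s_ab ** 2
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    x = (s_bb * t_a - s_ab * t_b) / det
    y = (s_aa * t_b - s_ab * t_a) / det
    return x, y


if __name__ == "__main__":
    # Hypothetical layout: anchors at the corners of a 500 cm square.
    anchors = [(0.0, 0.0), (500.0, 0.0), (500.0, 500.0), (0.0, 500.0)]
    true_position = (200.0, 300.0)
    distances = [math.hypot(true_position[0] - ax, true_position[1] - ay)
                 for ax, ay in anchors]
    print(multilaterate_2d(anchors, distances))
```

With noisy distances the four circles no longer meet in one point, and the least-squares solution returns the position that best fits the linearised equations.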

Outlier recognition and removal
As mentioned above, the distances to a particular anchor (generally after correction of the distance-related error) should be analysed to find random system error. Outliers in this system were detected (Figure 6) within the window using the mean value of the distances without the i-th sample and an error factor that indicates the maximum possible shift of the object (16).
The error factor depends on the speed of the object, the time difference and a scaling factor. When an outlier was recognized, the distance value could be replaced by the median or mean value in the window.
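The outlier test described above can be sketched as follows. The exact form of the error factor is an assumption here: it is taken as the product of the scaling factor, the object speed and the time difference. The function name, the default scaling factor and the choice to compute the replacement from the remaining samples of the window are also illustrative.

```python
from statistics import mean, median


def correct_outliers(window, speed, dt, k=3.0, use_median=True):
    """Return a copy of the window with outliers replaced.

    A sample is flagged when it differs from the mean of the remaining
    samples by more than k * speed * dt (assumed form of the error factor).
    Units must be consistent, e.g. cm, cm/s and s.
    """
    limit = k * speed * dt
    corrected = list(window)
    for i, value in enumerate(window):
        others = window[:i] + window[i + 1:]
        if abs(value - mean(others)) > limit:
            # Replace by the median or mean of the remaining samples.
            corrected[i] = median(others) if use_median else mean(others)
    return corrected


if __name__ == "__main__":
    # Distances in cm, speed 278 cm/s (10 km/h), hypothetical dt of 0.1 s.
    print(correct_outliers([100.0, 101.0, 250.0, 102.0, 103.0],
                           speed=278.0, dt=0.1, k=2.0))
```

Note that a single large outlier also shifts the leave-one-out mean seen by its neighbours, so the scaling factor has to be large enough to avoid flagging valid samples.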

Performed filtration
During the research, different types of filtration methods were applied to the data after the trilateration process, with different window sizes (3, 5, 9, 11, 13 and 15). Median filtration over a window of $n$ samples, where the sample index is $i = 1, \ldots, N$ and $n \le N$, is presented in equation (17).
In this case, the window size was always an odd number, so the form $n = 2m + 1$ was used. Another filtration method was the moving average filter. This type of filtration reduces the amplitude of random outliers very well (18).
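A minimal sketch of the two windowed filters is given below. It assumes a centred window of odd size and shortens the window at the ends of the series; how the original processing treated the edges is not stated, so this edge handling and the function names are assumptions.

```python
from statistics import mean, median


def sliding_filter(samples, window, reducer):
    """Apply reducer (e.g. median or mean) over a centred odd-size window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window size must be a positive odd number")
    half = window // 2
    filtered = []
    for i in range(len(samples)):
        lo = max(0, i - half)                 # window shrinks at the edges
        hi = min(len(samples), i + half + 1)
        filtered.append(reducer(samples[lo:hi]))
    return filtered


def median_filter(samples, window):
    return sliding_filter(samples, window, median)


def moving_average(samples, window):
    return sliding_filter(samples, window, mean)


if __name__ == "__main__":
    data = [1.0, 2.0, 30.0, 4.0, 5.0]
    print(median_filter(data, 3))
    print(moving_average(data, 3))
```

On the spike at the third sample, the median filter removes the outlier entirely, while the moving average only reduces its amplitude and spreads it over the neighbouring samples.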
The next filtration method was the Savitzky-Golay filter, which is based on polynomial approximation by the linear least squares method. This filter fits a polynomial of a given degree to the data in the window so that the sum of squared differences between the values of the approximating function and the samples of the input signal is as small as possible within the window (19).
The new filtered value was calculated using the coefficients of the approximating function (20).
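For a fixed window and polynomial degree, the least-squares fit reduces to a weighted sum with constant coefficients. The sketch below uses the well-known coefficients for a 5-point window and a quadratic (or cubic) polynomial; it is only an illustration for one window size, not the variable-window implementation used in the study, and the edge samples are left unfiltered by assumption.

```python
# Savitzky-Golay smoothing coefficients for a 5-point window and a
# quadratic/cubic polynomial: (-3, 12, 17, 12, -3) / 35.
SG_COEFFS_5 = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM_5 = 35.0


def savgol_5pt(samples):
    """Smooth a sequence with the 5-point quadratic Savitzky-Golay filter."""
    filtered = list(samples)
    for i in range(2, len(samples) - 2):
        window = samples[i - 2:i + 3]
        weighted = sum(c * v for c, v in zip(SG_COEFFS_5, window))
        filtered[i] = weighted / SG_NORM_5
    return filtered


if __name__ == "__main__":
    print(savgol_5pt([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]))
    print(savgol_5pt([0.0, 0.0, 35.0, 0.0, 0.0]))
```

Because the fitted polynomial is quadratic, a purely quadratic input passes through unchanged, whereas an isolated spike is attenuated.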
Kalman filtering (KF), also known as linear quadratic estimation (LQE), is a recursive algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone [27,28]. The Kalman filter model assumes that the true state at time $k$ evolves from the state at time $k - 1$ according to (21).
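The recursive predict and update cycle can be illustrated with a scalar Kalman filter applied to one position axis. The sketch assumes a random-walk state model (the state is expected to stay the same between samples, with process noise); the state model, noise variances and function name are assumptions and not the parameters used in the study.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_estimate=None, initial_var=1.0):
    """Filter a sequence with a scalar Kalman filter (random-walk model)."""
    if not measurements:
        return []
    estimate = measurements[0] if initial_estimate is None else initial_estimate
    variance = initial_var
    filtered = []
    for z in measurements:
        # Prediction: the state is unchanged, its uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


if __name__ == "__main__":
    noisy = [10.2, 9.7, 10.4, 9.9, 10.1, 15.0, 10.0]
    print(kalman_1d(noisy, process_var=0.01, measurement_var=1.0))
```

A small process variance relative to the measurement variance gives strong smoothing but a slower response to real movement, which is the same accuracy versus delay trade-off discussed for the windowed filters.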
The classical form of a linear discrete KF is given by the prediction step shown in (22).
Because the data are characterized by a normal (Gaussian) distribution of samples, it was decided to also use the Wiener filter. The Wiener filter [29] estimates the local mean (24) and variance (25) around each sample, both taken over the N-by-M local neighbourhood of that sample. Because the filtered data form a vector of the window width (1-by-N), the above equations can be simplified as in (26). The output of the filter is given by formula (28), which involves the variance of the noise. The neighbourhood used in Wiener filtration is [5 1], which corresponds to a window width of 5 samples.
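The adaptive Wiener filter described above can be sketched in one dimension as follows. The sketch assumes the common local-statistics form, in which each sample is pulled towards the local mean by an amount that depends on the ratio of the noise variance to the local variance. The noise variance defaults to the average of the local variances, and the window is shortened at the edges; both choices, and the function name, are assumptions.

```python
from statistics import fmean, pvariance


def wiener_1d(samples, window=5, noise_var=None):
    """Adaptive Wiener filter based on the local mean and local variance."""
    if not samples:
        return []
    half = window // 2
    n = len(samples)
    means, variances = [], []
    for i in range(n):
        local = samples[max(0, i - half):min(n, i + half + 1)]
        means.append(fmean(local))
        variances.append(pvariance(local))
    if noise_var is None:
        # Assumption: noise variance estimated as the mean local variance.
        noise_var = fmean(variances)
    filtered = []
    for x, mu, var in zip(samples, means, variances):
        if var <= noise_var:
            # Local variation is explained by noise: output the local mean.
            filtered.append(mu)
        else:
            filtered.append(mu + (var - noise_var) / var * (x - mu))
    return filtered


if __name__ == "__main__":
    data = [10.0, 10.3, 9.8, 10.1, 14.0, 10.2, 9.9, 10.0]
    print(wiener_1d(data, window=5))
```

Where the local variance is large compared with the noise, the sample is kept almost unchanged; where it is small, the filter behaves like a moving average.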

Evaluation of the results
The obtained results were evaluated using several statistical parameters. The first of these was the RMSE (root-mean-square error) (29).
It makes it possible to determine how much the distribution of points differs from the expected line of travel for each measurement series.
The mean absolute error (MAE) of position with respect to the estimated route was also considered (30).
This value shows how far, on average, the whole run deviates from the expected route. Another value considered was the maximum deviation of a sample in each measurement series, expressed by formula (31). This metric shows how the applied filtration affects the maximum error of the samples relative to the expected points, i.e. the maximum error that can be expected when moving along a straight line at the declared speed.
Each measurement series contained a different number of samples collected during the movement of the object, because the area of movement was limited and the vehicle speed differed between series. To compare the results of the filtration methods between series recorded at different speeds (and therefore with different numbers of samples), the standard error (SE) was used.
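The four measures can be computed from a list of per-sample deviations from the expected route, as in the sketch below. The standard error is assumed here to be the sample standard deviation of the deviations divided by the square root of the number of samples; the exact definition used in the study is not recoverable from the text, and the function name is illustrative.

```python
import math
from statistics import fmean, stdev


def error_metrics(errors):
    """Return RMSE, MAE, maximum deviation and standard error of deviations."""
    n = len(errors)
    if n < 2:
        raise ValueError("at least two samples are required")
    abs_errors = [abs(e) for e in errors]
    squared = [e * e for e in errors]
    return {
        "rmse": math.sqrt(fmean(squared)),
        "mae": fmean(abs_errors),
        "max": max(abs_errors),
        # Assumed definition: sample standard deviation / sqrt(n).
        "se": stdev(errors) / math.sqrt(n),
    }


if __name__ == "__main__":
    # Hypothetical deviations from the expected line, in cm.
    print(error_metrics([3.0, -4.0, 0.0, 5.0]))
```

Because the standard error shrinks with the square root of the number of samples, it allows series of unequal length to be compared, which is the reason given above for using it.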

Position delay
It has to be taken into consideration that data acquisition, data analysis and the use of windowed filters introduce delay. The delay is influenced by the window size (33) as well as by the time needed for data acquisition and analysis (34). The maximum delay produced by the system (expressed as a travelled distance), which depends on speed and window size, is therefore given by formula (35). Table 2 presents how the maximum delay changes with window size and speed. Delays increase linearly with window size (Figure 7). For a real-time locating system (RTLS), an additional subsystem with a high data acquisition frequency will be necessary if the delay of using the UWB system alone is too high. The delay values and information about the direction of movement (obtained e.g. from MEMS sensors [7]) can be used to predict the position between successive positions from the UWB system.
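The relation between speed, window size and delay can be illustrated with a small sketch. It assumes that the delay time is the window size multiplied by the sampling period plus a fixed acquisition and processing time, and that the delay distance is this time multiplied by the speed. This form is an assumption that is consistent with the linear growth described above; the function name, sampling period and processing time are hypothetical, not values from the study.

```python
def delay_distance_cm(speed_kmh, window, sample_period_s, processing_time_s=0.0):
    """Distance travelled (cm) while a window of samples is collected.

    Assumed model: delay time = window * sample_period + processing_time.
    """
    speed_m_s = speed_kmh / 3.6
    delay_s = window * sample_period_s + processing_time_s
    return speed_m_s * delay_s * 100.0


if __name__ == "__main__":
    # Hypothetical sampling period of 10 ms, no extra processing time.
    for window in (3, 5, 9, 15):
        print(window, delay_distance_cm(60.0, window, 0.01))
```

Under this model, doubling either the speed or the window size doubles the distance travelled before a filtered position becomes available.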

Results and discussion
The mean absolute error of position, maximum deviation, root mean square error and standard error are presented in Table 2. The values obtained for the raw data, expressed as distances, were adopted as reference data for further calculations. The results deviate significantly from the values declared by the manufacturer (Section 1). This can be caused by the placement of the antennas in the vehicle, by disturbances in its surroundings, and by the speed of the vehicle.
The first suggested data processing methods were Distance Dependent Error Correction (DDEC), block B in Section 2.1, and Outliers Removal (OR), blocks C and D in Section 2.1. The first provided an improvement of up to 3.32 cm (R2), and the second an improvement of up to 1.08 cm (R3). Their combination gave an improvement of up to 4.40 cm (R2). The correction improved the obtained position by 2% to 5% for OR and by as much as 20% for the DDEC+OR combination. This made it possible to approach the accuracy declared by the manufacturer for determining the distance from one reference point. It is worth mentioning that the multilateration process also introduces an error: the average for all series after DDEC+OR correction was 16.42 cm, which is 2.84 cm better than for the raw data and only 6.42 cm away from the manufacturer's accuracy for one distance at a speed of up to 5 m/s.
The next stage of the research was filtration with the fundamental filters, i.e. the median, moving average and Savitzky-Golay filters. The RMSE of distance obtained for two exemplary series, R1 and R5, is presented in Figure 8 and Figure 9, respectively. At the lowest speed, the best RMSE result for the fundamental filters is obtained with the moving average filter, for which the error is 0.99 cm lower. In addition, the results improve noticeably as the filter window size increases. The other filters gave smaller improvements: 0.45 cm for the median filter and 0.62 cm for the Savitzky-Golay filter. At this speed, using a filter with a moving window of size 15 generates 32.89 cm of delay (as presented in Section 2.8).
As in the R1 series, the best RMSE result for the fundamental filters at the R5 speed is obtained with the moving average filter, for which the error is 2.93 cm lower. For the median filter, a noticeable improvement is visible at window size 3; later changes with increasing window size are less significant, but they still occur. For the moving average filter, a significant improvement is also visible at window size 3, after which the error decreases almost linearly.
Figure 10 presents the results of the smoothing process (both DDEC and OR) and of the filtering process using the Kalman and Wiener methods, separately for each measurement series. The standard error makes it possible to compare results obtained from data sets with unequal numbers of samples. On average, Kalman filtration gives slightly better results in both cases, with and without the full smoothing process. The percentage error (based on the average value for each speed) for both Kalman and Wiener filtration is approximately 7%. For comparison, Figure 11 presents the average RMSE and MAE values across all measurement series (R1 to R6).
The final stage was to combine the smoothing process (both DDEC and OR), prefiltration with one of three filter types (median, mean or Savitzky-Golay) and, finally, main filtration using the Kalman filter (Figure 12). The best results for all test series were obtained for the moving average combined with Kalman filtration. In addition, the results show that the values for R6 dropped by up to 50%. The other filtration methods also improved the results, but only the combination of the Kalman filter with moving average pre-filtration gives a noticeable improvement for all measurement series. For the best filtration process (DDEC and OR smoothing, pre-filtration by moving average and main filtration by the Kalman filter), the 10th to 90th percentiles of position error for R4 and R6 are presented in Figure 13 and Figure 14, respectively. For the R4 series, on average about 70% of measurements have a position error lower than half of the maximum error value. For the R6 series, 90% of measurements have a position error lower than half of the maximum error value.
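The percentile summary and the share of measurements below half of the maximum error can be computed as in the following sketch. The function names and the sample data are hypothetical, and the interpolation method for the percentiles is an assumption, since the method used for Figures 13 and 14 is not stated.

```python
from statistics import quantiles


def error_deciles(errors):
    """Return the 10th, 20th, ..., 90th percentiles of the position errors."""
    return quantiles(errors, n=10, method="inclusive")


def fraction_below_half_max(errors):
    """Share of samples whose error is below half of the maximum error."""
    half_max = max(errors) / 2.0
    return sum(1 for e in errors if e < half_max) / len(errors)


if __name__ == "__main__":
    # Hypothetical absolute position errors in cm.
    errors = [1.0, 2.0, 3.0, 4.0, 10.0]
    print(error_deciles(errors))
    print(fraction_below_half_max(errors))
```

A high share of samples below half of the maximum error indicates that the maximum is driven by a few extreme samples rather than by the typical behaviour of the system.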

Conclusions
Based on the research, it can be concluded that objects in urban traffic (up to 60 km/h) can be positioned using the UWB system. However, the existing limitations of this technology must be kept in mind, e.g. the response time is not deterministic.
The results indicate that UWB data for vehicle traffic are susceptible to extreme outliers. These outliers can be corrected using a suitably designed function based on statistical analysis of the acquired samples.
For the MAE, MAX and RMSE metrics, the best filtration results for data from the UWB system were obtained for the combination of DDEC, OR filtration, the Kalman filter and the moving average with window 15. The best improvement, averaged across all speeds, ranged from 8% (MAX) through 17% (MAE, RMSE) up to 20% (SE). The filter window size is an important issue: for higher speeds, filters with window 7 and above gave equally good results, which suggests that using a larger window leads to a longer delay at constant accuracy. The experiments showed that the 10 cm accuracy declared by the manufacturer is reflected in the raw data error for one reference point and a static object. After trilateration, the cumulative error from the four reference points is larger (RMSE of 15.92 cm to 30.58 cm). However, if filtration is used, even for vehicular traffic in urban conditions (60 km/h), a position RMSE of 7.60 cm to 12.50 cm can be obtained, although the delay of the position information has to be taken into consideration.
In the case of an RTLS, a high-frequency positioning subsystem is required. A higher operating frequency of the positioning subsystem reduces the impact of the distance travelled on the total position error, which results from the system error and the shift of the moving object.
Thus, the presented solution confirms the thesis stated in the assumptions of the study and, considering its precision of better than 1 metre at high speed, it can be widely used in smart city or Industry 4.0 applications.