Point Cloud Data Retrieval from 3D Geospatial Database for Automated Road Median Extraction

Laser scanning systems make use of Light Detection and Ranging (LiDAR) technology to acquire accurately georeferenced sets of dense 3D point cloud data. The information acquired using these systems produces better knowledge about terrain objects, which are inherently 3D in nature. LiDAR data acquired from mobile, airborne or terrestrial platforms provides several benefits over conventional sources of data acquisition in terms of accuracy, resolution and attributes. However, the large volume and scale of LiDAR data have inhibited the development of automated feature extraction algorithms due to the extensive computational cost involved. Moreover, the heterogeneously distributed point cloud, which represents objects with varying size, point density, holes and complicated structures, poses a great challenge for data processing. Currently, geospatial database systems do not provide a robust solution for efficient storage and accessibility of raw data in a way that data processing could be applied based on optimal spatial extent. In this paper, we present the Global LiDAR and Imagery Mobile Processing Spatial Environment (GLIMPSE) system, which provides a framework for storage, management and integration of 3D LiDAR data acquired from multiple platforms. The system facilitates efficient access to the raw dataset, which is hierarchically represented in a geographically meaningful way. We utilise the GLIMPSE system to automatically extract the road median from an Airborne Laser Scanning (ALS) point cloud. In the first part of this paper, we detail an approach to efficiently retrieve the point cloud data from the GLIMPSE system for a particular geographic area based on user requirements. In the second part, we present an algorithm to automatically extract the road median from the retrieved LiDAR data. The developed road median extraction algorithm utilises the LiDAR elevation and intensity attributes to distinguish the median from the road surface. We successfully tested our algorithms on two road sections consisting of distinct road median types based on concrete and grass-hedge barriers. The use of GLIMPSE improved the efficiency of the road median extraction in terms of fast accessibility to ALS point cloud data for the required road sections. The developed system and its associated algorithms provide a comprehensive solution to the user’s requirement for efficient storage, integration, retrieval and processing of large volumes of LiDAR point cloud data. These findings and knowledge contribute to a more rapid, cost-effective and comprehensive approach to surveying road networks.


polygon and combination of them) and provide a 3D spatial indexing mechanism that enables fast data retrieval.

Several other approaches have been developed for managing large volumes of LiDAR point cloud data. [16] presented a scalable approach to interpolate a grid DEM from a large LiDAR dataset based on quad-tree segmentation. In their approach, the point cloud data was partitioned into a set of quad-tree segments and then each segment was interpolated using points within the segment and its neighbourhood. The approach was tested on 390 million points (around 20 GB), interpolating a DEM in about 53 hours, a dataset that other GIS software suites were unable to process. [17] applied an octree and local KD-tree based approach to manage and visualize large LiDAR datasets, while [18] optimized a workflow for processing airborne LiDAR data within a GIS-based environment. [19] presented a study using spatial extensions in IBM's DB2 database to manage high-resolution airborne LiDAR data. They also experimented with a single partitioned database on a supercomputer resource and a multi-partitioned database across several nodes that function together as a single database engine, in order to deal with large volume datasets. In [20], a method was proposed for distributed data organisation and parallel data retrieval from huge volumes of airborne LiDAR data. The distribution strategy took into account the spatial relationships within the dataset, while an improved data retrieval speed led to fast analysis, visualization and processing of the point cloud. [21] implemented an octree data structure to store and compress 3D point cloud data. They further demonstrated its usage for an easier exchange of file format, fast data visualization and an efficient plane detection algorithm.
[22] presented a point cloud management system based on groups of points that provided a perspective for meta-data, concurrency, integration with other geospatial datasets, filtering and fast processing. The proposed system was tested with several billion points acquired from aerial and terrestrial LiDAR and stereo-vision. With the expansive growth in cloud computing services, there has been an increased deployment of various distributed web-based LiDAR management applications. NSF-funded OpenTopography is one such significant application that provides web-based access to high resolution LiDAR topography data along with derivative products and online processing tools [23]. Another such application is Dielmo's LiDAR-Online, which provides a web-based platform to visualize, access and process LiDAR data [24].

LiDAR has matured into an accurate technology which can be employed for reliable extraction of various features along road networks. The extracted roads can be represented as homogeneous areas or pairs of parallel lines corresponding to edges, depending upon the spatial resolution of the input dataset. The methods developed for segmenting roads from ALS datasets are mostly based on utilizing their attributes to distinguish road areas from other objects. [25] reported their work on the segmentation of ALS data into road and non-road objects based on elevation and intensity attributes, while [26] detected kerbstones based on the small height jumps they cause in the ALS data. Their road extraction results were influenced by the presence of parking areas, private roads and parked cars in the surveyed areas. The integration of high resolution optical imagery or 2D topographic map data with aerial LiDAR data for road extraction has also been reported [27][28][29].

However, the road extraction accuracy might be affected by positioning errors inherent in maps and by occlusion arising from building and tree shadows in optical imagery. In more recent work, [30] presented a road detection approach in which ALS data was filtered to estimate ground points and then road candidates were identified based on the local distribution of the intensity histogram. [31] proposed a method to extract road centrelines using ALS data. Their method was based on filtering the ground points and then estimating road points by applying an optimal intensity threshold. The estimated points were finally refined by removing narrow roads and attached areas to extract the network of road centrelines. [32] detected roads in forested mountainous areas using ALS data. In their approach, a supervised classification was applied to a Digital Terrain Model (DTM) and then a graph was built over candidate regions to locate the roads. Finally, the roads were characterised to estimate their width and slope parameters using an object-based image analysis. Several methods have also been reported for extracting road edges, in particular from MLS data. These works are particularly focused on extracting kerb edges in an urban environment, where there is a sufficient height or slope difference between the road and kerb points [33][34][35][36][37]. In rural conditions, the road is bordered by a grass-soil surface, in which case the edges are not as easily defined by slope or elevation changes alone. The approaches developed for extracting rural road edges from MLS data are based on the integrated use of its elevation, intensity and pulse width attributes, which were utilized to distinguish the road from the grass-soil surface [4,38,39].

Apart from these, several other methods have been proposed for extracting road markings [40,41].

One of the major constraints in the approaches developed for extracting road objects from LiDAR data is the computationally intensive, iterative and time consuming processes involved in them. This is due to the massive size and unorganised nature of LiDAR datasets, which limits their meaningful analysis for extracting relevant information. Such huge volume datasets give rise to significant challenges for data visualization, efficient data analysis and rapid data processing. As a result, users are prevented from exploiting the full range of opportunities that LiDAR data offers. There has been very limited use of any LiDAR data management platform in road feature extraction processes that could have provided spatially optimised accessibility and fast data processing to the users. This, in turn, would have been beneficial in terms of improved efficiency and computational capabilities of automated algorithms. Some SDBMSs offer capabilities for storage, management and retrieval of LiDAR data but fail to support an efficient analysis of such vast datasets. The existing data types and spatial indexing techniques appear to be insufficient to handle large volumes of LiDAR data. There have been very few systems in which LiDAR datasets acquired from terrestrial, mobile and aerial platforms could be integrated into a single data management solution. There is a need for a more robust and comprehensive data management framework that could provide spatially optimised, unrestricted and integrated access to the LiDAR points and their attributes. This would facilitate efficient data analysis and fast data processing in order to extract road features from LiDAR data in an automated and operational way. Towards this goal, we describe the GLIMPSE system in the next section.

Empirical experience with both ALS and MLS geospatial data has shown that the primary obstacles in the processing of these datasets are their considerable size and the inability to easily constrain them based on point attributes. Leading on from this are the preparation and extraction difficulties when using these data for bespoke requirements. For example, in the case of extracting a road median, the process would be significantly constrained by the survey-processing methodology that prevails in industry standard software suites. These suites provide no context for spatial optimisation of the data loaded from many different surveys. Thus, data segmentation for road median detection cannot be easily implemented through an optimal and empirically informed spatial approach.

However, by approaching this problem from a point-cloud fusion and spatial-constraint perspective, it is possible to optimise the LiDAR data being output to algorithms that specialise in feature extraction. This can be achieved through procedures that leverage the power of a platform such as PostGIS and its numerous integrated spatial APIs. The geo-referenced raw LiDAR data is stored in a database where optimised spatial indexes can be generated in order to facilitate efficient querying of the data. Consequently, optimally located LiDAR data, across numerous surveys, can be output in a user-required spatial context. In that case, the road median algorithm can be operated on a target dataset that is reduced relative to the original survey but also has higher point densities, as spatially coincident point clouds can be segmented and fused in the same database operations.
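The idea of constraining and fusing points from several surveys can be illustrated with a minimal sketch. It is not the GLIMPSE or PostGIS implementation: the function names, the in-memory `surveys` dictionary and the bounding-box pre-filter (a stand-in for a database spatial index) are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]   # (x, y, z) in a projected coordinate system
Vertex = Tuple[float, float]


def bounding_box(polygon: Sequence[Vertex]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box of a polygon as (min_x, min_y, max_x, max_y)."""
    xs = [v[0] for v in polygon]
    ys = [v[1] for v in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(x: float, y: float, polygon: Sequence[Vertex]) -> bool:
    """Even-odd ray casting test for a simple polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def retrieve_and_fuse(surveys: Dict[str, List[Point3D]],
                      polygon: Sequence[Vertex]) -> List[Point3D]:
    """Return the points of all surveys that fall inside the constraint polygon.

    The cheap bounding-box check plays the role that a spatial index plays in
    a database (hypothetical simplification); the exact test follows it.
    """
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    fused: List[Point3D] = []
    for points in surveys.values():
        for x, y, z in points:
            if not (min_x <= x <= max_x and min_y <= y <= max_y):
                continue
            if point_in_polygon(x, y, polygon):
                fused.append((x, y, z))
    return fused
```

The fused output is smaller than any full survey but denser inside the polygon wherever surveys overlap, which is the property the paragraph above relies on.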

Towards this objective, a prototype cloud-application called GLIMPSE has been developed that [...] LiDAR data uploaded into the system. The importance of this step is that it gives a spatial context, where optimally located LiDAR data can be segmented and fused for output to a feature extraction processing algorithm, as is the case with the road median extraction. The system enables a user to segment the raw data using a spatial tool and then the processed results are visualised through the GLIMPSE WebGL viewer, as shown in Figure 2.

In Section 3.1, we detail how spatial hierarchies are built and defined in the GLIMPSE system, while in Section 3.2, we describe the optimal approach to data segmentation, fusion and retrieval.

Through these sections, we also detail the cloud-application based User Interface (UI) of the GLIMPSE system, which enables interaction, understanding and visualisation of the platform's objectives. [...] shown in the Google Maps UI as a blue transparent polygon. In Figure 3(a), the coverage of ALS survey [...] polygon creation tools to intersect a planar view of the available data in this area, which is presented in Section 3.2. This is also true for automated processes that can be scripted into the GLIMPSE platform.

The spatial hierarchy procedure has been implemented as a bespoke implementation of the [...]

The first stage in our process is to snap all the LiDAR data to a spatial grid. A sub-step in the first stage involves the application of a sub-sampling threshold to the LiDAR data. This sub-sampling is applied differently depending on the LiDAR data source; in the case of MLS data, areas with high point densities, close to the survey vehicle, are sub-sampled with a higher threshold than areas with lower point densities. The second stage is to generate concave hulls for all the sub-sampled and gridded data. Finally, a spatial union is performed on all the concave hulls for the LiDAR data being processed.

This final stage can take the form of a number of different spatial unions such that the highest accuracy concave hull is a direct 2D spatial representation of a single table of raw LiDAR data. This idea follows through to unions that define all the data in different spatial configurations: local, regional, national, etc. The spatial union can also be applied in a number of other ways, such as modelling the data using the survey-based approach that exists in typical commercial software suites.
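The first stage of this pipeline (grid snapping followed by density-dependent sub-sampling) can be sketched as below. This is an illustrative assumption rather than the GLIMPSE implementation: the function names, the per-cell cap `max_points_per_cell` and the even-stride thinning rule are hypothetical, and the concave hull and spatial union stages are deliberately left to the spatial database.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]
CellKey = Tuple[int, int]


def snap_to_grid(points: List[Point3D], cell_size: float) -> Dict[CellKey, List[Point3D]]:
    """Group points by the grid cell (column, row) that contains them."""
    cells: Dict[CellKey, List[Point3D]] = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y, z))
    return dict(cells)


def subsample_cells(cells: Dict[CellKey, List[Point3D]],
                    max_points_per_cell: int) -> Dict[CellKey, List[Point3D]]:
    """Thin each cell to at most max_points_per_cell points.

    A fixed cap thins dense cells (for example, MLS points close to the
    survey vehicle) more aggressively than sparse ones. This is one possible
    reading of the density-dependent threshold described in the text.
    """
    thinned: Dict[CellKey, List[Point3D]] = {}
    for key, pts in cells.items():
        if len(pts) <= max_points_per_cell:
            thinned[key] = list(pts)
        else:
            stride = len(pts) / max_points_per_cell
            thinned[key] = [pts[int(i * stride)] for i in range(max_points_per_cell)]
    return thinned
```

The occupied cell keys and the thinned points would then be the input to the hull generation stage, so the cost of building hulls depends on the grid resolution rather than on the raw point count.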

Having approached each stage of this framework pipeline with a spatial constraint perspective to the fore, it is possible at this stage to optimize the LiDAR data being output. This can be achieved, once again, through procedures that leverage the power of the PostGIS platform through its numerous integrated spatial functionalities. Due to the spatial-indexing and spatial-hierarchies that have been [...] LiDAR. Consequently, this could be an automated processing algorithm, such as the road median extraction algorithm, where subsets of LiDAR data can be spatially optimized and fused such that the algorithm handles only relevant LiDAR data in the system. In Figure 3, we can see a sample of the GLIMPSE UI which highlights how this can happen. Alternatively, this example use-case could just as easily be a LiDAR awareness approach where LiDAR data can be quickly and easily segmented for a user to view the outputs such that they are suitable for a given process or requirement.

In this example, both point and polygon spatial-constraint geometries can be created by a user; polygons have been created in this case, as can be seen in Figure 4. This operation provides [...] logically extend. In the next section, we describe our point cloud retrieval and road median extraction algorithms.

We developed algorithms to efficiently retrieve ALS data from the GLIMPSE system based on a user-specified spatial extent and then to automatically extract the road median from the retrieved data. The use of the GLIMPSE system facilitates the segmentation and fast retrieval of point cloud data for a particular geographical area. In the first part, we describe a method to efficiently retrieve ALS data from the [...]

Our point cloud retrieval algorithm is developed based on the assumption that a polygon spatial tool in the GLIMPSE system can be utilised to segment the ALS data, as shown in Figure 6. The [...] LiDAR data for extracting the road median. The process of estimating road surface points using road polylines is shown in Figure 9.

The road sections have highways crossing above them at some locations, which need to be removed in order to get a correct estimation of the road median. In Step 2 of our algorithm, these crossing highways are removed based on the frequency distribution of the elevation values obtained from the road surface LiDAR points. We assume that along the road section, a large number of LiDAR points will belong to its surface, while in comparison fewer points will correspond to any highway crossing [...] are retained for further processing. The value of the parameter is estimated empirically and fixed for all the road sections in such a way that it is useful in removing the crossing highways. In this way, crossing highways above the road sections are removed by detecting the road surface points with maximum elevation frequency. An example of removing a crossing highway above the road section is shown in Figure 10.

In the second process, we group cells into objects in the dilated image using connectivity. If a cell has a value of 1 then it is connected to the cells whose values are 1 and are directly above, below, left or right of that cell. We calculate the length and average width values of each object in the dilated image.
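The grouping of cells by 4-connectivity described above can be sketched as follows. The text does not preserve how length and average width are measured, so the definitions used here (length as the longer side of the object's bounding box, average width as cell count divided by that length) are assumptions, as are the function names.

```python
from collections import deque
from typing import List, Tuple

Cell = Tuple[int, int]   # (row, column)


def label_objects(image: List[List[int]]) -> List[List[Cell]]:
    """Group cells with value 1 into objects using 4-connectivity."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects: List[List[Cell]] = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells: List[Cell] = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                # Neighbours directly above, below, left and right only.
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            objects.append(cells)
    return objects


def length_and_average_width(cells: List[Cell], cell_size: float) -> Tuple[float, float]:
    """Assumed measures: bounding-box long side, and area divided by that side."""
    rs = [r for r, _ in cells]
    cs = [c for _, c in cells]
    length_cells = max(max(rs) - min(rs) + 1, max(cs) - min(cs) + 1)
    length = length_cells * cell_size
    average_width = len(cells) / length_cells * cell_size
    return length, average_width
```

Diagonal neighbours are deliberately not joined, so two blobs that touch only at a corner remain separate objects and are measured independently.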

Objects whose length and average width values are less than the length threshold, T_L, and the width threshold, T_W, are considered as other road surface elements and are removed from the image, as shown in Figure 12.

In the final Step 6 of our algorithm, we extract the 3D road median points from the 2D output. The original 3D LiDAR points which are contained within the 2D road median cell boundaries are extracted.
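The final 2D-to-3D step can be sketched as a lookup of each original point against the set of raster cells labelled as median. The raster conventions used here (an origin at the lower-left corner, rows increasing with y, columns increasing with x) and the function name are assumptions for illustration.

```python
import math
from typing import List, Set, Tuple

Point3D = Tuple[float, float, float]
Cell = Tuple[int, int]   # (row, column)


def extract_median_points(points: List[Point3D],
                          median_cells: Set[Cell],
                          origin: Tuple[float, float],
                          cell_size: float) -> List[Point3D]:
    """Keep the original 3D points whose (x, y) falls inside a median cell."""
    x0, y0 = origin
    selected: List[Point3D] = []
    for x, y, z in points:
        col = math.floor((x - x0) / cell_size)
        row = math.floor((y - y0) / cell_size)
        if (row, col) in median_cells:
            selected.append((x, y, z))
    return selected
```

Because the lookup uses the original coordinates, the extracted points keep their full 3D precision; the raster only decides membership.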

In the next section, we present the test results of our algorithm on the road sections.

The dataset acquired using an ALS system along dual carriageway roads in Ireland was uploaded into the GLIMPSE system. In the first part, we applied our point cloud retrieval algorithm to efficiently access ALS data from the GLIMPSE system based on our specific requirements. Acting as the user, we explored the road polylines imported into the GLIMPSE system and clicked points near two preferred road sections of dual carriageway. In each section, the other input parameters were provided as l_c = 50 m, w_c = 50 m and l_t = 1000 m. The first 1 km section consisted of a road median with a narrow concrete barrier, as shown in Figure 13(a), while the second 1 km section contained a road median with a wide grass-hedge concrete barrier, as shown in Figure 13(b).
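To make the role of these inputs concrete, the sketch below generates consecutive retrieval polygons along a road polyline. The precise definitions of l_c, w_c and l_t are not preserved in the text above, so this sketch assumes that l_c is the extent of each polygon across the road, w_c is its extent along the road, and l_t is the total road length to cover. The function names and the rectangle construction are hypothetical and are not the authors' algorithm.

```python
import math
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float]


def stations_along(polyline: Sequence[Vertex], spacing: float,
                   total_length: float) -> List[Vertex]:
    """Points at every `spacing` metres along the polyline, up to total_length."""
    stations = [polyline[0]]
    travelled = 0.0
    target = spacing
    for (x1, y1), (x2, y2) in zip(polyline, polyline[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        while seg > 0 and target <= travelled + seg and target <= total_length + 1e-9:
            t = (target - travelled) / seg
            stations.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            target += spacing
        travelled += seg
    return stations


def retrieval_polygons(polyline: Sequence[Vertex], l_c: float, w_c: float,
                       l_t: float) -> List[List[Vertex]]:
    """Rectangles centred on the road: w_c along it, l_c across it (assumed roles)."""
    stations = stations_along(polyline, w_c, l_t)
    half = l_c / 2.0
    polygons: List[List[Vertex]] = []
    for (ax, ay), (bx, by) in zip(stations, stations[1:]):
        d = math.hypot(bx - ax, by - ay)
        if d == 0:
            continue  # skip degenerate station pairs
        nx, ny = -(by - ay) / d, (bx - ax) / d   # unit normal to the road direction
        polygons.append([
            (ax + nx * half, ay + ny * half),
            (bx + nx * half, by + ny * half),
            (bx - nx * half, by - ny * half),
            (ax - nx * half, ay - ny * half),
        ])
    return polygons
```

Under these assumptions, the parameter values quoted above would yield 20 polygons of 50 m by 50 m for each 1 km section, and each polygon could be issued as a separate spatial query.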

In the second part, the accessed data files were batch processed to extract the road median along the tested road sections using empirically estimated parameters. The parameter used in Step 2 was set to 4 m, which was found to be useful in removing the highways crossing above the road sections.
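A minimal sketch of removing crossing highways by elevation frequency is given below. The exact retention rule is not preserved in the text, so this sketch assumes that the 4 m value acts as an elevation tolerance around the most frequent elevation. The histogram bin size of 0.5 m and the function name are further assumptions.

```python
import math
from collections import Counter
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def remove_crossing_structures(points: List[Point3D], tolerance: float,
                               bin_size: float = 0.5) -> List[Point3D]:
    """Retain points whose elevation lies near the most frequent elevation.

    Elevations are binned, the fullest bin is taken as the road surface level,
    and points further than `tolerance` from that level are discarded
    (assumed rule, for illustration).
    """
    bins = Counter(math.floor(z / bin_size) for _, _, z in points)
    modal_bin, _ = bins.most_common(1)[0]
    modal_elevation = (modal_bin + 0.5) * bin_size
    return [p for p in points if abs(p[2] - modal_elevation) <= tolerance]
```

This rule depends on the road surface contributing more points than any structure above it; if a crossing structure dominated the histogram, the road beneath it would be the part discarded.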

The threshold parameters, T_elev and T_int, were applied as 200 and 100 respectively, to get an initial [...]

Our algorithms were able to successfully extract the road median in the tested road sections based on a specified spatial extent. Acting as the user, we specified input parameters in the point cloud retrieval [...] incidence angle of the laser pulse, the distance from the laser scanner and the illuminated surface.

The normalisation of the intensity attribute with respect to these factors will provide true reflectance values from the targeted objects. The use of such normalised intensity values in our algorithm will improve the quality of the extracted road median. The tested road sections also had highways crossing above them at some locations, which were efficiently removed based on frequency distribution analysis of elevation values. This analysis was based on the assumption that a larger number of points will belong to the road surface than to crossing highways. However, in the case of a wider crossing highway, a large number of LiDAR points will belong to it, and this will lead to the removal of the road section beneath it. In the morphological operations, we applied different values of the width threshold due to the different widths of the median in the tested road sections.

We analysed the computational performance of our two algorithms by estimating the total time taken to retrieve the data from the GLIMPSE system and then to finally extract the road median.

For the first and second road sections, each 1 km long with a 50 m cross-section length, it took approximately 37 minutes and 119 minutes respectively to retrieve and process the dataset. This analysis was performed on a computer with an Intel Core i5-6600 processor @ 3.30 GHz, 8 GB RAM and a 64-bit operating system. The second road section contained a wider road median, due to which it took more time to process than the first road section.

In this paper, we presented the GLIMPSE system that provides a comprehensive framework [...] representation of large scale data in a geographically meaningful way, which enables users to spatially segment the data based on their requirements. The use of such a system provides a framework for the fast retrieval of point cloud data based on an optimal spatial extent which, in turn, improves the efficiency of automated algorithms in terms of computational cost and reduced processing. We developed methods to efficiently retrieve point clouds from the GLIMPSE system and to automatically extract the road median from the retrieved data. In our point cloud retrieval algorithm, the road vector polylines are used as a secondary data source to estimate the parameters required for spatial segmentation of the ALS point cloud in GLIMPSE based on user input information. In the case of an MLS dataset, the navigation points, which are usually procured during the data acquisition process, can also be utilised as a secondary data source. The GLIMPSE system provides a single platform for storage, integration, retrieval and processing of large volumes of LiDAR point cloud data in a computationally efficient manner.

Our road median extraction algorithm was developed based on the assumption that LiDAR data provides elevation and intensity values, which can be utilised to distinguish the median. The [...]

In future work, the road median algorithm will be tested on road sections with more distinct medians. The elevation and intensity threshold values applied to get an initial estimate of the road median were estimated empirically. However, a more robust and automated approach will need to be developed in order to obtain the threshold values. We will also focus on the normalisation of the intensity attribute, which will improve the quality of road median extraction. The size of the input data sections and the cell size of the raster surfaces impact the efficiency of our algorithm in terms of computational cost.

These parameters need to be analysed to find their optimal values. Future work will also focus on the integration of LiDAR and imagery data acquired from multiple platforms in the GLIMPSE system, which will then be utilised to develop other road feature extraction algorithms.