3. Methodology
3.1. Map Page Adaptive Grid Division
A map page is a PDF page that uses graphic objects to express all map content, including elements inside and outside the map. Because maps differ in page size, in vector PDF file size, and in the number of graphic objects on the page, the map page cannot be divided using a fixed number of rows and columns or a fixed distance. This paper therefore adopts an adaptive grid division of the map page based on the PDF map file size. The goal is to obtain a series of PDF tiles whose file sizes fall within a specified threshold range, thereby providing a data source for fast and efficient map display. A planar rectangular coordinate system is established with the bottom left corner of the map page rectangle as the origin, the bottom edge of the map page as the X-axis, and the left edge as the Y-axis. The map page, with width W and height H, is divided into I rows and J columns, giving cell grids of width dx and height dy.
The grid is numbered starting from the lower left corner, with rows running horizontally and columns vertically, and the numbers increase toward the upper right. For example, the grid number (i, j) indicates the grid in the i-th row and the j-th column, where 0 < i ≤ I and 0 < j ≤ J. A schematic diagram of the adaptive grid division of the map page is shown in Figure 1.
The specific calculation method of meshing is as follows:
(1) Roughly calculate the grid side length S based on the PDF map file size, as shown in the following equation:
where:
1) N denotes the PDF file size of the PDF map in megabytes (MB),
2) M signifies the anticipated PDF file size of the resultant PDF tiles after meshing and cropping in MB,
3) W denotes the map page width,
4) H indicates the map page height.
Setting the parameter M to 0.3, corresponding to approximately 300KB, ensures optimal graphic display efficiency.
(2) The number of rows I and columns J equally divided on the map page are calculated by the equation:
Here, ⌊ ⌋ denotes rounding the enclosed floating-point value down to the nearest integer.
(3) Calculate the width dx and height dy of the cell grid, as shown in the following equation:
$$dx = \frac{W}{J}, \qquad dy = \frac{H}{I}$$
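Steps (1) to (3) can be sketched in Python as follows. The formulas in the sketch are assumptions chosen to be consistent with the definitions of N, M, W, H, S, I, J, dx and dy: the page is assumed to split into roughly N/M tiles of equal area, so that S = sqrt(W·H·M/N); I and J are taken as ⌊H/S⌋ and ⌊W/S⌋, clamped to at least 1; and dx = W/J, dy = H/I. The function name `divide_page` is hypothetical.

```python
import math


def divide_page(W, H, N, M=0.3):
    """Return (S, I, J, dx, dy) for a W x H map page stored in an N MB PDF.

    M is the target tile size in MB. All formulas are assumptions that
    follow from treating file size as proportional to page area.
    """
    if min(W, H, N, M) <= 0:
        raise ValueError("W, H, N and M must be positive")
    # Assumed: about N / M tiles, each covering an area of W * H * M / N.
    S = math.sqrt(W * H * M / N)
    # Assumed: round down, but keep at least one row and one column.
    I = max(1, math.floor(H / S))
    J = max(1, math.floor(W / S))
    # Equal division of the page into J columns and I rows.
    dx = W / J
    dy = H / I
    return S, I, J, dx, dy


if __name__ == "__main__":
    print(divide_page(1000, 800, 30.0))
```

Under these assumptions, a 1000 × 800 page stored in a 30 MB file with M = 0.3 gives S ≈ 89.4, I = 8, J = 11, dx ≈ 90.9 and dy = 100.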
3.2. Map Page Graphic Object Cropping
Using the width and height of the unit grid obtained from the grid division, the rectangle of each unit grid is calculated and used as a crop rectangle to crop the graphic objects in the map page. The PDF format supports five types of graphic objects: text objects (Text), path objects (Path), embedded image objects (Images), shading objects (Shading), and external objects (XObject). The graphic objects in a vector PDF map page are mainly path objects, and a path object consists of straight line segments, cubic Bézier curves, or a combination of the two. Owing to space limitations, this paper focuses on the clipping of path objects composed of straight line segments.
For the cell grid with grid number (i, j), the lower left corner of the grid rectangle Rij is ((j − 1)·dx, (i − 1)·dy) and the upper right corner is (j·dx, i·dy). This grid rectangle Rij is used as the crop rectangle to crop all graphic objects in the map page and generate a PDF tile file. The PDF tile file named Cell_i_j.pdf corresponds to the cell grid (i, j). The PDF tile file is generated as follows:
(1) Create a new PDF document that contains only one PDF page, called the tile page, and set the width and height of the tile page to dx and dy, respectively.
(2) Iterate through all graphic objects in the map page, as shown in Figure 2, and use the crop rectangle Rij to perform the following crop processing:
1) If the circumscribed rectangle of the graphic object is outside the crop rectangle Rij, as shown in (a) of Figure 2, no processing is performed.
2) If the circumscribed rectangle of the graphic object lies inside the crop rectangle Rij, as shown in (b) of Figure 2, copy the graphic object as a new graphic object and apply a coordinate translation to it, that is, translate its coordinate origin to the lower left corner point of the crop rectangle Rij, and then add the new graphic object to the tile page.
3) If the circumscribed rectangle of the graphic object intersects the crop rectangle Rij, as shown in (c) of Figure 2, the graphic object is cropped as follows:
As shown in Figure 3, the graphic object is a polyline or a polygon whose path consists only of straight line segments. Such a path can generally be clipped using classic algorithms, such as the Cyrus-Beck, Liang-Barsky, and Nicholl-Lee-Nicholl line clipping algorithms and the Weiler-Atherton and Sutherland-Hodgman polygon clipping algorithms [32,33,34,35,36].
In Figure 3 (a), the path is a polyline. Starting from the start point A and following the direction of the polyline, the entry point into the crop rectangle is found to be 1 and the exit point from the crop rectangle is found to be 2. To construct the clipped path, set the operator of the entry point to “m” and the operator of the exit point to “l”, and keep the vertices that lie between the entry point and its adjacent exit point. If the start point lies inside the crop rectangle, it is used as the first entry point. If the end point lies inside the crop rectangle, it is used as the last exit point. This yields the clipped path “1-B-C-2”.
The path in Figure 3 (b) is a polygon. Starting from the start point A and following the direction of the path, find the entry point 1 at which a polygon edge enters the crop rectangle and the exit point 2 at which it leaves the crop rectangle. If the start point lies inside the crop rectangle, it is taken as the first entry point. Starting from the first entry point and following the direction of the path, each entry point is paired with the next exit point, and the vertices between them are retained. The clipped polygon is thus constructed as “1-B-C-2-1”.
Translate the coordinate origin of the clipped path to the lower left corner point of the crop rectangle Rij to obtain the final clipped path coordinates. Create a new graphic object from these coordinates, copy parameter values such as line width, color, and fill from the original graphic object, and add the new graphic object to the tile page.
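The polyline case can be illustrated with a minimal sketch. It clips each segment parametrically in the style of the Liang-Barsky algorithm, which is one of the classic algorithms listed above and not necessarily the one used in this paper, then chains the surviving pieces into sub-paths and translates them to the tile origin. The function names and the tuple layout `(operator, x, y)` are hypothetical; polygons are not handled here.

```python
def clip_params(p, q, rect):
    """Return (t0, t1) of the part of segment p-q inside rect, or None."""
    xmin, ymin, xmax, ymax = rect
    dx, dy = q[0] - p[0], q[1] - p[1]
    t0, t1 = 0.0, 1.0
    for d, lo, hi in ((dx, xmin - p[0], xmax - p[0]),
                      (dy, ymin - p[1], ymax - p[1])):
        if d == 0:
            if lo > 0 or hi < 0:
                return None  # parallel to this pair of edges and outside
        else:
            ta, tb = lo / d, hi / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return None
    return t0, t1


def point_at(p, q, t):
    """Point on p-q at parameter t; endpoints are returned unchanged."""
    if t == 0.0:
        return p
    if t == 1.0:
        return q
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def clip_polyline(points, rect):
    """Split a polyline into the sub-paths that lie inside rect."""
    paths, current = [], []
    for p, q in zip(points, points[1:]):
        span = clip_params(p, q, rect)
        if span is None:
            if current:
                paths.append(current)
                current = []
            continue
        t0, t1 = span
        if not current:
            current = [point_at(p, q, t0)]   # entry point, or start point inside
        current.append(point_at(p, q, t1))   # interior vertex or exit point
        if t1 < 1.0:                         # the segment leaves the rectangle
            paths.append(current)
            current = []
    if current:
        paths.append(current)
    return paths


def to_tile_path(path, rect):
    """Translate to the tile origin; first point gets 'm', the rest 'l'."""
    ox, oy = rect[0], rect[1]
    return [("m" if k == 0 else "l", x - ox, y - oy)
            for k, (x, y) in enumerate(path)]
```

For a crop rectangle (8, 8, 16, 16) and the polyline (4, 12), (12, 12), (14, 14), (22, 14), the sketch returns one sub-path that starts at the entry point (8, 12), keeps the two interior vertices, and ends at the exit point (16, 14), which matches the “1-B-C-2” pattern.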
(3) Update the tile page and store the PDF document containing the tile page as a PDF tile file Cell_i_j.pdf.
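The crop rectangle of a cell and the three-way test on the circumscribed rectangle of a graphic object can be sketched as follows. This is an illustration only: the function names are hypothetical, rectangles are assumed to be `(x_min, y_min, x_max, y_max)` tuples in page coordinates, and treating a rectangle that merely touches the crop rectangle as intersecting is an assumption.

```python
def cell_rect(i, j, dx, dy):
    """Crop rectangle of cell (i, j); i is the 1-based row, j the 1-based column."""
    return ((j - 1) * dx, (i - 1) * dy, j * dx, i * dy)


def classify(bbox, rect):
    """Return 'outside', 'inside' or 'intersects' for bbox relative to rect."""
    bx0, by0, bx1, by1 = bbox
    rx0, ry0, rx1, ry1 = rect
    if bx1 < rx0 or bx0 > rx1 or by1 < ry0 or by0 > ry1:
        return "outside"       # case 1): skip the object
    if bx0 >= rx0 and bx1 <= rx1 and by0 >= ry0 and by1 <= ry1:
        return "inside"        # case 2): copy and translate the object
    return "intersects"        # case 3): clip the object


def tile_name(i, j):
    """File name of the PDF tile for cell (i, j)."""
    return f"Cell_{i}_{j}.pdf"
```

With dx = 100 and dy = 50, cell (2, 3) has the crop rectangle (200, 50, 300, 100), and an object whose circumscribed rectangle is (290, 90, 310, 110) is classified as intersecting it.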
3.3. Map Data Organization and Storage
After a single vector published map is meshed and physically cropped, hundreds of PDF tile files are obtained, and a large map collection yields an enormous number of PDF tiles. For example, after the global 1:50,000 topographic vector published maps are meshed and physically cropped, the number of PDF tile files can reach 269 million. At an average size of 200 KB per PDF tile file, the total size of the PDF tile files is about 51.2 TB. If each PDF tile file were stored separately, such a volume of files and data would inevitably cause problems such as low map retrieval efficiency and slow file transfer, which would greatly affect map service capabilities. How to organize and store such a large set of PDF tile files so that they can be displayed and migrated quickly and efficiently is therefore an issue that must be considered. This paper proposes a method for organizing and storing vector published map data based on the Geographical coordinate global Subdivision grid with One-dimension-integer on Two to n-th power (referred to as GeoSOT) [37]. The method keeps the original map projection and establishes a double mapping to GeoSOT grids, based on the spatial extent of the map frame and on the coordinates of the center point of the unit grid corresponding to each PDF tile file, as shown in Figure 4. In the map data storage process, each map is assigned a subdivision grid aggregation code and each PDF tile is assigned a positioning grid code. A location code field and a thumb map field are added to the map product data table, and a positioning code field and a browse map field are added to the PDF tile data table. This strategy facilitates efficient indexing and rapid display of map data.
(1) Map data external overall location coding
For a single map, first determine which GeoSOT grid or grids the map extent falls into. These grids are called the external GeoSOT location reference positioning grids of the map data and identify the location and spatial extent of the map. The location code is built from the codes of these positioning grids: one corner grid of the positioning grids serves as the location vector, and the grid spans along the latitude and longitude directions serve as the scale vector. In the location code, M is the grid span in the latitudinal direction and N is the grid span in the longitudinal direction, and M and N take values in {1, 2, 3, ..., n}. When M = 1 and N = 1, the map extent falls within a single positioning grid.
The overall location coding process of the map data is shown in Figure 5 and is as follows. First, select the subdivision level whose spatial scale is similar to that of the minimum bounding rectangle of the map frame. Then convert the coordinates of the four corner points of this rectangle into grid codes at that level, and determine the location reference positioning grids by checking whether the four codes are the same.
(2) PDF tile positioning coding inside map data
For a single map, the geographical coverage of each PDF tile file within the map is approximately equal and consistent with the average coverage of the cell grid. Calculate the average coverage of the cell grid from the meshing parameters in Section 3.1, select the subdivision level whose spatial scale is similar to this coverage, and convert the coordinates of the center point of the cell grid corresponding to the PDF tile file into a grid code at this level. This grid code is used as the PDF tile positioning code inside the map data.
(3) Map data overall thumb map and PDF tile browsing map
The thumb map is a schematic image of the entire map, drawn directly from the map page of the vector published map at a scale factor of less than 1 (e.g., 0.2). The thumb map is associated with the overall external location code of the map data. The PDF tile browsing map is an image drawn at 100% scale from a PDF tile page inside the map. All PDF tile browsing maps of a map can be stitched together and viewed as an overall image of the map data. The PDF tile browsing map is associated with the PDF tile positioning code inside the map data.
(4) Logical storage of map data
The map data adopts a three-layer storage mode, consisting of the location association layer, the map product layer, and the tile data layer, as depicted in Figure 6.
The location association layer mainly includes the fields RID, grid code, and product ID. RID is the unique ID of the association record, grid code is a GeoSOT grid code contained in the overall external location code of the map data, and product ID is the unique number of the vector published map product.
The map product layer mainly includes the fields product ID, map name, scale, page width, page height, number of rows, number of columns, thumb map, and location code. Product ID is the unique number of the vector published map product, map name is the name of the vector published map product, scale is the scale of the vector published map, page width is the map page width W, page height is the map page height H, number of rows and number of columns are the numbers of rows and columns of the grid division of the map page, thumb map is the binary data of the overall thumb map of the map data, and location code is the overall external location code of the map data.
The tile data layer mainly includes the fields tile ID, product ID, row number, column number, browse map, data stream, and positioning code. Tile ID is the unique number of the PDF tile, product ID is the unique number of the vector published map product, row number and column number are the row and column of the PDF tile in the grid division of the map page, browse map is the binary data of the PDF tile browsing map, data stream is the binary data of the PDF tile file, and positioning code is the PDF tile positioning code.
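One possible relational layout of the three layers is sketched below using SQLite through Python's standard `sqlite3` module. The choice of SQLite, the table and column names, the column types, and the indexes are all assumptions made for illustration; the paper does not specify the database.

```python
import sqlite3

# Hypothetical schema mirroring the three storage layers described above.
SCHEMA = """
CREATE TABLE map_product (
    product_id    INTEGER PRIMARY KEY,
    map_name      TEXT NOT NULL,
    scale         INTEGER,
    page_width    REAL,
    page_height   REAL,
    row_count     INTEGER,
    column_count  INTEGER,
    thumb_map     BLOB,
    location_code TEXT
);
CREATE TABLE location_association (
    rid        INTEGER PRIMARY KEY,
    grid_code  TEXT NOT NULL,
    product_id INTEGER NOT NULL REFERENCES map_product(product_id)
);
CREATE TABLE tile_data (
    tile_id          INTEGER PRIMARY KEY,
    product_id       INTEGER NOT NULL REFERENCES map_product(product_id),
    tile_row         INTEGER,
    tile_col         INTEGER,
    browse_map       BLOB,
    data_stream      BLOB,
    positioning_code TEXT
);
CREATE INDEX idx_assoc_code ON location_association(grid_code);
CREATE INDEX idx_tile_product ON tile_data(product_id);
"""


def create_store(path=":memory:"):
    """Create the three tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping the grid codes in a separate association table lets one map product be linked to several GeoSOT grids, which is what the area-based retrieval relies on.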
3.4. Map Data Query and Retrieval
Map data can be queried and retrieved mainly in two ways: by product-related information and by area. To retrieve a map product by product-related information, directly query the map product layer data table and return the matching map product ID and all of its map data. Retrieval by area is a method for quickly extracting the map data of an area based on the GeoSOT grid. The details are as follows:
According to the minimum bounding latitude-longitude rectangle of the input query area, select the subdivision level with a similar spatial scale and convert the query area into a grid code. The grid code uses quaternary one-dimensional coding and starts with “G”. The retrieval operation only needs to compare this grid code with the code items in the data table of the location association layer in order to find the codes of the parent, child, and grandchild grids of the input grid code, as follows:
Compare the two codes digit by digit from left to right. If any digit differs, discard the item and move on to the next code item. If all digits match up to the last digit of the shorter of the two codes, record the code item. Depending on the matching result, one of the following two processing methods applies:
(1) When the grid code in the data table is longer than the grid code of the query area, use the matched code item to extract the map data associated with its product ID. The map data includes all the PDF tiles into which the map page was divided.
(2) When the grid code in the data table is equal to or shorter than the grid code of the query area, continue by querying all tile positioning codes corresponding to the product ID in the tile data layer data table. If the codes match, extract all matching PDF tiles; in this case only a local extent of the map matches the input query area.
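The digit-by-digit comparison amounts to a prefix test on the two code strings. A minimal sketch follows; the function name, the return labels, and the sample code strings are hypothetical, and real GeoSOT codes may carry additional structure that is ignored here.

```python
def match_code(table_code, query_code):
    """Compare a stored grid code with the query grid code.

    Returns None if a digit differs, 'whole_map' for case (1) and
    'check_tiles' for case (2).
    """
    n = min(len(table_code), len(query_code))
    if table_code[:n] != query_code[:n]:
        return None              # a digit differs: discard this item
    if len(table_code) > len(query_code):
        return "whole_map"       # case (1): return all tiles of the map
    return "check_tiles"         # case (2): match tile positioning codes next


def search(rows, query_code):
    """rows: iterable of (grid_code, product_id). Returns {product_id: action}."""
    hits = {}
    for grid_code, product_id in rows:
        action = match_code(grid_code, query_code)
        if action is not None:
            hits[product_id] = action
    return hits
```

For example, with the made-up query code `G001`, a stored code `G00123` falls under case (1), a stored code `G0` falls under case (2), and a stored code `G002` is discarded.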
3.5. Map Data Display Initialization
Two common trigger conditions for map data display are direct display of a single map according to the product ID and display of the map according to a specified area. The main purpose of the map display initialization is to establish the mapping relationship between the screen display window, the map page and the PDF tile. The map display initialization is slightly different for different trigger conditions.
(1) Initialization of full map display of a single map
In the map product layer data table, parameters such as page width, page height, number of rows, number of columns, and thumb map are obtained based on the product ID to initialize the full map display. The process, outlined in Figure 7, is as follows:
1) Fit the map page rectangle into the screen display window rectangle so that the centers of the two rectangles coincide, thereby establishing the transformation between the PDF user coordinate system (X, Y) and the screen coordinate system.
2) Calculate the pixel width and height of the corresponding page display rectangle of the map page rectangle in the screen coordinate system.
3) Compare the size of the thumb map with that of the page display rectangle:
① If the thumb map is larger than the page display rectangle, directly zoom the thumb map to the position of the page display rectangle on the screen.
② If the thumb map is smaller, retrieve the PDF tile browsing map of all PDF tiles from the tile data layer data table based on the product ID. Stitch a large page view according to the position of the tiles in the grid division and zoom the page view to the position of the page display rectangle on the screen.
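The fitting step and the choice of the initial data source can be sketched as follows. The sketch assumes a uniform scale that preserves the aspect ratio, a screen coordinate system with its origin at the top left and the y-axis pointing down, and that "larger" means at least as large in both dimensions. All function names are hypothetical.

```python
def fit_page(W, H, win_w, win_h):
    """Return (scale, left, top, width, height) of the page display rectangle."""
    scale = min(win_w / W, win_h / H)     # largest scale at which the page fits
    disp_w, disp_h = W * scale, H * scale
    left = (win_w - disp_w) / 2           # center the page in the window
    top = (win_h - disp_h) / 2
    return scale, left, top, disp_w, disp_h


def page_to_screen(x, y, H, scale, left, top):
    """Map a PDF user-space point (y up) to screen pixels (y down)."""
    return left + x * scale, top + (H - y) * scale


def initial_source(thumb_size, display_size):
    """Pick the thumb map if it is large enough, else the stitched browse maps."""
    if thumb_size[0] >= display_size[0] and thumb_size[1] >= display_size[1]:
        return "thumb_map"
    return "stitched_browse_maps"
```

For a 1000 × 500 page in a 500 × 400 window, the scale is 0.5 and the page display rectangle is 500 × 250 pixels with its top edge 75 pixels below the top of the window.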
(2) Initialize the display of the specified area range
Use the map data query and retrieval method to obtain the map data in the specified area. If the results include multiple product IDs, display one of the maps based on interactive selection, and display the PDF tile data of the specified area within the screen display window, as shown in Figure 8, as follows:
1) Extract the minimum bounding rectangle of the PDF tiles in the search results, fit this rectangle into the screen display window rectangle so that the centers of the two rectangles coincide, and thereby establish the transformation between the PDF user coordinate system (X, Y) and the screen coordinate system.
2) Recalculate which cell grids fall within the screen display window and obtain their PDF tile data.
3) For each PDF tile in the screen display window, calculate the pixel coordinates, in the screen coordinate system, of the grid rectangle where the PDF tile is located, referred to as the display grid, and compare it with the PDF tile view (the PDF tile browsing map) as follows:
① If the display grid is smaller than the PDF tile view, directly zoom the PDF tile view to the position of the display grid.
② If the display grid is larger, use the PDF rendering engine to draw the PDF tiles into a display grid size and display the image at the grid’s position.
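Step 2), finding the cell grids that fall within the display window, can be sketched as below. The window is assumed to be given as a rectangle in page coordinates, cells that only touch the window along an edge are excluded, and the function name is hypothetical.

```python
import math


def visible_cells(view, dx, dy, I, J):
    """Return the 1-based (i, j) of cells overlapping view = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = view
    j_min = max(1, math.floor(x0 / dx) + 1)   # first column overlapping the view
    j_max = min(J, math.ceil(x1 / dx))        # last column overlapping the view
    i_min = max(1, math.floor(y0 / dy) + 1)   # first row overlapping the view
    i_max = min(I, math.ceil(y1 / dy))        # last row overlapping the view
    return [(i, j)
            for i in range(i_min, i_max + 1)
            for j in range(j_min, j_max + 1)]
```

With dx = 100, dy = 50 and a window covering (150, 60) to (350, 140), the visible cells are rows 2 to 3 and columns 2 to 4.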
3.6. Map Data Display Control
After initializing the map data display, the user can implement panning, zooming in and out of the map through display control operations. The data sources for map display mainly include thumb map, browse map, and PDF tile. Among them: thumb map is used for the overall display of a single map, that is, to provide a preview of the map outline; the browse map is limited by resolution, and is used for ordinary map browsing; PDF tiles have vector characteristics and are used for detailed map display. The purpose of map data display control is to formulate a set of strategies for rapid dispatch and efficient display of map data. Simultaneously, the establishment of PDF document object cache and grid image cache aims to enhance map display precision and reduce response times for an improved user experience.
(1) Map pan and zoom operations
Map panning is essentially achieved by adjusting the position of the screen display window rectangle, and map zooming by adjusting its size. During panning, the data source used for screen display remains unchanged. During zooming, the size that a unit grid occupies in the screen display window is calculated in real time and is called the display grid, and the image area that a unit grid occupies in the thumb map is called the thumb grid. By comparing the display grid with the thumb grid and the PDF tile browsing map, the thumb map, the browsing map, or the PDF tile is dynamically selected as the display data source, specifically:
1) If the display grid is not larger than the thumb grid, call the thumb map.
2) If the display grid is not larger than the PDF tile view, and is larger than the thumb grid, call the PDF tile view.
3) If the display grid is larger than the PDF tile view, call the PDF tile.
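The three rules can be written as a small selection function. Sizes are assumed to be `(width, height)` pairs in pixels and "not larger than" is assumed to mean not larger in either dimension; the function name and return labels are hypothetical.

```python
def choose_source(display_grid, thumb_grid, tile_view):
    """Select the display data source from three (width, height) sizes."""
    def not_larger(a, b):
        return a[0] <= b[0] and a[1] <= b[1]

    if not_larger(display_grid, thumb_grid):
        return "thumb_map"    # rule 1)
    if not_larger(display_grid, tile_view):
        return "browse_map"   # rule 2)
    return "pdf_tile"         # rule 3)
```

With a 32 × 32 thumb grid and a 256 × 256 PDF tile view, a 20 × 20 display grid selects the thumb map, a 100 × 100 display grid selects the browse map, and a 400 × 400 display grid selects the PDF tile.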
(2) Create PDF document object cache
In the map display process, when PDF tiles are used as the display data source, a PDF document object cache needs to be established in order to shorten the screen display response time and improve the user experience, as shown in Figure 9. The specific process is as follows:
1) Create a reference rectangle centered on the screen display window, forming a nine-square grid.
2) Taking the minimum bounding rectangle of the nine-square grid as a reference, calculate the unit grids that fall into the nine-square grid.
3) Sort the unit grids in the nine-square grid by their distance from the center of the nine-square grid, from nearest to farthest.
4) In this order, check one by one whether the PDF document object corresponding to each unit grid exists in the PDF document object cache. If it does not exist, extract the PDF tile data stream from the tile data layer data table, parse it into a PDF document object, and store it in the PDF document object cache.
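The four steps can be sketched as follows. The sketch assumes that the nine-square grid is a 3 × 3 block of window-sized rectangles with the display window in the middle, that the window is given in page coordinates, and that distance is measured from the center of each unit grid. `load_tile` and `parse_pdf` are hypothetical callables standing in for the database read and the PDF parser.

```python
import math


def nine_grid_cells(window, dx, dy, I, J):
    """Cells overlapping the 3 x 3 block around window, nearest first."""
    x0, y0, x1, y1 = window
    w, h = x1 - x0, y1 - y0
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    bx0, by0, bx1, by1 = x0 - w, y0 - h, x1 + w, y1 + h   # bounding rectangle
    j_min = max(1, math.floor(bx0 / dx) + 1)
    j_max = min(J, math.ceil(bx1 / dx))
    i_min = max(1, math.floor(by0 / dy) + 1)
    i_max = min(I, math.ceil(by1 / dy))
    cells = [(i, j)
             for i in range(i_min, i_max + 1)
             for j in range(j_min, j_max + 1)]

    def distance(cell):
        i, j = cell
        return math.hypot((j - 0.5) * dx - cx, (i - 0.5) * dy - cy)

    return sorted(cells, key=distance)


def fill_document_cache(cache, cells, load_tile, parse_pdf):
    """Parse and cache the tiles that are not yet in cache, nearest first."""
    for cell in cells:
        if cell not in cache:
            cache[cell] = parse_pdf(load_tile(cell))
```

Processing the unit grids from nearest to farthest means the tiles most likely to be displayed next are parsed first.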
(3) Create a PDF tile image cache
After the PDF document object cache is established, the PDF rendering engine is used to draw the pages in the PDF document objects into images, that is, PDF tile images, which can then be displayed quickly in the screen window. Because drawing a PDF tile image takes a certain amount of time, it directly affects the speed of map display. Therefore, this paper caches the PDF tile images that have already been drawn and identifies them by grid number, so that they can be called directly the next time they are needed. Similarly, due to memory constraints, a threshold needs to be set when creating the PDF tile image cache. When the number of cached PDF tile images exceeds the threshold, some PDF tile images need to be released. The PDF tile images whose unit grids are farthest from the center of the nine-square grid are released first. In addition, when a zoom operation is performed, the PDF tile image cache needs to be completely cleared. According to practical experience, the PDF tile image cache threshold should be at least twice the PDF document object cache threshold.
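A minimal sketch of such an image cache is given below. The class and method names are hypothetical, images are treated as opaque objects, and distance is measured in grid-index units from a center given as a (row, column) pair, which is an assumption that is reasonable when the unit grids are close to square.

```python
import math


class TileImageCache:
    """Cache of rendered tile images keyed by grid number (i, j)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.images = {}

    def get(self, cell):
        return self.images.get(cell)

    def put(self, cell, image, center):
        """Store image, then release the farthest entries above the threshold."""
        self.images[cell] = image
        while len(self.images) > self.threshold:
            farthest = max(
                self.images,
                key=lambda c: math.hypot(c[0] - center[0], c[1] - center[1]),
            )
            del self.images[farthest]

    def clear_on_zoom(self):
        """Rendered images depend on the zoom level, so drop them all."""
        self.images.clear()
```

With a threshold of 2, caching cells (1, 1) and (1, 2) and then cell (5, 5) with the center at (5, 5) releases cell (1, 1), because it is the farthest from the center.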
(4) Map display based on double buffer
Using the PDF document object cache and PDF tile image cache, vector published maps can be quickly displayed on the screen. The specific process is as follows:
1) Create a memory display area compatible with the screen display device.
2) Calculate the pixel width and height of the rectangle of the unit grid in the screen coordinate system of the screen display window one by one, and create a blank bitmap.
3) Retrieve the PDF tile image corresponding to the cell grid in the PDF tile image cache and copy it to the blank bitmap created in step 2 as a display bitmap.
4) Copy the display bitmap to the corresponding position of the cell grid in the memory display area.
5) Copy the memory display area to the screen display window in real time.
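The double-buffer idea can be illustrated without any graphics library by treating bitmaps as row-major byte arrays with one byte per pixel. This is a simplified sketch: the function names are hypothetical, each tile image is assumed to lie entirely inside the display area, and a real implementation would use the bitmap and device-context facilities of the display platform.

```python
def blit(dest, dest_w, src, src_w, src_h, left, top):
    """Copy a src_w x src_h bitmap into dest with its corner at (left, top)."""
    for row in range(src_h):
        start = (top + row) * dest_w + left
        dest[start:start + src_w] = src[row * src_w:(row + 1) * src_w]


def compose_frame(win_w, win_h, tiles):
    """Draw all tile images into an off-screen buffer and return it.

    tiles: iterable of (left, top, width, height, pixels).
    """
    back = bytearray(win_w * win_h)      # memory display area
    for left, top, w, h, pixels in tiles:
        blit(back, win_w, pixels, w, h, left, top)
    return back


def present(front, back):
    """Copy the finished off-screen buffer to the visible buffer in one step."""
    front[:] = back
```

Because the visible buffer is only updated once the whole frame has been composed off screen, partially drawn frames are never shown.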