A Precision Evaluation Method for Remote Sensing Data Sampling Based on Hexagon Discrete Grid

With the rapid development of earth observation, satellite navigation, mobile communication and other technologies, the order of magnitude of the spatial data we acquire and accumulate is increasing, and higher requirements are put forward for the application and storage of spatial data. Under this circumstance, a new form of spatial data organization emerged-the global discrete grid. This form of data management can be used for the efficient storage and application of large-scale global spatial data, which is a digital multi-resolution the geo-reference model that helps to establish a new model of data association and fusion. It is expected to make up for the shortcomings in the organization, processing and application of current spatial data. There are different types of grid system according to the grid division form, including global discrete grids with equal latitude and longitude, global discrete grids with variable latitude and longitude, and global discrete grids based on regular polyhedrons. However, there is no accuracy evaluation index system for remote sensing images expressed on the global discrete grid to solve this problem. This paper is dedicated to finding a suitable way to express remote sensing data on discrete grids, and establishing a suitable accuracy evaluation system for modeling remote sensing data based on hexagonal grids to evaluate modeling accuracy. The results show that this accuracy evaluation method can evaluate and analyze remote sensing data based on hexagonal grids from multiple levels, and the comprehensive similarity coefficient of the images before and after conversion is greater than 98%, which further proves that the availability hexagonal grid-based remote sensing data of remote sensing images. And among the three sampling methods, the image obtained by the nearest interpolation sampling method has the highest correlation with the original image.


Introduction
With the rapid development of earth observation, satellite navigation, mobile communication and other technologies, with the construction and application of smart earth and smart cities, the explosive growth of the amount of data we obtain has updated from the current GB (Giga Byte) and TB (Tera Byte) level to EB (Exa Byte) or even ZB (Zetta Byte) level, which has brought challenges to the integration and utilization of spatial data [1,2]. However, the application and storage methods of spatial data have not been developed to the corresponding level, which means a large amount of data is just stored but not fully utilized. Researchers have proposed the concept of Global Discrete Grid System, which is a multi-resolution hierarchical structure that uses a specific method to subdivide the surface of the earth infinitely [3,4]. Compared with the traditional spatial data organization and application mode, the global discrete grid system has hierarchical and continuity, which can avoid the deformation caused by direct projection. It is suitable for solving large-scale issues, and supports efficient processing of multi-resolution in structure [5,6].

Materials and Methods
In the process of using a hexagonal grid to reorganize remote sensing images requires hexagonal grid pixels to represent rectangular pixels, the key step is image sampling. This study determines the level of the hexagonal grid to which the remote sensing data belongs based on the spatial resolution of the selected remote sensing data, and then sample the original remote sensing image into hexagonal grid pixels according to the corresponding relationship between the grid unit and the pixel unit.

Conversion of original remote sensing image to hexagonal grid data
This study chose the planar aperture 4 hexagon grid system as the framework to transform the original remote sensing data. Because in this system the direction of the hexagon stays consistent and here is the hexagonal grid planar structure: (a) (b) Figure 1. This is a figure of the hexagonal grid planar structure. (a) shows that the ratio of the area of two adjacent layers is 4;The ratio of the side lengths of the two layers is 2; (b) shows the unit structure between different layers, and all the unit directions are the same.
We used the concept of "point grid" to establish a planar hexagonal grid based on an orthogonal coordinate system. As shown in Figure 2, the number of rows and columns (i, j) where the grid is located is coded, and (0, 0) is the origin of the coordinates. x y Figure 2. Planar hexagonal grid system based on orthogonal coordinate system As shown in Figure 2, when a regular hexagon is tiled, the odd and even columns have a certain offset. If the side length of the hexagonal unit is l, then the offset between odd rows and even rows is (i-1,j+1) Figure 3. The misalignment relationship between odd and even rows, take the hexagonal unit (i, j) as an example. (a) Schematic diagram of grid structure in odd rows;(b) Schematic diagram of grid structure in even rows According to the idea of "rasterization" and to simplify calculations, the attributes of the hexagonal grid cells and the calculation and analysis between grids are replaced by the corresponding grid center points, so that each grid in the grid structure is obtained. The geographic coordinate value of the center point of the grid is a key step in the application of the hexagonal grid structure to remote sensing data. According to the above analysis, the corresponding relationship between the planar hexagonal grid and geographic coordinates can be defined as: Where ( , ) and ( , ) are the geographical coordinates of the grid center point with grid coordinates of (i, j); is the side length of the hexagonal grid cell.
The appropriate grid level needs to be selected for modeling and sampling. Since the size of the grid cannot be exactly the same as the size of each pixel unit of the remote sensing image, in actual processing, the level we choose is the grid area that is slightly smaller than the area represented by each pixel of the remote sensing image. For example, each pixel of the remote sensing image represents 4 square meters, the size of a single grid on the 22nd layer is about 3.86 square meters, and the size of a single grid on the 21st layer is about 15.26 square meters. According to this standard, Landsat 8 images with a resolution of 30m can be stored on the 18th layer, and images with a resolution of about 1km can be stored on the 13th layer. On the selected level, the correspondence between each grid and the raster data in the remote sensing image should be determined. Since the area of each grid of the grid that we selected to store a certain remote sensing image is smaller than a single pixel of the remote sensing image, each grid can store the pixel value of the grid with the largest area, so as to minimize the loss of the original remote sensing image data. The basic process is as follows: It is the first step to determine the level of the hexagonal grid corresponding to the remote sensing image pixel according to the spatial resolution of the remote sensing image, and the corresponding discrete grid level should meet the requirement of 1 ≤ ≤ 2 . Then, based on the correspondence between geographic coordinates and hexagonal grid coding, a conversion model between remote sensing data and hexagonal grid data is established.
Hexagonal grids of different levels have corresponding areas of hexagonal elements. The best division area when dividing the grid is: Where SR is the area of remote sensing image, MR is the size of remote sensing image, m is preset size of remote sensing image after subdivision. When converting a rectangular grid into a hexagonal grid, the spatial resolution of the hexagonal grid (the size of the cells) can be determined according to the spatial resolution of the rectangular grid. Generally, the resolution of a rectangular grid corresponds to the hexagonal grid between two adjacent layers, rather than the area of a certain layer of hexagonal units. In this way, it is necessary to determine which layer should be converted to and the six sides of the corresponding layer. The absolute value of the difference between the area of the rectangular element and the area of the rectangular element should be the smallest.
Then, on a two-dimensional plane, fill the specific study area with hexagonal grid, and resample the pixels of the remote sensing image with the following resampling methods to determine the attribute value of the hexagonal grid unit.
(1) The nearest neighbor interpolation This method is aimed at a two-dimensional image, and takes the gray value of the nearest neighbor among the four neighboring pixels around the point to be sampled as the gray value of the point. Although this algorithm is simple to calculate, it only uses the gray value of the pixel that has the greatest impact on the sampling point (that is, the nearest) as the value of the point, and does not consider the influence (correlation) of other adjacent pixels, so resample the gray value of the latter image has obvious discontinuities, and the image distortion cannot be ignored.
(2) Bilinear interpolation As an improvement to the nearest neighbor method, this method is using the gray values of the surrounding 4 neighboring points to perform linear interpolation in two directions to obtain the gray value of the point to be sampled. That is, the corresponding weight is determined according to the distance between the point to be sampled and the adjacent point to calculate the gray value of the point to be sampled. According to Figure 5, the pixel value of P is determined by A, B, C, D by formula (3), and the attribute value of red-edged hexagon unit is determined by the value of P.
Compared with the nearest neighbor interpolation, the bilinear interpolation method takes the influence of the four direct neighbors around the sampling point on the sampling point into consideration, which basically overcomes the shortcomings of the former gray scale discontinuity, but the amount of calculation increased. However, because this method only considers the influence of the gray value of the four direct adjacent points, and does not consider the influence of the gray value change rate between each adjacent point, it has the nature of a low-pass filter to make the high-frequency components of the zoomed image with loss, the outline of the image becomes blurry. Compared with the original image, the image scaled by this method still has the problem of image quality degradation and accuracy reduction due to improper consideration of the calculation model.
(3) Cubic convolution method The cubic convolution method not only considers the influence of the gray value of the four direct neighboring points, but also the influence of the gray value change rate between the neighboring points. It uses 16 points to be sampled. The gray values of pixels in the larger surrounding neighborhood are interpolated three times. Three operations can get a zoom effect closer to high-resolution images, but it also leads to a sharp increase in the complexity of operations. This algorithm needs to select interpolation basis functions to fit the data.
The mathematical expression is as follows: The cubic convolution interpolation formula is as follows: Where A, B, and C are all matrices, and their forms are as follows: Where ( , ) represents the DN value of the pixel ( , ) in the original image. The sampled grid data needs to be stored as the coordinates of the center of the hexagon, the direction, size, and attribute values of the hexagon. When this information is read, an image based on hexagonal grid can be reproduced based on the information.

Accuracy evaluation of hexagonal grid data
Since the original remote sensing images are also obtained by sampling and quantization, the image itself has errors, and after the image is interpolated, the error is transferred to the new sampled points through interpolation, so the error transfer needs to be quantified and the hexagonal-based remote sensing image needs to be compared with the original image [30,31] by calculating the loss of information and the error of geographic coordinates in the process of expression on the hexagonal grid. Therefore, it is necessary to establish an appropriate error evaluation model to evaluate the error of the converted data.
The relevant elements involved in the comprehensive evaluation of a certain thing constitute the evaluation element set. A series of indicators used to evaluate the thing constitute the evaluation index set, of which the weights is not the same. The evaluation index set is a mapping of the evaluation elements set [29,32],where there are multiple mapping index sets for an evaluation element set. There are currently two typical standards for the principle of establishing an index system: one is comprehensive, non-overlapping (or redundant), of which indicators are easy to obtain; the second is scientificity, rationality and applicability [13,33]. According to the three criteria for evaluating the accuracy of spatial data conversion: maintaining the conservation of composition information, maintaining the conservation of area information, and maintaining the conservation of regional spatial pattern and morphological information, this research proposes evaluation indicators from three perspectives: basic evaluation indicators, evaluation indicators based on image features, and geometric evaluation index [34][35][36].

Basic evaluation indicators
The basic evaluation index is a set of evaluation index established based on statistical information such as the mean, variance, and covariance between the original remote sensing image and the grid remote sensing image. These indicators are mainly used to evaluate the image composition information before and after conversion, with clear connotation, simple model and convenient evaluation. However, the basic evaluation indicators only focus on the information amount and composition type of the image, and they cannot reflect the image structure and image feature information.
The basic evaluation indicators include Information entropy, Mutual information, Combination entropy, Structural similarity index (SSIM).

Figure 6. The basic evaluation indicators
Information entropy reflects the information richness of the image, and calculating the information entropy of the image before and after conversion can reflect the loss of information during the conversion process [37]. If the amount of change in information entropy before and after the data conversion is small, it means there is not much information during the process. The calculation formula is as follows: The mutual information is an important concept in information theory. It can be used as a measurement of the correlation between two variables, or a measurement of the amount of information contained in another variable. Suppose there are two random variables ( ) and ( ), their marginal probability distributions are A and B, and their joint probability density is ( , ). According to the related concepts of information theory, the mutual information between these two variables is: An image can be regarded as a two-dimensional random variable. The above concept can be easily extended to a two-dimensional space. For the two images A and B before and after conversion, the two images have the same gray level. Let the total gray level of the image be L, ( )and ( ) are the probability densities of the images before and after conversion. The probability density can be obtained by dividing the histogram of the image by the total number of pixels in the image. Therefore, the mutual information of the images before and after conversion can be defined as: The mutual information is an objective indicator that reflects the abundance of information before and after data conversion. The larger its value, the more abundant information the hexagonal grid data can obtain from remote sensing data, and it can accurately evaluate the accuracy of the conversion. Joint entropy is also a very important parameter in information theory. It can be used to measure the correlation between two images and reflect the joint information. The joint entropy of images A and B before and after conversion can be defined as: Where ( , ) represents the joint probability density of the converted image B and the original image A. Generally speaking, the larger the value of the joint entropy of the converted image and the original image, the richer the information contained in the image, so it can be used to evaluate the information richness of the image before and after the conversion.
The combination entropy of two images can be defined as: Where a represents the DN of image A, b represents the DN of image B, ( ) and ( ) represent the gray probability distribution density of A and B respectively, and ( , ) is the joint probability distribution of image A and B.
Structural similarity index establishes an image distortion evaluation model from the three factors of brightness, contrast and structure in terms of identifying things with human eyes. In this evaluation model, the mean is the estimated value of brightness, the standard deviation is the estimated value of image contrast, and the covariance can be used as a measure of the structural similarity of the image using the mean. This index can be used to compare the structural information of the image, thereby analyzing the distortion of the image, and objectively evaluating the image before and after conversion.

Evaluation index based on image construction
The edge structure similarity (ESSIM) can reflect the edge information of the hexagonal grid image obtained from the source image during the conversion process. It is assumed that after the original image A is filtered by Sobel operator, the pixel ( , ) gets the edge intensity ( , ) and its direction ( , ) are: Where Γ , Γ , k , k , σ , σ are adjustable constant.

Normalization of evaluation indicators and calculation of weights
In order to simplify the calculation results, the index value calculated for the images before and after the conversion is calculated, with 1 as the standard. The closer the index is to 1, the higher the similarity of the images before and after the conversion. In order to simplify the basic evaluation index system, we use the analytic hierarchy process to determine the weights of the five major evaluation index systems.
The first layer: the target layer. The purpose of the model is to evaluate the information retention before and after conversion of the hexagonal grid image and the similarity to the original image. Therefore, the degree of similarity is regarded as the predetermined goal of decision-making.
The second layer: the indicator layer. Through literature research, this paper selects five indicators (Information Entropy, Mutual Information, Combination Entropy, SSIM and ESSIM) as indicators to quantify the preservation of image information before and after conversion, the similarity with the original image, and the structural similarity. These five indicators are intermediate evaluation standards.
The third layer: program layer. The objects we analyze are the original remote sensing image and the sampled remote sensing image based on the hexagonal grid, so the basic decision plan can be divided into the original remote sensing image and the hexagonal grid image.
Based on the above problem analysis, we established the following analytic hierarchy model: A (21) According to the mutual influence of the five indicators, paired comparison matrixes are constructed as follows: (b). Calculate eigenvalues and eigenvectors MATLAB is used to find the maximum eigenvalue of matrix A and the corresponding matrix vector = ( 1 , 2 , … ) , and the following formula to normalize u.
= ∑ =0 (23) (c). Consistency inspection Calculate the consistency index CI: Calculate the consistency ratio CR: According to the calculation, we get CR=0.0145<0.1, which means that matrix has already passed the consistency check.
In terms of paired comparison matrixes 1 , 2 , 3 , 4 , 5 , we can find the weight vector of the hierarchical total sorting and perform the consistency test. Through calculation, it can be known that 1 , 2 , 3 , 4 , 5 passed the consistency test.
Then we calculated the total ranking weight and consistency test. According to the calculation, the weights of B on the overall goal are as follows:

Geometric evaluation index
In addition, this study also calculated the change in the DN value of the pixel unit and in line length before and after sampling.
In order to evaluate the credibility of the hexagonal grid image generated, this paper uses the checkpoint method. We randomly select 30 feature points in a scene image as the checkpoints, and use RMSE of pixel value to perform the accuracy evaluation remote sensing image based on the hexagonal grid. The formula is: Where ( = 1,2, … , ) is the pixel value of the checkpoint on the original image; is the pixel value of the remote sensing image based on the hexagonal grid; RMSE is the root mean square error of the pixel value.
In order to evaluate the retention of the geometric features of the image during the image conversion process, this paper performs a refinement operation on the local area of the Landsat 8 images before and after the conversion, and compares the differences between the two.
When refining the same target, the hexagonal refinement algorithm only needs six templates, and each template includes 7 units, while the square refinement requires eight templates, and each template requires 9 units. 2. 0 ( 1) = 1; 3. 2 × 4 × 6 = 0; In order to quantitatively analyze the results of the refinement of linear features, this paper uses the error band to indicate the degree of deviation between the original image refinement results and the hexagonal grid image refinement results.

= ∆ (27)
Where ∆ is the error area between the linear features extracted by the two image formats, and this parameter is the number of grid cells between the two linear features multiplied by the cell area. L is the average length of linear elements extracted by the two image formats, which is determined by the number of grid cells through which the linear elements pass. D is the average error width under this length.

Remote sensing data modeling based on hexagonal grid
The nearest neighbor interpolation, bilinear interpolation, and cubic convolution interpolation methods are used to convert remote sensing images of different resolutions (high, medium, and low resolution respectively, using GF-1, Landsat 8 images as examples) to data based on hexagonal grid.

The original data
Error area

Hexagonal data
Error band The original data D The conversion of remote sensing data to hexagonal grid data includes two stages: the first stage is the construction of the regional hexagonal grid; the second stage is the remote sensing data sampling based on the hexagonal grid. First, it is necessary to determine the four-corner coordinates of a scene of remote sensing data, and determine the scope of grid construction according to the four-corner coordinates. Secondly, the unit size of the hexagonal grid is determined according to the spatial resolution of the remote sensing image. For example, the spatial resolution of Landsat 8 is 30 meters, then according to formula (1) and Table 1, it can be determined that the grid level corresponding to Landsat 8 is 18 layers, and the area of the hexagonal unit of the grid on the 18th layer is 989.657 square meters. After determining the scope of grid construction, grid level, we can fill in the hexagonal grid of the corresponding level within the four corner coordinates of the remote sensing image.
(a) (b) (c) Figure 9. The hexagonal grid images converted from Landsat 8 images in the same area using different resampling methods. (a) is generated by nearest neighbor interpolation, (b) is generated by bilinear interpolation and (c) is generated by bicubic interpolation.
Using different resampling methods to convert remote sensing images in the same area into hexagonal grid images has different effects. Figure 6 shows the interpolation results of Landsat 8 images in the same area (Yellow River Estuary) using different interpolation methods. It can be clearly seen that, compared with nearest neighbor interpolation, bilinear interpolation and bicubic interpolation perform pixel value operations, and the image is smoother. The pixel value of the nearest interpolation has not been calculated, so the original characteristics of the remote sensing image can be better maintained, and the result of linear interpolation is smoother, for example, the boundary of the river channel does not appear jagged. The result of cubic interpolation can reflect more information and image features, but the calculation of a single pixel involves 16 adjacent pixels, which requires more work. The side length ratio of the hexagons between adjacent grid levels is 1:2. As shown in the figure 12, the grid unit side length ratio of the 19th level grid and the 18th level grid is 1:2, the 19th level Grid image can show more information and image details.

Basic evaluation indicators
This study calculated the basic evaluation indicators of Landsat 8 and GF-1 remote sensing image before ang after conversion. The calculation results are shown in Table 5. Comparing the values evaluated by comprehensive indicators before and after conversion of the images of the Landsat 8 and GF-1, it can be seen that the comprehensive evaluation result of the hexagonal image obtained using the nearest neighbor interpolation is closest to 1, Which shows that the hexagonal grid image obtained by nearest neighbor interpolation maintains the most information and structural features compared with the original image. The main reason is that in the process of using the nearest neighbor interpolation, the attribute value of the hexagonal unit has not undergone further calculations, so the characteristics and attributes of the original remote sensing data can be retained to the greatest extent.

Evaluation index based on image feature
(1) Accuracy evaluation for points The following are the results of random points selection of Landsat 8 and GF-1 remote sensing images:

Figure 13
Checkpoints randomly selected on remote sensing images The above is a schematic diagram of randomly selected check points for a scene image to calculate. It can be seen that selecting thirty checkpoints for a scene image can basically cover different types of features, so that degree of retention of the point feature of the hexagonal grid image can be calculated for the original image. The resolution of GF-1 image and Landsat 8 image is not the same. The spatial resolution of Landsat 8 OLI land imager is 30 meters, and the spatial resolution of GF-1 PMS image is 16m. As shown in Table 2, We calculated the RMSE of the pixel value of the checkpoint on images obtained by different sampling methods. Compared with bilinear interpolation and bicubic interpolation, the RMSE of the nearest neighbor interpolate on is the smallest. The reason is that the process of establishing a hexagonal grid is regular, and the nearest neighbor interpolation directly takes the pixel value of the nearest pixel in Euclidean geometry and assigns it to the target grid unit without linear calculation. Therefore, the nearest neighbor interpolation does not have a smooth process, which makes it easier to save the characteristic pixel points and land reflection information of the original pixels.
(2) Accuracy evaluation for the line As shown in the figure below, the sensitivity of the hexagonal grid image to angle changes is better than that of the original image, and it is better than the quadrilateral image in terms of maintaining traits, and can well maintain the curved characteristics of linear features. Hexagonal have consistent direction and better angular resolution than quadrilateral elements. The quadrilateral has two directions, the distance between these two directions is 1 and √2.  According to equation (27), it can be calculated that the width of the error band is 22.3m. The error band between the linear elements extracted from the 19-layer grid image and the current elements extracted from the original image is 17.6m. This shows that the higher the grid image level, the better the image can maintain the geometric characteristics of the original image.

Discussion
The hexagonal grid and the original remote sensing data are two different form of data management and data storage, and there are big differences in pixel units. This study tends to prove that the remote sensing data based on the hexagonal grid has practical applications. Availability requires quantitative indicators to prove the similarity between the hexagonal grid data and the original data. This paper used basic evaluation indicators to prove that the amount of information in the hexagonal grid data is within a controllable range compared to the original remote sensing data. There is inevitable data loss and smoothing, but the amount of lost data is within an acceptable range. This study compared different resampling methods, including nearest neighbor interpolation, linear interpolation and cubic interpolation. The weights calculated in this study are not the only indicator weights. When being used in different application scenarios, other appropriate judgment matrix can be selected according to the focus of remote sensing applications, so as to determine the weights of different indicators accordingly.
The calculation results of the quantization index show that the image obtained by the nearest interpolation is the closest to the original image and retains more information. In addition, based on the differences between hexagons and quadrilaterals, this paper proposed a new algorithm for thinning hexagonal grid images, and further compares the features of linear features obtained by the thinning algorithm. This also shows that the hexagonal grid image is available in remote sensing geometry applications. For example, large area yield estimation land use classification, river extraction, etc. based on hexagonal grid image. In addition, according to the feature that the level of the hexagonal grid can be infinitely divided, the grid can be used to store unstructured emerging spatial big data, including point, line, human activity, luminous etc.

Conclusions
This research brought up and implemented the conversion method from the original remote sensing image to the hexagonal discrete grid image, and is committed to finding a suitable way to express the remote sensing data on the discrete grid, and establish a suitable accuracy evaluation system to evaluate the accuracy of modeling aims at the remote sensing data based on the hexagonal grid. This proved its availability and provides support for the further use of remote sensing data based on hexagonal grids. The converted hexagonal data can basically maintain the basic and geometric characteristics of the original data. However, due to the geometrical characteristics of the hexagon compared to the quadrilateral, the direction and angle of the hexagonal data are more sensitive to line elements, which means it can keep the direction better. Among the three resampling methods, the image obtained by the nearest neighbor interpolation has the highest similarity to the original image. The main reason is that the pixel value in the nearest interpolation has not been calculated, and the geographic coordinates and the hexagonal grid coordinates are different. Because the correspondence is regular, so the nearest neighbor interpolation can better retain the characteristics of the original image.

Conflicts of Interest:
The authors declare no conflict of interest.