Submitted: 01 July 2025
Posted: 02 July 2025
Abstract
Keywords:
1. Introduction
- Enhanced Geospatial Object Detection through Spatial Context: The study demonstrates the effectiveness of spatial contextual features for identifying diverse geospatial objects, drawing on geographic theory, statistical methods, and advances in deep learning. It shows the pivotal role of spatial statistics in enriching AI technologies for geospatial object detection in complex environments.
- Automated Contextual Representation Extraction: By developing a neural network-based encoder that effectively extracts spatially explicit contextual representations, this research showcases an innovative integration of AI into geospatial analysis. The approach simplifies the application of traditional semivariance estimation, offering a streamlined, dataset-specific learning mechanism that improves model accuracy and efficiency in geospatial object detection.
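The traditional semivariance estimation that the encoder streamlines can be illustrated with an empirical semivariogram over lag-ordered pairwise differences. The sketch below is a generic NumPy illustration, not the paper's implementation; the lag values and distance tolerance are hypothetical:

```python
import numpy as np

def empirical_semivariogram(coords, values, lags, tol=0.1):
    """Estimate gamma(h) = 0.5 * E[(z_i - z_j)^2] over point pairs whose
    separation distance falls within tol of each lag h."""
    # pairwise separation distances and squared value differences
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sqdiff = (values[:, None] - values[None, :]) ** 2
    gamma = []
    for h in lags:
        mask = np.triu(np.abs(d - h) <= tol, k=1)  # unique pairs only (i < j)
        gamma.append(0.5 * sqdiff[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

# four collinear points with alternating attribute values
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
values = np.array([0.0, 1.0, 0.0, 1.0])
print(empirical_semivariogram(coords, values, lags=[1.0, 2.0]))  # [0.5 0. ]
```

The alternating pattern yields high semivariance at lag 1 and zero at lag 2, the kind of lag-ordered signature a learned encoder would extract automatically instead of computing it per dataset by hand.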
2. Literature Review
3. Methodology
3.1. Spatial Autocorrelation and Lag-Ordered Pairwise Differences
3.2. Architecture Design of the Spatial Autocorrelation Encoder
3.3. Feature Grouping to Embed the Encoder to a Neural Network Architecture
4. Dataset
5. Experiments
6. Results and Discussion
6.1. Investigating Effectiveness of Lag-Ordered Pairwise Difference
6.2. Performance of Spatial Autocorrelation Encoder
6.3. Comparative Analysis
7. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest
Appendix A. Inference Performance of 10 Repetitions
| Treat | OA | mIoU | Man-made terrain | Natural terrain | High vegetation | Low vegetation | Buildings | Hardscape | Scanning artefacts |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 85.7% | 57.7% | 93.0% | 79.6% | 67.0% | 28.4% | 84.3% | 28.3% | 26.4% |
| 2 | 85.6% | 57.3% | 92.5% | 78.8% | 66.2% | 28.3% | 84.1% | 28.9% | 25.2% |
| 3 | 85.8% | 58.4% | 92.8% | 80.4% | 66.8% | 29.7% | 84.4% | 28.3% | 26.9% |
| 4 | 85.6% | 57.5% | 92.8% | 78.4% | 66.5% | 25.8% | 84.0% | 27.2% | 28.6% |
| 5 | 85.3% | 57.4% | 92.7% | 76.4% | 66.3% | 25.7% | 83.8% | 27.6% | 29.0% |
| 6 | 85.4% | 56.9% | 92.4% | 78.0% | 68.3% | 25.7% | 84.7% | 26.1% | 27.4% |
| 7 | 85.4% | 57.6% | 92.8% | 77.3% | 63.9% | 31.2% | 84.2% | 28.3% | 27.0% |
| 8 | 85.1% | 57.1% | 92.5% | 77.2% | 66.0% | 29.4% | 83.8% | 26.5% | 25.8% |
| 9 | 85.8% | 58.0% | 93.1% | 80.4% | 64.8% | 33.2% | 83.0% | 28.3% | 23.9% |
| 10 | 85.7% | 57.9% | 93.1% | 80.2% | 65.4% | 30.8% | 84.1% | 28.0% | 28.1% |
*Treat: treatment ID. OA and mIoU: overall accuracy and mean intersection over union. The class columns report the per-class Intersection over Union.
1. LiDAR stands for light detection and ranging.
2. A digital twin of New York City's urban canopy. Link: https://labs.aap.cornell.edu/daslab/projects/treefolio
3. End-to-end, in machine learning, typically refers to a process or model that takes raw data as input and directly produces the expected output, without requiring any manual intermediate steps.
4.
5. An encoder is a neural network module that extracts features and generates a representation of the input data. See Klemmer, Safir, and Neill (2023) for an example.
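To make the notion of an encoder concrete: it can be sketched as a small feed-forward module mapping raw point features to a latent representation. This is a generic illustration with hypothetical dimensions, not the architecture of Section 3.2:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

class MLPEncoder:
    """Minimal two-layer feed-forward encoder: raw features -> latent representation."""
    def __init__(self, d_in, d_hidden, d_latent):
        self.W1 = rng.normal(0.0, 0.1, (d_in, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.W2 = rng.normal(0.0, 0.1, (d_hidden, d_latent))
        self.b2 = np.zeros(d_latent)

    def __call__(self, x):
        h = np.maximum(x @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        return h @ self.W2 + self.b2                # latent representation

# encode 100 points with (x, y, z) coordinates into 8-dimensional representations
enc = MLPEncoder(d_in=3, d_hidden=16, d_latent=8)
z = enc(rng.normal(size=(100, 3)))
print(z.shape)  # (100, 8)
```

In practice the weights are learned end-to-end (see note 3) so that the latent representation captures whatever contextual structure benefits the downstream detection task.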
References
- Guo, H., Goodchild, M., and Annoni, A., Manual of digital Earth. 2020: Springer Nature.
- Goodchild, M.F., Elements of an infrastructure for big urban data. Urban Informatics, 2022. 1(1): p. 3. [CrossRef]
- Batty, M., Digital Twins in City Planning. Nature Computational Science, 2023. 4(3). [CrossRef]
- Goodchild, M.F., Introduction to urban big data infrastructure. Urban Informatics, 2021: p. 543-545.
- Batty, M., Agents, Models, and Geodesign. 2013.
- Kwan, M.-P. and Lee, J., Emergency response after 9/11: The potential of real-time 3D GIS for quick emergency response in micro-spatial environments. Computers, Environment and Urban Systems, 2005. 29(2): p. 93-113.
- Evans, S., Hudson-Smith, A., and Batty, M., 3-D GIS: Virtual London and beyond. An exploration of the 3-D GIS experience involved in the creation of virtual London. Cybergeo: European Journal of Geography, 2006.
- Batty, M. and Hudson-Smith, A., Urban simulacra: London. Architectural Design, 2005. 75(178): p. 42-47. [CrossRef]
- Batty, M., The new urban geography of the third dimension. Environment and Planning B-Planning & Design, 2000. 27(4): p. 483-484. [CrossRef]
- Qi, C., Su, H., Mo, K., and Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
- Xie, J., Xu, Y., Zheng, Z., Zhu, S.-C., and Wu, Y.N. Generative pointnet: Deep energy-based learning on unordered point sets for 3d generation, reconstruction and classification. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
- Wu, C., Pfrommer, J., Beyerer, J., Li, K., and Neubert, B. Object detection in 3D point clouds via local correlation-aware point embedding. in 2020 Joint 9th International Conference on Informatics, Electronics & Vision (ICIEV) and 2020 4th International Conference on Imaging, Vision & Pattern Recognition (icIVPR). 2020. IEEE.
- Ren, D., Ma, Z., Chen, Y., Peng, W., Liu, X., Zhang, Y., and Guo, Y., Spiking PointNet: Spiking neural networks for point clouds. Advances in Neural Information Processing Systems, 2024. 36.
- Qian, G.C., Li, Y.C., Peng, H.W., Mai, J.J., Hammoud, H.A.A.K., Elhoseiny, M., and Ghanem, B., PointNeXt: Revisiting PointNet++ with improved training and scaling strategies. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022. 35: p. 23192-23204.
- Goodchild, M.F., The Openshaw effect. International Journal of Geographical Information Science, 2022. 36(9): p. 1697-1698.
- Li, W., GeoAI and Deep Learning. The International Encyclopedia of Geography., 2021: p. 1-6.
- Chen, T., Tang, W., Allan, C., and Chen, S.-E., Explicit Incorporation of Spatial Autocorrelation in 3D Deep Learning for Geospatial Object Detection. Annals of the American Association of Geographers, 2024. 114(10): p. 2297-2316. [CrossRef]
- Matheron, G., Principles of geostatistics. Economic Geology, 1963. 58(8): p. 1246-1266. [CrossRef]
- Dowd, P.A., The Variogram and Kriging: Robust and Resistant Estimators, in Geostatistics for Natural Resources Characterization, G. Verly, et al., Editors. 1984, Springer Netherlands: Dordrecht. p. 91-106.
- Miranda, F.P. and Carr, J.R., Application of the semivariogram textural classifier (STC) for vegetation discrimination using SIR-B data of the Guiana Shield, northwestern Brazil. Remote Sens. Rev., 1994. 10(1-3): p. 155-168.
- Miranda, F.P., Fonseca, L.E.N., and Carr, J.R., Semivariogram textural classification of JERS-1 (Fuyo-1) SAR data obtained over a flooded area of the Amazon rainforest. Int. J. Remote Sens., 1998. 19(3): p. 549-556. [CrossRef]
- Miranda, F.P., Fonseca, L.E.N., Carr, J.R., and Taranik, J.V., Analysis of JERS-1 (Fuyo-1) SAR data for vegetation discrimination in northwestern Brazil using the semivariogram textural classifier (STC). Int. J. Remote Sens., 1996. 17(17): p. 3523-3529. [CrossRef]
- Miranda, F.P., Macdonald, J.A., and Carr, J.R., Application of the semivariogram textural classifier (STC) for vegetation discrimination using SIR-B data of Borneo. Int. J. Remote Sens., 1992. 13(12): p. 2349-2354. [CrossRef]
- Wu, X., Peng, J., Shan, J., and Cui, W., Evaluation of semivariogram features for object-based image classification. Geo-spatial Information Science, 2015. 18(4): p. 159-170. [CrossRef]
- Mottaghi, R., Chen, X., Liu, X., Cho, N.-G., Lee, S.-W., Fidler, S., Urtasun, R., and Yuille, A. The role of context for object detection and semantic segmentation in the wild. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
- Pohlen, T., Hermans, A., Mathias, M., and Leibe, B. Full-resolution residual networks for semantic segmentation in street scenes. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
- Engelmann, F., Kontogianni, T., Hermans, A., and Leibe, B. Exploring spatial context for 3D semantic segmentation of point clouds. in Proceedings of the IEEE international conference on computer vision workshops. 2017.
- Charles, R.Q., Liu, W., Wu, C., Su, H., and Leonidas, J.G. Frustum pointnets for 3D object detection from RGB-D data. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
- Klemmer, K., Safir, N.S., and Neill, D.B. Positional encoder graph neural networks for geographic data. in International Conference on Artificial Intelligence and Statistics. 2023. PMLR.
- Fan, S., Dong, Q., Zhu, F., Lv, Y., Ye, P., and Wang, F.-Y., SCF-net: Learning spatial contextual features for large-scale point cloud segmentation, in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021, IEEE. p. 14504-14513.
- Qi, C.R., Yi, L., Su, H., and Guibas, L.J., PointNet++: Deep hierarchical feature learning on point sets in a metric space. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017. 30.
- Boulch, A., ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics, 2020. 88: p. 24-34. [CrossRef]
- Haralick, R.M., Shanmugam, K., and Dinstein, I., Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 1973. SMC-3(6): p. 610-621.
- Tso, B. and Olsen, R.C., Scene classification using combined spectral, textural and contextual information. Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery X, 2004. 5425: p. 135-146.
- Li, Z., Hodgson, M.E., and Li, W., A general-purpose framework for parallel processing of large-scale LiDAR data. International Journal of Digital Earth, 2018. 11(1): p. 26-47. [CrossRef]
- He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
- Hackel, T., Savinov, N., Ladicky, L., Wegner, J.D., Schindler, K., and Pollefeys, M., Semantic3D.net: A new large-scale point cloud classification benchmark. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. IV-1-W1. 2017. [CrossRef]
- Li, W., Batty, M., and Goodchild, M.F., Real-time GIS for smart cities. International Journal of Geographical Information Science, 2020. 34(2): p. 311-324.
- Batty, M., Virtual Reality in Geographic Information Systems. Handbook of Geographic Information Science, 2008: p. 317-334.
- Tang, W., Chen, S.-E., Diemer, J., Allan, C., Chen, T., Slocum, Z., Shukla, T., Chavan, V.S., and Shanmugam, N.S., DeepHyd: A deep learning-based artificial intelligence approach for the automated classification of hydraulic structures from LiDAR and sonar data. 2022, North Carolina Department of Transportation. Research and Development Unit.
- Guo, Y., Wang, H., Hu, Q., Liu, H., Liu, L., and Bennamoun, M., Deep Learning for 3D Point Clouds: A Survey. IEEE Trans Pattern Anal Mach Intell, 2021. 43(12): p. 4338-4364. [CrossRef]
- Myint, S.W., Fractal approaches in texture analysis and classification of remotely sensed data: comparisons with spatial autocorrelation techniques and simple descriptive statistics. International Journal of Remote Sensing, 2003. 24(9): p. 1925-1947. [CrossRef]
| Metric / class | Score |
|---|---|
| OA | 63% |
| mIoU | 28% |
| Man-made terrain | 39% |
| Natural terrain | 38% |
| High vegetation | 54% |
| Low vegetation | 8% |
| Buildings | 63% |
| Hardscape | 12% |
| Scanning artefacts | 0% |
| Cars | 11% |
| Statistics | Mean | Std. | Min | Max |
|---|---|---|---|---|
| OA | 85.5% | 0.2% | 85.1% | 85.8% |
| mIoU | 57.6% | 0.4% | 56.9% | 58.4% |
| Man-made terrain | 92.8% | 0.3% | 92.4% | 93.1% |
| Natural terrain | 78.7% | 1.4% | 76.4% | 80.4% |
| High vegetation | 66.1% | 1.2% | 63.9% | 68.3% |
| Low vegetation | 28.8% | 2.6% | 25.7% | 33.2% |
| Buildings | 84.0% | 0.4% | 83.0% | 84.7% |
| Hardscape | 27.7% | 0.9% | 26.1% | 28.9% |
| Scanning artefacts | 26.8% | 1.6% | 23.9% | 29.0% |
| Cars | 55.6% | 1.8% | 52.6% | 57.8% |
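The summary statistics above can be cross-checked against the per-repetition values in Appendix A; a minimal NumPy check for the Overall Accuracy row (values transcribed from the appendix table):

```python
import numpy as np

# Overall Accuracy (%) across the 10 repetitions reported in Appendix A
oa = np.array([85.7, 85.6, 85.8, 85.6, 85.3, 85.4, 85.4, 85.1, 85.8, 85.7])
print(round(oa.mean(), 1), round(oa.std(ddof=1), 1), oa.min(), oa.max())
# prints 85.5 0.2 85.1 85.8, matching the OA summary row
```

The sample standard deviation (ddof=1) is used here; whether the paper reports sample or population standard deviation is not stated, though at this precision both round to 0.2%.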
| Statistics | Without Spatial Autocorrelation | Spatial Autocorrelation without Encoder (gain, p-value) | Our Study with Encoder (gain, p-value) |
|---|---|---|---|
| OA | 81.95% | 83.32% (+1.37%, 0.01) | 85.55% (+3.60%, < 0.01) |
| mIoU | 51.58% | 54.05% (+2.47%, 0.00) | 57.58% (+6.00%, < 0.01) |
| Man-made terrain | 90.69% | 91.02% (+0.33%, 0.19) | 92.78% (+2.09%, < 0.01) |
| Natural terrain | 72.59% | 73.45% (+0.86%, 0.29) | 78.67% (+6.08%, < 0.01) |
| High vegetation | 57.01% | 60.15% (+3.14%, 0.01) | 66.10% (+9.09%, < 0.01) |
| Low vegetation | 24.12% | 26.41% (+2.29%, < 0.01) | 28.81% (+4.69%, < 0.01) |
| Buildings | 79.85% | 81.38% (+1.53%, 0.03) | 84.05% (+4.20%, < 0.01) |
| Hard scape | 21.55% | 23.19% (+1.64%, 0.07) | 27.73% (+6.18%, < 0.01) |
| Scanning artefacts | 20.25% | 25.10% (+4.85%, < 0.01) | 26.83% (+6.58%, < 0.01) |
| Cars | 46.55% | 51.72% (+5.17%, 0.01) | 55.64% (+9.09%, < 0.01) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).