Submitted: 05 August 2025
Posted: 06 August 2025
Abstract
Keywords:
1. Introduction: Foundation Models in Archaeological Remote Sensing
1.1. Vision Foundation Models in Remote Sensing
1.2. Applications in Archaeological Remote Sensing
2. Foundation Models Used in This Study
3. Experiments
3.1. Experiment 1: Detection of Castles in Bavaria, Germany in Satellite Imagery
3.1.1. Methodology

3.1.2. Quantitative Results
| Model | TP | FN | TN | FP | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| GPT-4.1 | 244 | 135 | 899 | 101 | 71 % | 64 % | 67 % |
| Gemini 2.0-flash | 144 | 235 | 998 | 2 | 99 % | 38 % | 55 % |
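The precision, recall and F1 columns can be recomputed from the four confusion counts. The following is a minimal sketch, assuming the table uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = 2·TP / (2·TP + FP + FN)) and rounds to whole percentages. The names `Confusion` and `as_percent` are illustrative and do not come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for one model on one binary detection task."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in terms of counts.
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


def as_percent(x: float) -> int:
    """Round a fraction to a whole percentage (assumed table convention)."""
    return round(100 * x)


# Counts copied from the Experiment 1 table.
gpt = Confusion(tp=244, fn=135, tn=899, fp=101)
gemini = Confusion(tp=144, fn=235, tn=998, fp=2)

for name, c in [("GPT-4.1", gpt), ("Gemini 2.0-flash", gemini)]:
    print(name, as_percent(c.precision), as_percent(c.recall), as_percent(c.f1))
# prints:
# GPT-4.1 71 64 67
# Gemini 2.0-flash 99 38 55
```

The same helper can be applied to the confusion counts of the other experiments to check their reported percentages.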
3.1.3. Qualitative Results and Preliminary Assessment




3.2. Experiment 2: Detection of Angkorian Temples in Satellite Imagery
3.2.1. Methodology

3.2.2. Quantitative Results
| Model | TP | FN | TN | FP | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| GPT-4.1 | 57 | 43 | 902 | 98 | 37 % | 57 % | 45 % |
| Gemini 2.0-flash | 32 | 68 | 977 | 23 | 58 % | 32 % | 41 % |
3.2.3. Qualitative Results and Preliminary Assessment




3.3. Experiment 3: Finding English Hillforts in LiDAR Imagery
3.3.1. Methodology

3.3.2. Quantitative Results
| Model | TP | FN | TN | FP | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| GPT-4.1 | 286 | 14 | 187 | 813 | 26 % | 95 % | 41 % |
| Gemini 2.0-flash | 149 | 151 | 934 | 66 | 69 % | 50 % | 58 % |
3.3.3. Qualitative Results and Preliminary Assessment




3.4. Experiment 4: Delineating the Dimensions of Archaeological Sites in LiDAR
3.4.1. Methodology
3.4.2. Results and Preliminary Assessment

3.5. Experiment 5: Finding Potsherds in Drone (UAV) Imagery
3.5.1. Methodology


3.5.2. Results and Preliminary Assessment

4. Discussion and Outlook
5. Data Availability
| Experiment | Data source | License | Available from |
|---|---|---|---|
| 1: Bavarian castles | Coordinates: Bayerische Schlösser- und Seenverwaltung / Bayerische Vermessungsverwaltung | Creative Commons (CC BY-ND) | https://gdk.gdi-de.org/geonetwork/srv/api/records/b1c27b44-f60d-497f-a8cf-b555033db245 |
| | Imagery: Microsoft Bing Satellite | Microsoft Bing Maps terms of use (see https://www.microsoft.com/en-us/maps/product/print-rights) | Microsoft Bing Maps API (see https://learn.microsoft.com/en-us/bingmaps/rest-services/) |
| 2: Cambodian temples | Not publicly available due to the ethics of archaeological site protection | | |
| 3+4: English hillforts | Coordinates: Atlas of Hillforts in Britain and Ireland [28] | Creative Commons (CC BY-SA 4.0) | https://hillforts.arch.ox.ac.uk |
| | Imagery: British Environment Agency National LIDAR Programme | Open Government License (see https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/) | https://environment.data.gov.uk/dataset/2e8d0733-4f43-48b4-9e51-631c25d1b0a9 |
| 5: Potsherds in drone video | Video: Author J.L. | Creative Commons (CC BY-SA 4.0) | GitHub of article (see above) |
Author Contributions
Funding
References
1. Abate, N.; Visone, F.; Sileo, M.; Danese, M.; Amodio, A.M.; Lasaponara, R.; Masini, N. Potential Impact of Using ChatGPT-3.5 in the Theoretical and Practical Multi-Level Approach to Open-Source Remote Sensing Archaeology, Preliminary Considerations. Heritage 2023, 6, 7640–7659.
2. Achiam, J.; et al. GPT-4 Technical Report. 2023.
3. Agapiou, A.; Vionis, A.; Papantoniou, G. Detection of Archaeological Surface Ceramics Using Deep Learning Image-Based Methods and Very High-Resolution UAV Imageries. Land 2021, 10, 1365.
4. Arnold, T.B.; Tilton, L. Explainable Search and Discovery of Visual Cultural Heritage Collections with Multimodal Large Language Models. Workshop on Computational Humanities Research 2024.
5. Bai, S.; Chen, K.; Liu, X.; Wang, J.; Ge, W.; Song, S.; et al. Qwen2.5 VL Technical Report. arXiv 2025.
6. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; Arx, S.V.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the Opportunities and Risks of Foundation Models. arXiv 2021.
7. Canedo, D.; Hipólito, J.; Fonte, J.; Dias, R.; Pereiro, T.D.; Georgieva, P.; Gonçalves-Seco, L.; Vázquez, M.; Pires, N.; Fábrega-Álvarez, P.; et al. The Synergy between Artificial Intelligence, Remote Sensing, and Archaeological Fieldwork Validation. Remote Sens. 2024, 16, 1933.
8. Cheng, T.; Song, L.; Ge, Y.; Liu, W.; Wang, X.; Shan, Y. YOLO-World: Real-Time Open-Vocabulary Object Detection. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024; pp. 16901–16911.
9. Ciccone, G. ChatGPT as a Digital Assistant for Archaeology: Insights from the Smart Anomaly Detection Assistant Development. Heritage 2024, 7, 5428–5445.
10. Cowley, D.C. Remote sensing for archaeology and heritage management – site discovery, interpretation and registration. In Remote Sensing for Archaeological Heritage Management; Cowley, D.C., Ed.; Archaeolingua, 2011.
11. Gonzalez, R.C.; Woods, R.E. Digital image processing, 2nd ed.; Prentice Hall, 2002.
12. Google DeepMind. Gemini Technical Overview. 2025. Available online: https://deepmind.
13. Guo, J.; Zimmer-Dauphinee, J.; Nieusma, J.M.; Lu, S.; Liu, Q.; Deng, R.; Cui, C.; Yue, J.; Lin, Y.; Yao, T.; et al. DeepAndes: A Self-Supervised Vision Foundation Model for Multi-Spectral Remote Sensing Imagery of the Andes. arXiv 2025.
14. Huang, L.; Yu, W.; Ma, W.; Zhong, W.; Feng, Z.; Wang, H.; Chen, Q.; Peng, W.; Feng, X.; Qin, B.; et al. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. ACM Trans. Inf. Syst. 2025, 43, 1–55.
15. Huo, C.; Chen, K.; Zhang, S.; Wang, Z.; Yan, H.; Shen, J.; Hong, Y.; Qi, G.; Fang, H.; Wang, Z. When Remote Sensing Meets Foundation Model: A Survey and Beyond. Remote Sens. 2025, 17, 179.
16. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; Dollár, P. Segment Anything. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV); 2023; pp. 3992–4003.
17. Klassen, S.; Carter, A.K.; Evans, D.H.; Ortman, S.; Stark, M.T.; Loyless, A.A.; Polkinghorne, M.; Heng, P.; Hill, M.; Wijker, P.; et al. Diachronic modeling of the population within the medieval Greater Angkor Region settlement complex. Sci. Adv. 2021, 7, eabf8441.
18. Klassen, S.; Weed, J.; Evans, D.; Petraglia, M.D. Semi-supervised machine learning approaches for predicting the chronology of archaeological sites: A case study of temples from medieval Angkor, Cambodia. PLOS ONE 2018, 13, e0205649.
19. Kokalj, Ž.; Hesse, R. Airborne laser scanning raster data visualization – a guide to good practice. Prostor, Kraj, Čas 14. Ljubljana: ZRC SAZU. 2017. Available online: https://iaps.zrc-sazu.si/en/publikacije/airborne-laser-scanning-raster-data-visualization-1#v (accessed on 9 May 2025).
20. Kokalj, Ž.; Hesse, R.; Somrak, M. Visualisation of LiDAR-derived relief models for detection of archaeological features. J. Archaeol. Sci. 2019, 106, 101–112.
21. Kokalj, Ž.; Somrak, M. Why Not a Single Image? Combining Visualizations to Facilitate Fieldwork and On-Screen Mapping. Remote Sens. 2019, 11, 747.
22. Landauer, J.; Maddison, S.; Fontana, G.; Posluschny, A.G. Archaeological Site Detection: Latest Results from a Deep Learning Based Europe Wide Hillfort Search. J. Comput. Appl. Archaeol. 2025, 8, 42–58.
23. Landauer, J.; Klassen, S.; Wijker, A.P.; van der Kroon, J.; Jaszkowski, A.; Verschoof-van der Vaart, W.B.; Petraglia, M.D. Beyond the Greater Angkor Region: Automatic large-scale mapping of Angkorian-period reservoirs in satellite imagery using deep learning. PLOS ONE 2025, 20, e0320452.
24. Lasaponara, R.; Masini, N. Satellite remote sensing in archaeology: past, present and future perspectives. J. Archaeol. Sci. 2011, 38, 1995–2002.
25. Lesiv, M.; See, L.; Bayas, J.C.L.; Sturn, T.; Schepaschenko, D.; Karner, M.; Moorthy, I.; McCallum, I.; Fritz, S. Characterizing the Spatial and Temporal Availability of Very High Resolution Satellite Imagery in Google Earth and Microsoft Bing Maps as a Source of Reference Data. Land 2018, 7, 118.
26. Li, W.; Lee, H.; Wang, S.; Hsu, C.; Arundel, S.T. Assessment of a new GeoAI foundation model for flood inundation mapping. Proceedings of the 6th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery 2023.
27. Liu, F.; Chen, D.; Guan, Z.; Zhou, X.; Zhu, J.; Ye, Q.; Fu, L.; Zhou, J. RemoteCLIP: A Vision Language Foundation Model for Remote Sensing. IEEE Trans. Geosci. Remote Sens. 2023, 62, 1–16.
28. Lock, G.; Ralston, I. Atlas of hillforts of Britain and Ireland. 2024. Available online: https://hillforts.arch.ox.ac.uk.
29. Lu, S.; Guo, J.; Zimmer-Dauphinee, J.R.; Nieusma, J.M.; Wang, X.; Vanvalkenburgh, P.; Wernke, S.A.; Huo, Y. Vision Foundation Models in Remote Sensing: A survey. IEEE Geosci. Remote Sens. Mag. 2025, 2–27.
30. Mai, G.; Huang, W.; Sun, J.; Song, S.; Mishra, D.; Liu, N.; Lao, N. On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence. arXiv 2023.
31. McCoy, M.D. Geospatial Big Data and archaeology: Prospects and problems too great to ignore. J. Archaeol. Sci. 2017, 84, 74–94.
32. Microsoft. Bing satellite imagery. Bing Maps. 2025. Available online: https://www.bing.com/maps.
33. Mishra, M.; Zhang, K.; Mea, C.; Barazzetti, L.; Fassi, F.; Fiorillo, F.; Previtali, M. Deep Learning-Based AI-Assisted Visual Inspection Systems for Historic Buildings and their Comparative Performance with ChatGPT-4O. ISPRS - Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. XLVIII-2/W8-2024, 327–334.
34. Orengo, H.; Garcia-Molsosa, A. A brave new world for archaeological survey: Automated machine learning-based potsherd detection using high-resolution drone imagery. J. Archaeol. Sci. 2019, 112.
35. Osco, L.P.; de Lemos, E.L.; Gonçalves, W.N.; Ramos, A.P.M.; Junior, J.M. The Potential of Visual ChatGPT for Remote Sensing. Remote Sens. 2023, 15, 3232.
36. Palatucci, M.; Pomerleau, D.; Hinton, G.; Mitchell, T. Zero-shot Learning with Semantic Output Codes. Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference, 2009; 22, 1410–1418.
37. Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sutskever, I. Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR 2021, 139.
38. Steiner, A.; Pinto, A.S.; Tschannen, M.; Keysers, D.; Wang, X.; Bitton, Y.; Gritsenko, A.; Minderer, M.; Sherbondy, A.; Long, S.; et al. PaliGemma 2: A family of versatile VLMs for transfer. arXiv 2024.
39. Tao, L.; Zhang, H.; Jing, H.; Liu, Y.; Yan, D.; Wei, G.; Xue, X. Advancements in Vision–Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques. Remote Sens. 2025, 17, 162.
40. Wernke, S.A.; Van Valkenburgh, P.; Zimmer-Dauphinee, J.; Whitlock, B.; Morrow, G.S.; Smith, R.; Smit, D.; Ortega, G.R.; Jara, K.R.; Plekhov, D.; et al. Large-scale, collaborative imagery survey in archaeology: the Geospatial Platform for Andean Culture, History and Archaeology (GeoPACHA). Antiquity 2024, 98, 155–171.
41. Wu, Z.; Chen, X.; Pan, Z.; Liu, X.; Liu, W.; Dai, D.; Gao, H.; Ma, Y.; Wu, C.; Wang, B.; et al. DeepSeek VL2: Mixture of Experts vision language models for advanced multimodal understanding. arXiv 2024.
42. Xiao, A.; Xuan, W.; Wang, J.; Huang, J.; Tao, D.; Lu, S.; Yokoya, N. Foundation models for remote sensing and earth observation: A survey. IEEE Geosci. Remote Sens. Mag.
43. Xiao, B.; Wu, H.; Xu, W.; Dai, X.; Hu, H.; Lu, Y.; Zeng, M.; Liu, C.; Yuan, L. Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024; pp. 4818–4829. Available online: https://doi.ieeecomputersociety.org/10.1109/CVPR52733.2024.00461.
44. Zhao, X.; Ding, W.; An, Y.; Du, Y.; Yu, T.; Li, M.; Tang, M.; Wang, J. Fast Segment Anything. arXiv 2023.
45. Zimmer-Dauphinee, J.; VanValkenburgh, P.; Wernke, S.A. Eyes of the machine: AI-assisted satellite archaeological survey in the Andes. Antiquity 2024, 98, 245–259.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).