Submitted:
27 January 2026
Posted:
28 January 2026
Abstract
Keywords:
1. Introduction
- Data Perception. The agent collects multimodal data from multiple domains (handheld, vehicle, drone, satellite) to build a comprehensive view of the physical world, marking the starting point of the data lifecycle.
- Data Management. Task-driven storage architectures (such as graph, vector, and spatio-temporal databases) organize and query massive, heterogeneous urban perception data, laying a solid foundation for subsequent fusion and interaction.
- Data Fusion. Fusion strategies bridge the gaps between data from different domains and construct a unified urban cognition.
- Task Application. After fusion, the structured data supports advanced UrbanEA tasks such as Urban Scene Question-Answering (SQA), Vision-Language Navigation (VLN), and Human-Agent Collaboration (HAC), and ultimately delivers positive social impact.
- Environmental Variability. Urban environments are inherently dynamic and uncertain, unlike controlled indoor settings. Data Perception in outdoor scenes must handle dramatic variations in illumination (diurnal cycles, sunlight-shadow contrasts) and challenging weather conditions (rain, snow, fog), all of which degrade perception performance.
- Limited Observability. Urban environments are vast, yet any single sensor (e.g., a vehicle-mounted camera or LiDAR) has a limited Field of View (FoV) and detection range, and therefore suffers from blind spots and occlusions. The resulting perceptual information is spatially incomplete, which makes a globally consistent scene understanding difficult to achieve during Data Fusion.
- Interaction Complexity. An urban environment is more than a collection of physical spaces; it is a complex social environment populated by numerous intelligent agents. Their behavior is not simple physical motion: it follows both explicit rules (e.g., traffic laws) and implicit social norms (e.g., driving habits, pedestrian etiquette, intentions signaled through body language). Understanding these interactions requires interpreting subtle cues such as posture, gaze, and intent, which are far harder to capture and model during the Data Fusion and Task Application stages than physical motion alone.
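The Limited Observability challenge can be made concrete with a small geometric sketch. The Python example below is an illustration only, not a method from the surveyed work: it models each sensor as a 2D wedge with a position, heading, FoV angle, and maximum range, then measures what fraction of target points is visible to at least one sensor. The names `Sensor`, `sees`, and `coverage`, as well as all numeric values, are assumptions chosen for the example, and occlusion is ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    """Hypothetical 2D sensor: position, heading, field of view, and range."""
    x: float
    y: float
    heading_deg: float  # facing direction, counter-clockwise from the +x axis
    fov_deg: float      # full angular width of the field of view
    max_range: float    # detection range in metres


def sees(sensor: Sensor, px: float, py: float) -> bool:
    """Return True if point (px, py) lies inside the sensor's wedge (no occlusion)."""
    dx, dy = px - sensor.x, py - sensor.y
    dist = math.hypot(dx, dy)
    if dist > sensor.max_range:
        return False
    if dist == 0.0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between bearing and heading, in [-180, 180).
    offset = (bearing - sensor.heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= sensor.fov_deg / 2.0


def coverage(sensors, points) -> float:
    """Fraction of points visible to at least one sensor."""
    if not points:
        return 0.0
    seen = sum(any(sees(s, px, py) for s in sensors) for px, py in points)
    return seen / len(points)


# Assumed setup: a forward-facing camera and a second, rear-facing sensor.
front_cam = Sensor(x=0.0, y=0.0, heading_deg=0.0, fov_deg=90.0, max_range=50.0)
rear_cam = Sensor(x=0.0, y=0.0, heading_deg=180.0, fov_deg=120.0, max_range=80.0)

targets = [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0), (100.0, 0.0)]
print(coverage([front_cam], targets))            # single sensor
print(coverage([front_cam, rear_cam], targets))  # two sensors combined
```

In this toy setup, adding the second sensor raises coverage, yet points to the side and beyond range remain unseen, which is the gap that multi-domain perception is meant to close.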
2. Data Perception: Sensing and Simulation
2.1. How to Perceive the City?
2.1.1. Vision Perception
2.1.2. Multi-sensory Perception
2.2. Where to Perceive the City?
2.3. City Scene Simulators
| Environment | Year | Kinematics | Platform | Category | Modality: RGB | Modality: Depth | Modality: Radar | Modality: LiDAR | Data Source | Engine |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cityscapes [46] | 2016 | ✗ | Terrain | Open-Loop | ✓ | ✗ | ✗ | ✗ | Street View | - |
| CARLA [16] | 2017 | ✓ | Terrain | Closed-Loop | ✓ | ✓ | ✓ | ✓ | Vehicle | UE 4 |
| xView [47] | 2018 | ✗ | Aviation | Open-Loop | ✓ | ✗ | ✗ | ✗ | Satellite | - |
| TouchDown [48] | 2019 | ✗ | Terrain | Open-Loop | ✓ | ✗ | ✗ | ✗ | Street View | - |
| nuScenes [49] | 2020 | ✗ | Terrain | Open-Loop | ✓ | ✗ | ✓ | ✓ | Vehicle | nuScenes devkit |
| Waymo [50] | 2020 | ✗ | Terrain | Open-Loop | ✓ | ✗ | ✓ | ✓ | Vehicle | Waymax |
| KITTI-360 [51] | 2022 | ✗ | Terrain | Open-Loop | ✓ | ✗ | ✗ | ✓ | Vehicle | - |
| STPLS3D [52] | 2022 | ✗ | Aviation | Open-Loop | ✗ | ✗ | ✗ | ✓ | Drone | - |
| SensatUrban [53] | 2022 | ✗ | Aviation | Open-Loop | ✗ | ✗ | ✗ | ✓ | Drone | - |
| UrbanBIS [54] | 2023 | ✗ | Aviation | Open-Loop | ✓ | ✗ | ✗ | ✓ | Drone | - |
| AerialVLN [55] | 2023 | ✓ | Aviation | Open-Loop | ✓ | ✓ | ✗ | ✗ | Drone | UE 4 |
| GRUTopia [56] | 2024 | ✓ | Terrain | Closed-Loop | ✓ | ✗ | ✗ | ✗ | Virtuality | Isaac Sim |
| OpenUAV [57] | 2024 | ✓ | Aviation | Open-Loop | ✓ | ✓ | ✗ | ✗ | Drone | UE 4 |
| UnrealZoo [58] | 2024 | ✓ | Terrain | Closed-Loop | ✓ | ✗ | ✗ | ✗ | Virtuality | UE 4/5 |
| MetaUrban [59] | 2025 | ✓ | Terrain | Closed-Loop | ✓ | ✓ | ✗ | ✓ | Virtuality | Gym |
| OpenFly [60] | 2025 | ✓ | Aviation | Closed-Loop | ✓ | ✓ | ✗ | ✓ | Drone | UE 4, Google Earth, GTA V |
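To show how the comparison table might be used when choosing an environment, the Python sketch below encodes a few of its rows as records and filters them by requirement. The `Environment` class and the `select` helper are hypothetical names introduced for illustration; the field values are copied from the corresponding table rows, and only a subset of rows is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """One row of the simulator table (subset of columns)."""
    name: str
    year: int
    kinematics: bool
    platform: str          # "Terrain" or "Aviation"
    category: str          # "Open-Loop" or "Closed-Loop"
    modalities: frozenset  # subset of {"RGB", "Depth", "Radar", "Lidar"}


ENVIRONMENTS = [
    Environment("CARLA", 2017, True, "Terrain", "Closed-Loop",
                frozenset({"RGB", "Depth", "Radar", "Lidar"})),
    Environment("nuScenes", 2020, False, "Terrain", "Open-Loop",
                frozenset({"RGB", "Radar", "Lidar"})),
    Environment("AerialVLN", 2023, True, "Aviation", "Open-Loop",
                frozenset({"RGB", "Depth"})),
    Environment("MetaUrban", 2025, True, "Terrain", "Closed-Loop",
                frozenset({"RGB", "Depth", "Lidar"})),
    Environment("OpenFly", 2025, True, "Aviation", "Closed-Loop",
                frozenset({"RGB", "Depth", "Lidar"})),
]


def select(envs, *, category=None, platform=None, required=()):
    """Return names of environments matching category, platform, and modalities."""
    needed = set(required)
    return [
        e.name
        for e in envs
        if (category is None or e.category == category)
        and (platform is None or e.platform == platform)
        and needed <= e.modalities
    ]


# Closed-loop environments that provide LiDAR.
print(select(ENVIRONMENTS, category="Closed-Loop", required=["Lidar"]))
# Aerial environments that provide depth.
print(select(ENVIRONMENTS, platform="Aviation", required=["Depth"]))
```

Keeping the modalities as a set makes "must support all of these sensors" a single subset test, so further rows or columns from the table can be added without changing the filter.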
2.3.1. Open-Loop Simulator
2.3.2. Closed-Loop Simulator
2.4. Discussion
3. Data Management: Storage and Querying
3.1. General & Unified Architectures
3.2. Semantic & Relational Architectures
3.3. Spatio-temporal & Dynamic Architectures
3.4. Discussion
4. Data Fusion: Bridging Domain Gaps
4.1. The Domain Gap between Multi-Domain Data

4.2. Fusion Strategies for Multi-Domain Data
5. Task Application: From Perception to Social Interaction
5.1. Urban SQA

5.2. Vision-Language Navigation

5.3. Human-Agent Collaboration

5.4. Discussion
6. Social Impact
6.0.1. Transportation
6.0.2. Energy
6.0.3. Climate Change
6.0.4. Healthcare
7. Outlook and Discussion
7.1. Method-Level Challenges
7.1.1. Robust Fusion with Imbalanced Multi-domain Data
7.1.2. Continual Learning and Incremental City Updates
7.2. System-Level Challenges
7.2.1. High-Fidelity Simulators and the Sim-to-Real Loop
7.2.2. Multi-Agent Collaboration and Swarm Intelligence
7.3. Societal-Level Challenges
7.3.1. Causal Reasoning by Fusing Social Knowledge
7.3.2. Fairness and Data Bias
8. Conclusions
References
- Bettencourt, L.M. The origins of scaling in cities. Science 2013, 340, 1438–1441.
- Dong, L.; Duarte, F.; Duranton, G.; Santi, P.; Barthelemy, M.; Batty, M.; Bettencourt, L.; Goodchild, M.; Hack, G.; Liu, Y.; et al. Defining a city — delineating urban areas using cell-phone data. Nat. Cities 2024, 1, 117–125. [CrossRef]
- Yang, L.; Luo, Z.; Zhang, S.; Teng, F.; Li, T. Continual Learning for Smart City: A Survey. IEEE Trans. Knowl. Data Eng. 2024, 36, 7805–7824. [CrossRef]
- Cengiz, B.; Adam, I.Y.; Ozdem, M.; Das, R. A survey on data fusion approaches in IoT-based smart cities: Smart applications, taxonomies, challenges, and future research directions. Inf. Fusion 2025, 121. [CrossRef]
- Bibri, S.E.; Huang, J. Artificial intelligence of things for sustainable smart city brain and digital twin systems: Pioneering Environmental synergies between real-time management and predictive planning. Environ. Sci. Ecotechnology 2025, 26, 100591. [CrossRef]
- Xu, F.; Zhang, J.; Gao, C.; Feng, J.; Li, Y. Urban generative intelligence (UGI): A foundational platform for agents in embodied city environment. arXiv preprint arXiv:2312.11813 2023.
- Song, Y.; Sun, P.; Liu, H.; Li, Z.; Song, W.; Xiao, Y.; Zhou, X. Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI. IEEE Trans. Knowl. Data Eng. 2024, 36, 6962–6976. [CrossRef]
- Zhang, Y.; Ma, Z.; Li, J.; Qiao, Y.; Wang, Z.; Chai, J.; Wu, Q.; Bansal, M.; Kordjamshidi, P. Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models. Transactions on Machine Learning Research.
- Jin, G.; Liang, Y.; Fang, Y.; Shao, Z.; Huang, J.; Zhang, J.; Zheng, Y. Spatio-Temporal Graph Neural Networks for Predictive Learning in Urban Computing: A Survey. IEEE Trans. Knowl. Data Eng. 2023, 36, 5388–5408. [CrossRef]
- Zhang, W.; Han, J.; Xu, Z.; Ni, H.; Liu, H.; Xiong, H. Urban Foundation Models: A Survey. KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Spain; pp. 6633–6643.
- Lu, Y.; Tang, H. Multimodal Data Storage and Retrieval for Embodied AI: A Survey. arXiv preprint arXiv:2508.13901 2025.
- Liang, Y.; Wen, H.; Xia, Y.; Jin, M.; Yang, B.; Salim, F.; Wen, Q.; Pan, S.; Cong, G. Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey. KDD '25: The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Canada; pp. 6063–6073.
- Zou, X.; Yan, Y.; Hao, X.; Hu, Y.; Wen, H.; Liu, E.; Zhang, J.; Li, Y.; Li, T.; Zheng, Y.; et al. Deep learning for cross-domain data fusion in urban computing: Taxonomy, advances, and outlook. Inf. Fusion 2024, 113. [CrossRef]
- Song, S.; Li, X.; Li, S.; Zhao, S.; Yu, J.; Ma, J.; Mao, X.; Zhang, W.; Wang, M. How to Bridge the Gap Between Modalities: Survey on Multimodal Large Language Model. IEEE Trans. Knowl. Data Eng. 2025, 37, 5311–5329. [CrossRef]
- Liu, H.; Tong, Y.; Han, J.; Zhang, P.; Lu, X.; Xiong, H. Incorporating Multi-Source Urban Data for Personalized and Context-Aware Multi-Modal Transportation Recommendation. IEEE Trans. Knowl. Data Eng. 2020, 34, 723–735. [CrossRef]
- Dosovitskiy, A.; Ros, G.; Codevilla, F.; Lopez, A.; Koltun, V. CARLA: An open urban driving simulator. In Proceedings of the Conference on Robot Learning. PMLR, 2017, pp. 1–16.
- Bisio, I.; Delfino, A.; Grattarola, A.; Lavagetto, F.; Sciarrone, A. Ultrasounds-Based Context Sensing Method and Applications Over the Internet of Things. IEEE Internet Things J. 2018, 5, 3876–3890. [CrossRef]
- Phipps, A.; Ouazzane, K.; Vassilev, V. Enhancing Cyber Security Using Audio Techniques: A Public Key Infrastructure for Sound. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). China; pp. 1428–1436.
- Li, K.; Liu, M. Combined influence of multi-sensory comfort in winter open spaces and its association with environmental factors: Wuhan as a case study. Build. Environ. 2023, 248. [CrossRef]
- Yin, C.; Chen, P.-Y.; Yao, B.; Wang, D.; Caterino, J.; Zhang, P. SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing. KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Spain; pp. 6158–6168.
- Chen, C.; Jain, U.; Schissler, C.; Gari, S.V.A.; Al-Halah, Z.; Ithapu, V.K.; Robinson, P.; Grauman, K. SoundSpaces: Audio-Visual Navigation in 3D Environments. European Conference on Computer Vision. United Kingdom; pp. 17–36.
- Chen, C.; Schissler, C.; Garg, S.; Kobernik, P.; Clegg, A.; Calamia, P.; Batra, D.; Robinson, P.; Grauman, K. Soundspaces 2.0: A simulation platform for visual-acoustic learning. Advances in Neural Information Processing Systems 2022, 35, 8896–8911.
- Clarke, S.; Gao, R.; Wang, M.; Rau, M.; Xu, J.; Wang, J.-H.; James, D.L.; Wu, J. REALIMPACT: A Dataset of Impact Sound Fields for Real Objects. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 1516–1525.
- Gan, C.; Gu, Y.; Zhou, S.; Schwartz, J.; Alter, S.; Traer, J.; Gutfreund, D.; Tenenbaum, J.B.; McDermott, J.H.; Torralba, A. Finding Fallen Objects Via Asynchronous Audio-Visual Integration. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 10513–10523.
- Gao, R.; Li, H.; Dharan, G.; Wang, Z.; Li, C.; Xia, F.; Savarese, S.; Fei-Fei, L.; Wu, J. Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear. 2023 IEEE International Conference on Robotics and Automation (ICRA). United Kingdom; pp. 704–711.
- Narang, Y.; Sundaralingam, B.; Macklin, M.; Mousavian, A.; Fox, D. Sim-to-Real for Robotic Tactile Sensing via Physics-Based Simulation and Learned Latent Projections. 2021 IEEE International Conference on Robotics and Automation (ICRA). China; pp. 6444–6451.
- Gao, R.; Si, Z.; Chang, Y.-Y.; Clarke, S.; Bohg, J.; Fei-Fei, L.; Yuan, W.; Wu, J. ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 10588–10598.
- Gao, R.; Dou, Y.; Li, H.; Agarwal, T.; Bohg, J.; Li, Y.; Fei-Fei, L.; Wu, J. The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 17276–17286.
- Gao, R.; Chang, Y.Y.; Mall, S.; Fei-Fei, L.; Wu, J. ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations. In Proceedings of the Conference on Robot Learning, 2021.
- Calandra, R.; Owens, A.; Jayaraman, D.; Lin, J.; Yuan, W.; Malik, J.; Adelson, E.H.; Levine, S. More Than a Feeling: Learning to Grasp and Regrasp Using Vision and Touch. IEEE Robot. Autom. Lett. 2018, 3, 3300–3307. [CrossRef]
- Hong, Y.; Zheng, Z.; Chen, P.; Wang, Y.; Li, J.; Gan, C. MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 26396–26406.
- Zhang, W.; Han, J.; Xu, Z.; Ni, H.; Liu, H.; Xiong, H. Towards urban general intelligence: A review and outlook of urban foundation models. arXiv preprint arXiv:2402.01749 2024.
- Fadhel, M.A.; Duhaim, A.M.; Saihood, A.; Sewify, A.; Al-Hamadani, M.N.; Albahri, A.; Alzubaidi, L.; Gupta, A.; Mirjalili, S.; Gu, Y. Comprehensive systematic review of information fusion methods in smart cities and urban environments. Inf. Fusion 2024, 107. [CrossRef]
- El-Omari, S.; Moselhi, O. Integrating 3D laser scanning and photogrammetry for progress measurement of construction work. Autom. Constr. 2008, 18, 1–9. [CrossRef]
- Navares-Vázquez, J.C.; Qiu, Z.; Arias, P.; Balado, J. HoloLens 2 performance analysis for indoor/outdoor 3D mapping. J. Build. Eng. 2025, 108. [CrossRef]
- Rashdi, R.; Garrido, I.; Balado, J.; Del Río-Barral, P.; Rodríguez-Somoza, J.L.; Martínez-Sánchez, J. Comparative Evaluation of LiDAR systems for transport infrastructure: case studies and performance analysis. Eur. J. Remote. Sens. 2024, 57. [CrossRef]
- Seifert, E.; Seifert, S.; Vogt, H.; Drew, D.; van Aardt, J.; Kunneke, A.; Seifert, T. Influence of Drone Altitude, Image Overlap, and Optical Sensor Resolution on Multi-View Reconstruction of Forest Images. Remote. Sens. 2019, 11, 1252. [CrossRef]
- Girindran, R.; Boyd, D.S.; Rosser, J.; Vijayan, D.; Long, G.; Robinson, D. On the Reliable Generation of 3D City Models from Open Data. Urban Sci. 2020, 4, 47. [CrossRef]
- Zhang, H.K.; Roy, D.P.; Yan, L.; Li, Z.; Huang, H.; Vermote, E.; Skakun, S.; Roger, J.-C. Characterization of Sentinel-2A and Landsat-8 top of atmosphere, surface, and nadir BRDF adjusted reflectance and NDVI differences. Remote. Sens. Environ. 2018, 215, 482–494. [CrossRef]
- Xiao, C.; Zhou, J.; Xiao, Y.; Huang, J.; Xiong, H. ReFound: Crafting a Foundation Model for Urban Region Understanding upon Language and Visual Foundations. KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Spain; pp. 3527–3538.
- Duan, J.; Yu, S.; Tan, H.L.; Zhu, H.; Tan, C. A Survey of Embodied AI: From Simulators to Research Tasks. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 230–244. [CrossRef]
- Savva, M.; Kadian, A.; Maksymets, O.; Zhao, Y.; Wijmans, E.; Jain, B.; Straub, J.; Liu, J.; Koltun, V.; Malik, J.; et al. Habitat: A Platform for Embodied AI Research. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). South Korea; pp. 9338–9346.
- Puig, X.; Undersander, E.; Szot, A.; Cote, M.D.; Yang, T.Y.; Partsey, R.; Desai, R.; Clegg, A.; Hlavac, M.; Min, S.Y.; et al. Habitat 3.0: A Co-Habitat for Humans, Avatars, and Robots. In Proceedings of the Twelfth International Conference on Learning Representations.
- Szot, A.; Clegg, A.; Undersander, E.; Wijmans, E.; Zhao, Y.; Turner, J.; Maestre, N.; Mukadam, M.; Chaplot, D.S.; Maksymets, O.; et al. Habitat 2.0: Training home assistants to rearrange their habitat. Advances in Neural Information Processing Systems 2021, 34, 251–266.
- Dai, A.; Chang, A.X.; Savva, M.; Halber, M.; Funkhouser, T.; Niessner, M. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 2432–2443.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27-30 June 2016; pp. 3213–3223. [CrossRef]
- Lam, D.; Kuzma, R.; McGee, K.; Dooley, S.; Laielli, M.; Klaric, M.; Bulatov, Y.; McCord, B. xView: Objects in context in overhead imagery. arXiv preprint arXiv:1802.07856 2018.
- Chen, H.; Suhr, A.; Misra, D.; Snavely, N.; Artzi, Y. TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). USA; pp. 12530–12539.
- Caesar, H.; Bankiti, V.; Lang, A.H.; Vora, S.; Liong, V.E.; Xu, Q.; Krishnan, A.; Pan, Y.; Baldan, G.; Beijbom, O. nuScenes: A Multimodal Dataset for Autonomous Driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11618–11628. [CrossRef]
- Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B.; et al. Scalability in Perception for Autonomous Driving: Waymo Open Dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020; pp. 2446–2454.
- Liao, Y.; Xie, J.; Geiger, A. KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 3292–3310. [CrossRef]
- Chen, M.; Hu, Q.; Yu, Z.; Thomas, H.; Feng, A.; Hou, Y.; McCullough, K.; Ren, F.; Soibelman, L. STPLS3D: A Large-Scale Synthetic and Real Aerial Photogrammetry 3D Point Cloud Dataset. In Proceedings of the 33rd British Machine Vision Conference (BMVC), 2022.
- Hu, Q.; Yang, B.; Khalid, S.; Xiao, W.; Trigoni, N.; Markham, A. SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds. Int. J. Comput. Vis. 2022, 130, 316–343. [CrossRef]
- Yang, G.; Xue, F.; Zhang, Q.; Xie, K.; Fu, C.-W.; Huang, H. UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation. SIGGRAPH '23: Special Interest Group on Computer Graphics and Interactive Techniques Conference. United States; pp. 1–11.
- Liu, S.; Zhang, H.; Qi, Y.; Wang, P.; Zhang, Y.; Wu, Q. AerialVLN: Vision-and-Language Navigation for UAVs. 2023 IEEE/CVF International Conference on Computer Vision (ICCV). France; pp. 15338–15348.
- Wang, H.; Chen, J.; Huang, W.; Ben, Q.; Wang, T.; Mi, B.; Huang, T.; Zhao, S.; Chen, Y.; Yang, S.; et al. Grutopia: Dream general robots in a city at scale. arXiv preprint arXiv:2407.10943 2024.
- Wang, X.; Yang, D.; Wang, Z.; Kwan, H.; Chen, J.; Wu, W.; Li, H.; Liao, Y.; Liu, S. Towards realistic uav vision-language navigation: Platform, benchmark, and methodology. arXiv preprint arXiv:2410.07087 2024.
- Zhong, F.; Wu, K.; Wang, C.; Chen, H.; Ci, H.; Li, Z.; Wang, Y. UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI. arXiv preprint arXiv:2412.20977 2024.
- Wu, W.; He, H.; He, J.; Wang, Y.; Duan, C.; Liu, Z.; Li, Q.; Zhou, B. MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility. International Conference on Learning Representations 2025.
- Gao, Y.; Li, C.; You, Z.; Liu, J.; Li, Z.; Chen, P.; Chen, Q.; Tang, Z.; Wang, L.; Yang, P.; et al. OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation. arXiv preprint arXiv:2502.18041 2025.
- Mirowski, P.; Banki-Horvath, A.; Anderson, K.; Teplyashin, D.; Hermann, K.M.; Malinowski, M.; Grimes, M.K.; Simonyan, K.; Kavukcuoglu, K.; Zisserman, A.; et al. The streetlearn environment and dataset. arXiv preprint arXiv:1903.01292 2019.
- Liu, Y.; Liu, S.; Chen, B.; Yang, Z.-X.; Xu, S. Fusion-Perception-to-Action Transformer: Enhancing Robotic Manipulation With 3-D Visual Fusion Attention and Proprioception. IEEE Trans. Robot. 2025, 41, 1553–1567. [CrossRef]
- Liu, Y.; Chen, W.; Bai, Y.; Liang, X.; Li, G.; Gao, W.; Lin, L. Aligning Cyber Space With Physical World: A Comprehensive Survey on Embodied AI. IEEE/ASME Trans. Mechatronics 2025, 30, 7253–7274. [CrossRef]
- Zheng, Y.; Yao, L.; Su, Y.; Wang, Y.; Zhao, S.; Zhang, Y.; Chau, L.-P. A Survey of Embodied Learning for Object-centric Robotic Manipulation. Mach. Intell. Res. 2025, 22, 588–626. [CrossRef]
- Warren, J.; Marz, N. Big Data: Principles and best practices of scalable realtime data systems; Simon and Schuster, 2015.
- Kreps, J. Questioning the Lambda Architecture. O’Reilly Radar, 2014.
- Azzabi, S.; Alfughi, Z.; Ouda, A. Data Lakes: A Survey of Concepts and Architectures. Computers 2024, 13, 183. [CrossRef]
- Tahara, D.; Diamond, T.; Abadi, D.J. Sinew: a SQL system for multi-structured data. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014, pp. 815–826.
- Bugiotti, F.; Cabibbo, L.; Atzeni, P.; Torlone, R. Database Design for NoSQL Systems. International Conference on Conceptual Modeling; pp. 223–231.
- Zhang, C.; Lu, J.; Xu, P.; Chen, Y. UniBench: A Benchmark for Multi-model Database Management Systems. Technology Conference on Performance Evaluation and Benchmarking. Brazil; pp. 7–23.
- Neo4j, Inc. Neo4j, 2010.
- The JanusGraph Project. JanusGraph, 2017.
- Mlodzian, L.; Sun, Z.; Berkemeyer, H.; Monka, S.; Wang, Z.; Dietze, S.; Halilaj, L.; Luettin, J. nuScenes Knowledge Graph - A comprehensive semantic representation of traffic scenes for trajectory prediction. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). France; pp. 42–52.
- Sun, P.; Song, Y.; Liu, X.; Yang, X.; Wang, Q.; Li, T.; Yang, Y.; Chu, X. 3D Question Answering for City Scene Understanding. MM '24: The 32nd ACM International Conference on Multimedia. Australia; pp. 2156–2165.
- Meta AI Research. FAISS, 2017.
- Wang, J.; Yi, X.; Guo, R.; Jin, H.; Xu, P.; Li, S.; Wang, X.; Guo, X.; Li, C.; Xu, X.; et al. Milvus: A purpose-built vector data management system. In Proceedings of the 2021 International Conference on Management of Data, 2021, pp. 2614–2627.
- Malkov, Y.A.; Yashunin, D.A. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 42, 824–836. [CrossRef]
- Jegou, H.; Douze, M.; Schmid, C. Product Quantization for Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 117–128. [CrossRef]
- Pelkonen, T.; Franklin, S.; Teller, J.; Cavallaro, P.; Huang, Q.; Meza, J.; Veeraraghavan, K. Gorilla: A fast, scalable, in-memory time series database. Proceedings of the VLDB Endowment 2015, 8, 1816–1827.
- Wang, C.; Qiao, J.; Huang, X.; Song, S.; Hou, H.; Jiang, T.; Rui, L.; Wang, J.; Sun, J. Apache IoTDB: A Time Series Database for IoT Applications. Proc. Acm Manag. Data 2023, 1, 1–27. [CrossRef]
- Timescale. TimescaleDB, 2018.
- Zimányi, E.; Sakr, M.; Lesuisse, A. MobilityDB: A mobility database based on PostgreSQL and PostGIS. ACM Transactions on Database Systems (TODS) 2020, 45, 1–42.
- OSGeo, P.P. PostGIS. https://postgis.net/, 2025.
- Li, R.; He, H.; Wang, R.; Ruan, S.; He, T.; Bao, J.; Zhang, J.; Hong, L.; Zheng, Y. TrajMesa: A Distributed NoSQL-Based Trajectory Data Management System. IEEE Trans. Knowl. Data Eng. 2021, PP, 1–1. [CrossRef]
- He, H.; Xu, Z.; Li, R.; Bao, J.; Li, T.; Zheng, Y. TMan: A High-Performance Trajectory Data Management System Based on Key-Value Stores. 2024 IEEE 40th International Conference on Data Engineering (ICDE). Netherlands; pp. 4951–4964.
- Sawadogo, P.; Darmont, J. On data lake architectures and metadata management. J. Intell. Inf. Syst. 2020, 56, 97–120. [CrossRef]
- Hai, R.; Koutras, C.; Quix, C.; Jarke, M. Data Lakes: A Survey of Functions and Systems. IEEE Trans. Knowl. Data Eng. 2023, 35, 12571–12590. [CrossRef]
- Liu, R.; Isah, H.; Zulkernine, F. A Big Data Lake for Multilevel Streaming Analytics. 2020 1st International Conference on Big Data Analytics and Practices (IBDAP). Thailand; pp. 1–6.
- Inmon, B. Data Lake Architecture: Designing the Data Lake and avoiding the garbage dump; Technics Publications, LLC, 2016.
- Jarke, M.; Quix, C. On warehouses, lakes, and spaces: the changing role of conceptual modeling for data integration. Concept. Model. Perspect. 2017, pp. 231–245.
- Ravat, F.; Zhao, Y. Data Lakes: Trends and Perspectives. International Conference on Database and Expert Systems Applications. Austria; pp. 304–313.
- Lu, J.; Holubová, I. Multi-model Data Management: What’s New and What’s Next? In Proceedings of the EDBT, 2017.
- Yeo, J.; Cho, H.; Park, J.-W.; Hwang, S.-W. Multimodal KB Harvesting for Emerging Spatial Entities. IEEE Trans. Knowl. Data Eng. 2017, 29, 1073–1086. [CrossRef]
- Kosmerl, I.; Rabuzin, K.; Sestak, M. Multi-Model Databases - Introducing Polyglot Persistence in the Big Data World. 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO). Croatia; pp. 1724–1729.
- Khine, P.P.; Wang, Z. A Review of Polyglot Persistence in the Big Data World. Information 2019, 10, 141. [CrossRef]
- Kolev, B.; Valduriez, P.; Bondiombouy, C.; Jiménez-Peris, R.; Pau, R.; Pereira, J. CloudMdsQL: querying heterogeneous cloud data stores with a common language. Distrib. Parallel Databases 2015, 34, 463–503. [CrossRef]
- Bimonte, S.; Gallinucci, E.; Marcel, P.; Rizzi, S. Data variety, come as you are in multi-model data warehouses. Inf. Syst. 2022, 104. [CrossRef]
- Mihai, G. Multi-Model Database Systems: The State of Affairs. Ann. Dunarea de Jos Univ. Galati. Fascicle I. Econ. Appl. Informatics 2020, 26, 211–215. [CrossRef]
- Xiao, C.; Zhou, J.; Huang, J.; Zhu, H.; Xu, T.; Dou, D.; Xiong, H. A Contextual Master-Slave Framework on Urban Region Graph for Urban Village Detection. 2023 IEEE 39th International Conference on Data Engineering (ICDE). United States; pp. 736–748.
- Fang, Z.; Long, Q.; Song, G.; Xie, K. Spatial-Temporal Graph ODE Networks for Traffic Flow Forecasting. arXiv 2021, arXiv:2106.12931.
- Guo, S.; Lin, Y.; Wan, H.; Li, X.; Cong, G. Learning Dynamics and Heterogeneity of Spatial-Temporal Graph Data for Traffic Forecasting. IEEE Trans. Knowl. Data Eng. 2021, 34, 5415–5428. [CrossRef]
- Wang, Z.; Han, F.; Zhao, S. A Survey on Knowledge Graph Related Research in Smart City Domain. ACM Trans. Knowl. Discov. Data 2024, 18, 1–31. [CrossRef]
- Kaliyar, R.K. Graph databases: A survey. 2015 International Conference on Computing, Communication & Automation (ICCCA). India; pp. 785–790.
- Robinson, I.; Webber, J.; Eifrem, E. Graph databases: new opportunities for connected data; O’Reilly Media, Inc., 2015.
- Desai, M.; G Mehta, R.; P Rana, D. Issues and challenges in big graph modelling for smart city: an extensive survey. International Journal of Computational Intelligence & IoT 2018, 1.
- Sun, J.; Zhang, J.; Li, Q.; Yi, X.; Liang, Y.; Zheng, Y. Predicting Citywide Crowd Flows in Irregular Regions Using Multi-View Graph Convolutional Networks. IEEE Trans. Knowl. Data Eng. 2020, 34, 2348–2359. [CrossRef]
- Liu, Y.; Ding, J.; Fu, Y.; Li, Y. UrbanKG: An Urban Knowledge Graph System. ACM Trans. Intell. Syst. Technol. 2023, 14, 1–25. [CrossRef]
- Lv, C.; Qi, M.; Liu, L.; Ma, H. T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 17197–17206.
- Liu, Y.; Ding, J.; Li, Y. KnowSite: Leveraging Urban Knowledge Graph for Site Selection. SIGSPATIAL '23: 31st ACM International Conference on Advances in Geographic Information Systems. Germany; pp. 1–12.
- Liu, J.; Li, T.; Ji, S.; Xie, P.; Du, S.; Teng, F.; Zhang, J. Urban flow pattern mining based on multi-source heterogeneous data fusion and knowledge graph embedding. IEEE Trans. Knowl. Data Eng. 2021, PP, 1–1. [CrossRef]
- Zareian, A.; Karaman, S.; Chang, S.-F. Bridging Knowledge Graphs to Generate Scene Graphs. European Conference on Computer Vision. United Kingdom; pp. 606–623.
- Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Transactions on Neural Networks 2008, 20, 61–80.
- Sun, Z.; Wang, Z.; Halilaj, L.; Luettin, J. SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction Using Knowledge Graphs. IEEE Robot. Autom. Lett. 2024, 9, 7381–7388. [CrossRef]
- He, H.; Li, R.; Ruan, S.; He, T.; Bao, J.; Li, T.; Zheng, Y. TraSS: Efficient Trajectory Similarity Search Based on Key-Value Data Stores. 2022 IEEE 38th International Conference on Data Engineering (ICDE). Malaysia; pp. 2306–2318.
- Sun, F.; Qi, J.; Chang, Y.; Fan, X.; Karunasekera, S.; Tanin, E. Urban Region Representation Learning with Attentive Fusion. 2024 IEEE 40th International Conference on Data Engineering (ICDE). Netherlands; pp. 4409–4421.
- Lim, J.-H.; Kang, W.J.; Singh, S.; Narasimhalu, D. Learning similarity matching in multimedia content-based retrieval. IEEE Trans. Knowl. Data Eng. 2001, 13, 846–850. [CrossRef]
- Chen, Y.; Sampathkumar, H.; Luo, B.; Chen, X.-W. iLike: Bridging the Semantic Gap in Vertical Image Search by Integrating Text and Visual Features. IEEE Trans. Knowl. Data Eng. 2012, 25, 2257–2270. [CrossRef]
- Qian, T.; Chen, J.; Zhuo, L.; Jiao, Y.; Jiang, Y.-G. NuScenes-QA: A Multi-Modal Visual Question Answering Benchmark for Autonomous Driving Scenario. Proc. AAAI Conf. Artif. Intell. 2024, 38, 4542–4550. [CrossRef]
- Park, S.; Lee, M.; Kang, J.; Choi, H.; Park, Y.; Cho, J.; Lee, A.; Kim, D. VLAAD: Vision and Language Assistant for Autonomous Driving. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW). United States; pp. 980–987.
- Sima, C.; Renz, K.; Chitta, K.; Chen, L.; Zhang, H.; Xie, C.; Beißwenger, J.; Luo, P.; Geiger, A.; Li, H. DriveLM: Driving with Graph Visual Question Answering. European Conference on Computer Vision. Italy; pp. 256–274.
- Ragab, M.; Gong, P.; Eldele, E.; Zhang, W.; Wu, M.; Foo, C.-S.; Zhang, D.; Li, X.; Chen, Z. Evidentially Calibrated Source-Free Time-Series Domain Adaptation With Temporal Imputation. IEEE Trans. Knowl. Data Eng. 2025, 38, 290–306. [CrossRef]
- Tang, D.; Shang, Z.; Elmore, A.J.; Krishnan, S.; Franklin, M.J. CrocodileDB in action: resource-efficient query execution by exploiting time slackness. Proceedings of the VLDB Endowment 2020, 13, 2937–2940.
- Gao, C.; Zhao, B.; Zhang, W.; Mao, J.; Zhang, J.; Zheng, Z.; Man, F.; Fang, J.; Zhou, Z.; Cui, J.; et al. EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment. arXiv preprint arXiv:2410.09604 2024.
- Li, R.; He, H.; Wang, R.; Huang, Y.; Liu, J.; Ruan, S.; He, T.; Bao, J.; Zheng, Y. JUST: JD Urban Spatio-Temporal Data Engine. 2020 IEEE 36th International Conference on Data Engineering (ICDE). United States; pp. 1558–1569.
- Guo, Y.; Wang, T.; Chen, Z.; Shao, Z. A Storage Model with Fine-Grained In-Storage Query Processing for Spatio-Temporal Data. 2025 IEEE 41st International Conference on Data Engineering (ICDE). China; pp. 669–682.
- Shi, H.; Du, S.; Yang, Y.; Zhang, J.; Li, T.; Zheng, Y. A Knowledge-Guided Pre-Training Temporal Data Analysis Foundation Model for Urban Computing. IEEE Trans. Knowl. Data Eng. 2025, 37, 6259–6271. [CrossRef]
- Chen, J.; Zhang, A. On Hierarchical Disentanglement of Interactive Behaviors for Multimodal Spatiotemporal Data with Incompleteness. KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. United States; pp. 213–225.
- Vitale, V.N.; Di Martino, S.; Peron, A.; Russo, M.; Battista, E. How to manage massive spatiotemporal dataset from stationary and non-stationary sensors in commercial DBMS? Knowl. Inf. Syst. 2023, 66, 2063–2088. [CrossRef]
- Chen, M.; Li, Z.; Huang, W.; Gong, Y.; Yin, Y. Profiling Urban Streets: A Semi-Supervised Prediction Model Based on Street View Imagery and Spatial Topology. KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Spain; pp. 319–328.
- Vasudevan, A.B.; Dai, D.; Van Gool, L. Talk2Nav: Long-Range Vision-and-Language Navigation with Dual Attention and Spatial Memory. Int. J. Comput. Vis. 2020, 129, 246–266. [CrossRef]
- Mao, Y.; Zhou, H.; Chen, L.; Qi, R.; Sun, Z.; Rong, Y.; He, X.; Chen, M.; Mumtaz, S.; Frascolla, V.; et al. A Survey on Spatio-Temporal Prediction: From Transformers to Foundation Models. ACM Comput. Surv. 2025, 58, 1–36. [CrossRef]
- Xie, P.; Ma, M.; Li, T.; Ji, S.; Du, S.; Yu, Z.; Zhang, J. Spatio-Temporal Dynamic Graph Relation Learning for Urban Metro Flow Prediction. IEEE Trans. Knowl. Data Eng. 2023, 35, 9973–9984. [CrossRef]
- Li, Z.; Xia, L.; Tang, J.; Xu, Y.; Shi, L.; Xia, L.; Yin, D.; Huang, C. UrbanGPT: Spatio-Temporal Large Language Models. KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Spain; pp. 5351–5362.
- Liu, Y.; Chen, W.; Bai, Y.; Liang, X.; Li, G.; Gao, W.; Lin, L. Aligning Cyber Space With Physical World: A Comprehensive Survey on Embodied AI. IEEE/ASME Trans. Mechatronics 2025, 30, 7253–7274. [CrossRef]
- Huang, J.; Yong, S.; Ma, X.; Linghu, X.; Li, P.; Wang, Y.; Li, Q.; Zhu, S.C.; Jia, B.; Huang, S. An embodied generalist agent in 3D world. In Proceedings of the 41st International Conference on Machine Learning, 2024, pp. 20413–20451.
- Li, S.; Tang, H. Multimodal alignment and fusion: A survey. arXiv preprint arXiv:2411.17040 2024.
- Christodoulides, A.; Tam, G.K.; Clarke, J.; Smith, R.; Horgan, J.; Micallef, N.; Morley, J.; Villamizar, N.; Walton, S. Survey on 3D Reconstruction Techniques: Large-Scale Urban City Reconstruction and Requirements. IEEE Trans. Vis. Comput. Graph. 2025, PP, 1–20. [CrossRef]
- Özyeşil, O.; Voroninski, V.; Basri, R.; Singer, A. A survey of structure from motion. Acta Numerica 2017, 26, 305–364.
- Lorensen, W.E.; Cline, H.E. Marching cubes: A high resolution 3D surface construction algorithm. In Seminal graphics: pioneering efforts that shaped the field; 1998; pp. 347–353.
- Xu, L.; Xiangli, Y.; Peng, S.; Pan, X.; Zhao, N.; Theobalt, C.; Dai, B.; Lin, D. Grid-guided Neural Radiance Fields for Large Urban Scenes. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 8296–8306.
- Zhang, Q.; Wei, Y.; Han, Z.; Fu, H.; Peng, X.; Deng, C.; Hu, Q.; Xu, C.; Wen, J.; Hu, D.; et al. Multimodal fusion on low-quality data: A comprehensive survey. arXiv preprint arXiv:2404.18947 2024.
- Wolff, K.; Kim, C.; Zimmer, H.; Schroers, C.; Botsch, M.; Sorkine-Hornung, O.; Sorkine-Hornung, A. Point Cloud Noise and Outlier Removal for Image-Based 3D Reconstruction. 2016 Fourth International Conference on 3D Vision (3DV). USA; pp. 118–127.
- Melas-Kyriazi, L.; Rupprecht, C.; Vedaldi, A. PC2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 12923–12932.
- Henderson, P.; Ferrari, V. Learning Single-Image 3D Reconstruction by Generative Modelling of Shape, Pose and Shading. Int. J. Comput. Vis. 2019, 128, 835–854. [CrossRef]
- Wu, C.; Liu, Y.; Dai, Q.; Wilburn, B. Fusing Multiview and Photometric Stereo for 3D Reconstruction under Uncalibrated Illumination. IEEE Trans. Vis. Comput. Graph. 2010, 17, 1082–1095. [CrossRef]
- Kerl, C.; Souiai, M.; Sturm, J.; Cremers, D. Towards Illumination-Invariant 3D Reconstruction Using ToF RGB-D Cameras. 2014 2nd International Conference on 3D Vision (3DV). Japan; pp. 39–46.
- Bai, K.; Zhang, L.; Chen, Z.; Wan, F.; Zhang, J. Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation. 2024 IEEE International Conference on Robotics and Automation (ICRA). Japan; pp. 17035–17041.
- Stoffregen, T.; Scheerlinck, C.; Scaramuzza, D.; Drummond, T.; Barnes, N.; Kleeman, L.; Mahony, R. Reducing the Sim-to-Real Gap for Event Cameras. European Conference on Computer Vision. United Kingdom; pp. 534–549.
- Kohler, T.; Batz, M.; Naderi, F.; Kaup, A.; Maier, A.; Riess, C. Toward Bridging the Simulated-to-Real Gap: Benchmarking Super-Resolution on Real Data. IEEE Trans. Pattern Anal. Mach. Intell. 2019, PP, 1–1. [CrossRef]
- Fink, L.; Rückert, D.; Franke, L.; Keinert, J.; Stamminger, M. LiveNVS: Neural View Synthesis on Live RGB-D Streams. SA '23: SIGGRAPH Asia 2023. Australia; pp. 1–11.
- Stier, N.; Ranjan, A.; Colburn, A.; Yan, Y.; Yang, L.; Ma, F.; Angles, B. FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction. 2023 IEEE/CVF International Conference on Computer Vision (ICCV). France; pp. 18377–18386.
- Rematas, K.; Liu, A.; Srinivasan, P.; Barron, J.; Tagliasacchi, A.; Funkhouser, T.; Ferrari, V. Urban Radiance Fields. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 12922–12932.
- Chu, T.; Zhang, P.; Liu, Q.; Wang, J. BUOL: A Bottom-Up Framework with Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 4937–4946.
- Huang, Z.; Jampani, V.; Thai, A.; Li, Y.; Stojanov, S.; Rehg, J.M. ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-Based Consistency. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 12912–12922.
- Wang, C.; Jiang, R.; Chai, M.; He, M.; Chen, D.; Liao, J. NeRF-Art: Text-Driven Neural Radiance Fields Stylization. IEEE Trans. Vis. Comput. Graph. 2023, 30, 4983–4996. [CrossRef]
- Mittal, P.; Cheng, Y.-C.; Singh, M.; Tulsiani, S. AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 306–315.
- Zhao, F.; Zhang, C.; Geng, B. Deep Multimodal Data Fusion. ACM Comput. Surv. 2024, 56, 1–36. [CrossRef]
- Meng, L.; Tan, A.-H.; Xu, D. Semi-Supervised Heterogeneous Fusion for Multimedia Data Co-Clustering. IEEE Trans. Knowl. Data Eng. 2013, 26, 2293–2306. [CrossRef]
- Muturi, T.W.; Kyem, B.A.; Asamoah, J.K.; Owor, N.J.; Dzinyela, R.; Danyo, A.; Adu-Gyamfi, Y.; Aboah, A. Prompt-guided spatial understanding with RGB-D transformers for fine-grained object relation reasoning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 5280–5288.
- Shang, Y.; Lin, Y.; Zheng, Y.; Fan, H.; Ding, J.; Feng, J.; Chen, J.; Tian, L.; Li, Y. UrbanWorld: An urban world model for 3D city generation. arXiv preprint arXiv:2407.11965 2024.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, abs/1512.03385.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015; pp. 234–241.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 652–660.
- Turki, H.; Zhang, J.Y.; Ferroni, F.; Ramanan, D. SUDS: Scalable Urban Dynamic Scenes. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Canada; pp. 12375–12385.
- Zheng, Z.; Zhou, M.; Shang, Z.; Wei, X.; Pu, H.; Luo, J.; Jia, W. GAANet: Graph Aggregation Alignment Feature Fusion for Multispectral Object Detection. IEEE Trans. Ind. Informatics 2025, 21, 8282–8292. [CrossRef]
- Lin, J.; Li, Z.; Tang, X.; Liu, J.; Liu, S.; Liu, J.; Lu, Y.; Wu, X.; Xu, S.; Yan, Y.; et al. VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 5166–5175.
- Liu, Y.; Luo, C.; Fan, L.; Wang, N.; Peng, J.; Zhang, Z. CityGaussian: Real-Time High-Quality Large-Scale Scene Rendering with Gaussians. European Conference on Computer Vision. Italy; pp. 265–282.
- Jiang, L.; Ren, K.; Yu, M.; Xu, L.; Dong, J.; Lu, T.; Zhao, F.; Lin, D.; Dai, B. Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 26789–26799.
- Vuong, K.; Ghosh, A.; Ramanan, D.; Narasimhan, S.; Tulsiani, S. AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 21674–21684.
- Huang, J.; Stoter, J.; Peters, R.; Nan, L. City3D: Large-Scale Building Reconstruction from Airborne LiDAR Point Clouds. Remote. Sens. 2022, 14, 2254. [CrossRef]
- Zhang, C.; Cao, Y.; Zhang, L. CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction. arXiv preprint arXiv:2501.01695 2025.
- Feng, J.; Liu, T.; Du, Y.; Guo, S.; Lin, Y.; Li, Y. CityGPT: Empowering Urban Spatial Cognition of Large Language Models. KDD '25: The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Canada; pp. 591–602.
- Li, Y.; Pan, Y.; Zhu, G.; He, S.; Xu, M.; Xu, J. Charging-Aware Task Assignment for Urban Logistics with Electric Vehicles. IEEE Trans. Knowl. Data Eng. 2025, PP, 1–14. [CrossRef]
- Lee, L.-H.; Braud, T.; Hosio, S.; Hui, P. Towards Augmented Reality Driven Human-City Interaction: Current Research on Mobile Headsets and Future Challenges. ACM Comput. Surv. 2021, 54, 1–38. [CrossRef]
- Ding, X.; Han, J.; Xu, H.; Liang, X.; Zhang, W.; Li, X. Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 13668–13677.
- Wu, D.; Han, W.; Liu, Y.; Wang, T.; Xu, C.-Z.; Zhang, X.; Shen, J. Language Prompt for Autonomous Driving. Proc. AAAI Conf. Artif. Intell. 2025, 39, 8359–8367. [CrossRef]
- Wang, J.; Zheng, Z.; Chen, Z.; Ma, A.; Zhong, Y. EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering. Proc. AAAI Conf. Artif. Intell. 2024, 38, 5481–5489. [CrossRef]
- Feng, J.; Zhang, J.; Liu, T.; Zhang, X.; Ouyang, T.; Yan, J.; Du, Y.; Guo, S.; Li, Y. CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks. KDD '25: The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Canada; pp. 5413–5424.
- Bieri, V.; Zamboni, M.; Blumer, N.S.; Chen, Q.; Engelmann, F. OpenCity3D: What do Vision-Language Models Know About Urban Environments? 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). United States; pp. 5147–5155.
- Yasuki, S.; Miyanishi, T.; Inoue, N.; Kurita, S.; Sakamoto, K.; Azuma, D.; Taki, M.; Matsuo, Y. GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields. arXiv preprint arXiv:2506.23352 2025.
- Wang, J.; Ma, A.; Chen, Z.; Zheng, Z.; Wan, Y.; Zhang, L.; Zhong, Y. EarthVQANet: Multi-task visual question answering for remote sensing image understanding. ISPRS J. Photogramm. Remote. Sens. 2024, 212, 422–439. [CrossRef]
- Zhao, Y.; Xu, K.; Zhu, Z.; Hu, Y.; Zheng, Z.; Chen, Y.; Ji, Y.; Gao, C.; Li, Y.; Huang, J. CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. China; pp. 12476–12491.
- Gu, J.; Stefani, E.; Wu, Q.; Thomason, J.; Wang, X. Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Ireland; pp. 7606–7623.
- Schumann, R.; Riezler, S. Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Online; pp. 489–502.
- Li, J.; Padmakumar, A.; Sukhatme, G.; Bansal, M. VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation. Proc. AAAI Conf. Artif. Intell. 2024, 38, 18517–18526. [CrossRef]
- Xu, Y.; Pan, Y.; Liu, Z.; Wang, H. FLAME: Learning to Navigate with Multimodal LLM in Urban Environments. Proc. AAAI Conf. Artif. Intell. 2025, 39, 9005–9013. [CrossRef]
- Schumann, R.; Zhu, W.; Feng, W.; Fu, T.-J.; Riezler, S.; Wang, W.Y. VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View. Proc. AAAI Conf. Artif. Intell. 2024, 38, 18924–18933. [CrossRef]
- Xiang, J.; Wang, X.; Wang, W.Y. Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation. Findings of the Association for Computational Linguistics: EMNLP 2020. Online; pp. 699–707.
- Zhu, W.; Wang, X.; Fu, T.-J.; Yan, A.; Narayana, P.; Sone, K.; Basu, S.; Wang, W.Y. Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online; pp. 1207–1221.
- Wang, X.; Yang, D.; Wang, Z.; Kwan, H.; Chen, J.; Wu, W.; Li, H.; Liao, Y.; Liu, S. Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology. In Proceedings of the Thirteenth International Conference on Learning Representations.
- Lee, J.; Miyanishi, T.; Kurita, S.; Sakamoto, K.; Azuma, D.; Matsuo, Y.; Inoue, N. CityNav: A Large-Scale Dataset for Real-World Aerial Navigation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 5912–5922.
- Sautenkov, O.; Yaqoot, Y.; Lykov, A.; Mustafa, M.A.; Tadevosyan, G.; Akhmetkazy, A.; Cabrera, M.A.; Martynov, M.; Karaf, S.; Tsetserukou, D. UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation. 2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI). Australia; pp. 1588–1592.
- Fan, Y.; Chen, W.; Jiang, T.; Zhou, C.; Zhang, Y.; Wang, X. Aerial Vision-and-Dialog Navigation. In Findings of the Association for Computational Linguistics: ACL 2023, 2023, pp. 3043–3061.
- Tran, K.T.; Dao, D.; Nguyen, M.D.; Pham, Q.V.; O’Sullivan, B.; Nguyen, H.D. Multi-agent collaboration mechanisms: A survey of LLMs. arXiv preprint arXiv:2501.06322 2025.
- Feng, X.; Chen, Z.-Y.; Qin, Y.; Lin, Y.; Chen, X.; Liu, Z.; Wen, J.-R. Large Language Model-based Human-Agent Collaboration for Complex Task Solving. Findings of the Association for Computational Linguistics: EMNLP 2024. United States; pp. 1336–1357.
- Zou, H.P.; Huang, W.C.; Wu, Y.; Miao, C.; Li, D.; Liu, A.; Zhou, Y.; Chen, Y.; Zhang, W.; Li, Y.; et al. A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy. arXiv preprint arXiv:2506.09420 2025.
- Fu, J.; Han, H.; Su, X.; Fan, C. Towards human-AI collaborative urban science research enabled by pre-trained large language models. Urban Informatics 2024, 3, 1–15. [CrossRef]
- Han, J.; Ning, Y.; Yuan, Z.; Ni, H.; Liu, F.; Lyu, T.; Liu, H. Large Language Model Powered Intelligent Urban Agents: Concepts, Capabilities, and Applications. arXiv preprint arXiv:2507.00914 2025.
- Wu, W.; He, H.; Zhang, C.; He, J.; Zhao, S.Z.; Gong, R.; Li, Q.; Zhou, B. Towards Autonomous Micromobility through Scalable Urban Simulation. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). United States; pp. 27553–27563.
- Wu, W.; He, H.; He, J.; Wang, Y.; Duan, C.; Liu, Z.; Li, Q.; Zhou, B. MetaUrban: An embodied AI simulation platform for urban micromobility. arXiv preprint arXiv:2407.08725 2024.
- Zheng, Y.; Lin, Y.; Zhao, L.; Wu, T.; Jin, D.; Li, Y. Spatial planning of urban communities via deep reinforcement learning. Nat. Comput. Sci. 2023, 3, 748–762. [CrossRef]
- Ali, M.I.; Gao, F.; Mileo, A. CityBench: A configurable benchmark to evaluate RSP engines using smart city datasets. In Proceedings of the International Semantic Web Conference. Springer, 2015, pp. 374–389.
- Romeu-Guallart, P.; Zamora-Martinez, F. SML2010. UCI Machine Learning Repository 2014.
- Xu, H.; Yuan, J.; Zhou, A.; Xu, G.; Li, W.; Ban, X.; Ye, X. GenAI-powered multi-agent paradigm for smart urban mobility: Opportunities and challenges for integrating large language models (LLMs) and retrieval-augmented generation (RAG) with intelligent transportation systems. arXiv preprint arXiv:2409.00494 2024.
- Li, A.; Wang, Z.; Zhang, J.; Li, M.; Qi, Y.; Chen, Z.; Zhang, Z.; Wang, H. UrbanVLA: A Vision-Language-Action Model for Urban Micromobility. arXiv preprint arXiv:2510.23576 2025.
- Zhang, Z.; Chen, M.; Zhu, S.; Han, T.; Yu, Z. MMCNav: MLLM-empowered Multi-agent Collaboration for Outdoor Visual Language Navigation. ICMR '25: International Conference on Multimedia Retrieval. United States; pp. 1767–1776.
- Chen, W.; Yu, X.; Shang, L.; Xi, J.; Jin, B.; Zhao, S. Urban Emergency Rescue Based on Multi-Agent Collaborative Learning: Coordination Between Fire Engines and Traffic Lights. arXiv preprint arXiv:2502.16131 2025.
- Wang, X.; Yang, D.; Liao, Y.; Zheng, W.; Dai, B.; Li, H.; Liu, S.; et al. UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning. arXiv preprint arXiv:2505.15725 2025.
- Jiang, K.; Cai, X.; Cui, Z.; Li, A.; Ren, Y.; Yu, H.; Yang, H.; Fu, D.; Wen, L.; Cai, P. KoMA: Knowledge-Driven Multi-Agent Framework for Autonomous Driving With Large Language Models. IEEE Trans. Intell. Veh. 2024, 10, 4655–4668. [CrossRef]
- Zheng, Y.; Xu, F.; Lin, Y.; Santi, P.; Ratti, C.; Wang, Q.R.; Li, Y. Urban planning in the era of large language models. Nat. Comput. Sci. 2025, pp. 1–10.
- Gao, C.; Lan, X.; Li, N.; Yuan, Y.; Ding, J.; Zhou, Z.; Xu, F.; Li, Y. Large language models empowered agent-based modeling and simulation: a survey and perspectives. Humanit. Soc. Sci. Commun. 2024, 11, 1–24. [CrossRef]
- Lin, Z.; Gao, K.; Wu, N.; Suganthan, P.N. Scheduling Eight-Phase Urban Traffic Light Problems via Ensemble Meta-Heuristics and Q-Learning Based Local Search. IEEE Trans. Intell. Transp. Syst. 2023, 24, 14415–14426. [CrossRef]
- Ouyang, K.; Liang, Y.; Liu, Y.; Tong, Z.; Ruan, S.; Zheng, Y.; Rosenblum, D.S. Fine-Grained Urban Flow Inference. IEEE Trans. Knowl. Data Eng. 2020, 34, 2755–2770. [CrossRef]
- Mouratidis, K. Time to challenge the 15-minute city: Seven pitfalls for sustainability, equity, livability, and spatial analysis. Cities 2024, 153. [CrossRef]
- Hamissi, A.; Dhraief, A. A Survey on the Unmanned Aircraft System Traffic Management. ACM Comput. Surv. 2023, 56, 1–37. [CrossRef]
- Ahmed, A.; Outay, F.; Farooq, M.U.; Saeed, S.; Adnan, M.; Ismail, M.A.; Qadir, A. Real-time road occupancy and traffic measurements using unmanned aerial vehicle and fundamental traffic flow diagrams. Pers. Ubiquitous Comput. 2023, 27, 1669–1680. [CrossRef]
- Yu, X.; Wang, J.; Yang, Y.; Huang, Q.; Qu, K. BIGCity: A Universal Spatiotemporal Model for Unified Trajectory and Traffic State Data Analysis. 2025 IEEE 41st International Conference on Data Engineering (ICDE). China; pp. 4455–4469.
- Liu, A.; Zhang, Y. CrossST: An Efficient Pre-Training Framework for Cross-District Pattern Generalization in Urban Spatio-Temporal Forecasting. 2025 IEEE 41st International Conference on Data Engineering (ICDE). China; pp. 2935–2948.
- Perera, A.T.D.; Javanroodi, K.; Mauree, D.; Nik, V.M.; Florio, P.; Hong, T.; Chen, D. Challenges resulting from urban density and climate change for the EU energy transition. Nat. Energy 2023, 8, 397–412. [CrossRef]
- Jin, X.; Zhang, C.; Xiao, F.; Li, A.; Miller, C. A review and reflection on open datasets of city-level building energy use and their applications. Energy Build. 2023, 285. [CrossRef]
- Wang, L.; Shao, J.; Ma, Y. Does China's low-carbon city pilot policy improve energy efficiency? Energy 2023, 283. [CrossRef]
- Lindahl, J.; Johansson, R.; Lingfors, D. Mapping of decentralised photovoltaic and solar thermal systems by remote sensing aerial imagery and deep machine learning for statistic generation. Energy AI 2023, 14. [CrossRef]
- Gasparyan, H.A.; Davtyan, T.A.; Agaian, S.S. A Novel Framework for Solar Panel Segmentation From Remote Sensing Images: Utilizing Chebyshev Transformer and Hyperspectral Decomposition. IEEE Trans. Geosci. Remote. Sens. 2024, 62, 1–11. [CrossRef]
- Lodhi, M.K.; Tan, Y.; Wang, X.; Masum, S.M.; Khan, M.; Ullah, N. Harnessing rooftop solar photovoltaic potential in Islamabad, Pakistan: A remote sensing and deep learning approach. Energy 2024. [CrossRef]
- Golestani, Z.; Borna, R.; Khaliji, M.A.; Mohammadi, H.; Ghalehteimouri, K.J.; Asadian, F. Impact of Urban Expansion on the Formation of Urban Heat Islands in Isfahan, Iran: A Satellite Base Analysis (1990–2019). J. Geovisualization Spat. Anal. 2024, 8, 1–15. [CrossRef]
- Fan, X.; Ji, T.; Jiang, C.; Li, S.; Jin, S.; Song, S.; Wang, J.; Hong, B.; Chen, L.; Zheng, G.; et al. MouSi: Poly-visual-expert vision-language models. arXiv preprint arXiv:2401.17221 2024.
- Elgendy, H.; Sharshar, A.; Aboeitta, A.; Ashraf, Y.; Guizani, M. GeoLLaVA: Efficient fine-tuned vision-language models for temporal change detection in remote sensing. arXiv preprint arXiv:2410.19552 2024.
- Zhuo, L.; Zhang, E.; Shuo, P.; Sichun, L.; Ying, L.; Witlox, F. Assessing urban emergency medical services accessibility for older adults considering ambulance trafficability using a deep learning approach. Sustainable Cities and Society 2025, p. 106804.
- Li, J.; Wang, S.; Zhang, J.; Miao, H.; Zhang, J.; Yu, P. Fine-grained Urban Flow Inference with Incomplete Data. IEEE Trans. Knowl. Data Eng. 2022, PP, 1–1. [CrossRef]
- Yang, M.; Li, X.; Xu, B.; Nie, X.; Zhao, M.; Zhang, C.; Zheng, Y.; Gong, Y. STDA: Spatio-Temporal Deviation Alignment Learning for Cross-City Fine-Grained Urban Flow Inference. IEEE Trans. Knowl. Data Eng. 2025, 37, 4833–4845. [CrossRef]
- Kennedy, J. Swarm intelligence. In Handbook of nature-inspired and innovative computing: integrating classical models with emerging technologies; Springer, 2006; pp. 187–219.
- Han, X.; Zhu, C.; Zhu, H.; Zhao, X. Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework. KDD '25: The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Canada; pp. 814–825.
- Gao, C.; Xu, F.; Chen, X.; Wang, X.; He, X.; Li, Y. Simulating Human Society with Large Language Model Agents: City, Social Media, and Economic System. WWW '24: The ACM Web Conference 2024. Singapore; pp. 1290–1293.
- Liu, Y.; Zhang, X.; Ding, J.; Xi, Y.; Li, Y. Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction. WWW '23: The ACM Web Conference 2023. United States; pp. 4150–4160.
- Akhtar, Z.; Qazi, U.; Sadiq, R.; El-Sakka, A.; Sajjad, M.; Ofli, F.; Imran, M. Mapping Flood Exposure, Damage, and Population Needs Using Remote and Social Sensing: A Case Study of 2022 Pakistan Floods. WWW '23: The ACM Web Conference 2023. United States; pp. 4120–4128.
- Yan, A.; Howe, B. Fairness-Aware Demand Prediction for New Mobility. Proc. AAAI Conf. Artif. Intell. 2020, 34, 1079–1087. [CrossRef]
- Wang, G.; Zhang, Y.; Fang, Z.; Wang, S.; Zhang, F.; Zhang, D. FairCharge: A data-driven fairness-aware charging recommendation system for large-scale electric taxi fleets. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2020, 4, 1–25.
- Rong, C.; Feng, J.; Ding, J. GODDAG: Generating Origin-Destination Flow for New Cities Via Domain Adversarial Training. IEEE Trans. Knowl. Data Eng. 2023, 35, 10048–10057. [CrossRef]
- Zhou, Q.; Wu, J.; Zhu, M.; Zhou, Y.; Xiao, F.; Zhang, Y. LLM-QL: A LLM-Enhanced Q-Learning Approach for Scheduling Multiple Parallel Drones. IEEE Trans. Knowl. Data Eng. 2025, 37, 5393–5406. [CrossRef]






| Survey | Year | Venue | City Platform | Multi-domain Data | Embodied Agent | Data Life-cycle | Primary Perspective |
|---|---|---|---|---|---|---|---|
| Xu et al. [6] | 2023 | arXiv | ✓ | Model-centric | |||
| Jin et al. [9] | 2023 | IEEE TKDE | ✓ | Model-centric | |||
| Yang et al. [3] | 2024 | IEEE TKDE | ✓ | ✓ | Model-centric | ||
| Zhang et al. [10] | 2024 | ACM KDD | ✓ | ✓ | Model-centric | ||
| Cengiz et al. [4] | 2025 | Information Fusion | ✓ | ✓ | Model-centric | ||
| Lu et al. [11] | 2025 | arXiv | ✓ | ✓ | Data-centric | ||
| Liang et al. [12] | 2025 | ACM KDD | ✓ | ✓ | Model-centric | ||
| Zou et al. [13] | 2025 | Information Fusion | ✓ | ✓ | Model-centric | ||
| Song et al. [14] | 2025 | IEEE TKDE | ✓ | Model-centric | |||
| Our Work | - | - | ✓ | ✓ | ✓ | ✓ | Data-centric |

| Architecture | Core Abstraction | Capability: Heterogeneity | Capability: Relational Semantics | Capability: Semantic Search | Capability: Temporal Analysis | Performance: Read Latency | Performance: Write Throughput | Performance: Scalability | Typical Systems |
|---|---|---|---|---|---|---|---|---|---|
| Data Lakes | Raw Files | ✓ | ✗ | ✗ | ✗ | High | High | High | Lambda Arch. [65], Kappa Arch. [66], Lakehouse [67] |
| Multi-model DBs | Unified Model | ✓ | ✓ | ✓ | ✓ | Variable | Medium | Medium | Sinew [68], NoAM [69], UniBench [70] |
| Graph DBs | Nodes & Edges | ✗ | ✓ | ✗ | ✗ | Low | Low | High | Neo4j [71], JanusGraph [72], nSKG [73], Sg-CityU [74] |
| Vector DBs | High-dim Vectors | ✗ | ✗ | ✓ | ✗ | Low | Low | High | FAISS [75], Milvus [76], HNSW [77], PQ [78] |
| Time-Series DBs | Time/Value Pairs | ✗ | ✗ | ✗ | ✓ | Low | High | High | Gorilla [79], Apache IoTDB [80], TimescaleDB [81] |
| Spatio-Temporal DBs | Time/Spatial Info | ✗ | ✓ | ✗ | ✓ | Variable | Medium | High | MobilityDB [82], PostGIS [83], TrajMesa [84], TMan [85] |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
