Submitted: 28 May 2026
Posted: 29 May 2026
Abstract
Keywords:
1. Introduction
1.1. Digital Twins for Simulation and Decision Support
1.2. Generative AI and Scenario Generation
1.3. Conversational Interfaces for Complex Systems
1.4. Research Gap and Motivation
1.5. Contributions
1. Conceptual framework: We introduce the Conversational Digital Twin Framework (hereinafter referred to as CDTF), a layered architecture that systematically integrates data ingestion, simulation engines, conversational AI layers, and visualization components for operational planning applications.
2. Technical implementation: We present a Technology Readiness Level 4 (TRL-4) prototype that validates the framework through a real-world case study: operational planning for large-scale sporting events in tourist destinations. The prototype demonstrates natural language-based configuration of simulation parameters using a state-of-the-art LLM (Gemini 2.5 Flash Lite).
3. Validation methodology: We define and execute a comprehensive validation protocol that evaluates the framework across multiple dimensions: conversational interaction quality, configuration accuracy, simulation performance, and end-to-end system latency.
4. Generalizability analysis: We discuss the applicability of the framework beyond event planning, highlighting its potential for other domains requiring simulation-based decision support under uncertainty (e.g., logistics, industrial management, urban planning).
1.6. Paper Organization
2. The Conversational Digital Twin Framework
2.1. Conceptual Architecture
1. Data layer: Responsible for data acquisition, integration, and preprocessing from heterogeneous sources including historical databases, real-time APIs, and external data providers.
2. Simulation engine (digital twin core): Encapsulates the computational models that represent the physical system’s behavior, including discrete-event simulation components and machine learning models for prediction. In this proof of concept, the simulation pipeline was adapted to the specific operational planning requirements of football events at the Gran Canaria Stadium. However, the architecture and simulation workflow were designed with modularity and extensibility in mind, enabling their adaptation to different operational domains and event-planning use cases beyond the stadium context.
3. Conversational AI layer: Provides a natural language interface powered by generative AI models, enabling users to configure simulations, query system state, and interpret results through dialogue.
4. Visualization layer: Presents simulation outputs through interactive dashboards, reports, and graphical representations tailored to different stakeholder needs.
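As a rough illustration of how the four layers listed above could hand data to one another, the following sketch chains them as plain Python functions. Every function name, field name, and numeric value here is a hypothetical placeholder chosen for the example; it is not the prototype's code, and the keyword check merely stands in for the LLM call.

```python
from dataclasses import dataclass


@dataclass
class SimulationResult:
    simulation_id: int
    predictions: dict


def data_layer(event_date: str) -> dict:
    """Data layer stand-in: return preprocessed context for a scenario."""
    # Hypothetical historical average; a real layer would query databases or APIs.
    return {"event_date": event_date, "avg_attendance": 20000}


def conversational_layer(user_text: str, context: dict) -> dict:
    """Conversational layer stand-in: map a user request to a configuration."""
    # A real implementation would call an LLM; a keyword check is used here.
    config = {"attendance": context["avg_attendance"], "weather": "sunny"}
    if "rain" in user_text.lower():
        config["weather"] = "rain"
    return config


def simulation_engine(simulation_id: int, config: dict) -> SimulationResult:
    """Simulation engine stand-in: produce predictions from a configuration."""
    # Hypothetical rule: rain lowers beverage demand per spectator.
    per_capita = 0.5 if config["weather"] == "rain" else 0.8
    beverages = round(config["attendance"] * per_capita)
    return SimulationResult(simulation_id, {"beverages": beverages})


def visualization_layer(result: SimulationResult) -> str:
    """Visualization layer stand-in: render a one-line textual report."""
    return f"Simulation {result.simulation_id}: {result.predictions}"


def run_pipeline(user_text: str, event_date: str, simulation_id: int) -> str:
    """Pass a request through the four layers in order."""
    context = data_layer(event_date)
    config = conversational_layer(user_text, context)
    result = simulation_engine(simulation_id, config)
    return visualization_layer(result)


print(run_pipeline("Simulate a rainy match", "2026-03-14", 1))
```

Because each layer only consumes the previous layer's output, any of the four stand-ins could in principle be swapped without touching the others, which is the modularity property the architecture aims for.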
2.2. Data Layer
2.2.1. Data Sources and Integration
- Historical data: Time-series records of past system behavior stored in relational databases, including operational metrics, resource consumption patterns, and contextual variables (e.g., weather conditions, attendance records).
- Real-time data: Current system state obtained through IoT sensors, monitoring systems, or operational databases that reflect live conditions.
- External contextual data: Information from third-party APIs that provide environmental context, such as weather forecasts, social media trends, public transportation schedules, or ticketing systems.
2.2.2. Data Versioning and Simulation Identification
- Traceability: Each simulation run can be reconstructed by retrieving all data associated with its identifier.
- Comparison: Different scenarios can be evaluated against the baseline or against each other.
- Reproducibility: Simulation results can be validated by re-running configurations with identical parameters.
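The three properties above can be illustrated with a small in-memory registry keyed by simulation identifier. This is a minimal sketch under stated assumptions: the class name `SimulationRegistry`, its methods, and the example parameters are hypothetical, and a dictionary stands in for the database that a real deployment would use.

```python
import copy
import itertools


class SimulationRegistry:
    """In-memory stand-in for a database table keyed by simulation identifier."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._runs = {}

    def register(self, config: dict) -> int:
        """Store a private copy of the configuration and return its identifier."""
        simulation_id = next(self._next_id)
        self._runs[simulation_id] = copy.deepcopy(config)
        return simulation_id

    def get(self, simulation_id: int) -> dict:
        """Traceability and reproducibility: return the exact stored configuration."""
        return copy.deepcopy(self._runs[simulation_id])

    def diff(self, id_a: int, id_b: int) -> dict:
        """Comparison: list parameters whose values differ between two runs."""
        a, b = self._runs[id_a], self._runs[id_b]
        return {
            key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b))
            if a.get(key) != b.get(key)
        }


registry = SimulationRegistry()
baseline_id = registry.register({"attendance": 20000, "weather": "sunny"})
scenario_id = registry.register({"attendance": 22000, "weather": "rain"})
print(registry.diff(baseline_id, scenario_id))
```

Storing a deep copy means later edits to the caller's dictionary cannot silently alter a registered run, so re-running `registry.get(simulation_id)` always yields the parameters that were originally used.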
2.3. Simulation Engine (Digital Twin Core)
2.3.1. Model-Agnostic Architecture
- Input schema: Expected format for configuration parameters and input features.
- Execution protocol: Method signatures for training, prediction, and state updates.
- Output schema: Structured format for returning predictions and diagnostic information.
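One way to read the three interface elements above is as an abstract base class plus two small data containers. The sketch below is an assumption about how such a contract might look; the names `ModelInput`, `ModelOutput`, `SimulationModel`, and the toy `MeanAttendanceModel` are invented for illustration and do not come from the prototype.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ModelInput:
    """Input schema: configuration parameters and input features."""
    features: dict


@dataclass
class ModelOutput:
    """Output schema: predictions plus diagnostic information."""
    predictions: dict
    diagnostics: dict = field(default_factory=dict)


class SimulationModel(ABC):
    """Execution protocol that every pluggable model is expected to follow."""

    @abstractmethod
    def train(self, history: list) -> None: ...

    @abstractmethod
    def predict(self, model_input: ModelInput) -> ModelOutput: ...

    @abstractmethod
    def update_state(self, observation: ModelInput) -> None: ...


class MeanAttendanceModel(SimulationModel):
    """Toy model: predicts the mean attendance observed so far."""

    def __init__(self) -> None:
        self._values = []

    def train(self, history: list) -> None:
        self._values = [item.features["attendance"] for item in history]

    def predict(self, model_input: ModelInput) -> ModelOutput:
        mean = sum(self._values) / len(self._values)
        return ModelOutput({"attendance": mean}, {"n_samples": len(self._values)})

    def update_state(self, observation: ModelInput) -> None:
        self._values.append(observation.features["attendance"])


model = MeanAttendanceModel()
model.train([ModelInput({"attendance": 10000}), ModelInput({"attendance": 20000})])
print(model.predict(ModelInput({})).predictions)
```

Any class that implements the same three methods can replace the toy model, so the calling code depends only on the contract and not on a particular algorithm.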
2.3.2. Machine Learning Models for Prediction
1. Feature engineering: Transform raw input variables (date, weather, event characteristics) into model-ready features.
2. Model loading: Retrieve pre-trained models from persistent storage, avoiding retraining overhead during simulation.
3. Prediction generation: Apply models to the configured scenario parameters to generate forecasts.
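The three steps above can be sketched end to end with the standard library alone. To keep the example self-contained, a linear model stored as JSON stands in for the pre-trained models; this is a deliberate simplification, and the feature names, weights, and file name are hypothetical.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def engineer_features(event_date: date, weather: str, attendance: int) -> dict:
    """Step 1: turn raw scenario variables into numeric, model-ready features."""
    return {
        "is_weekend": 1.0 if event_date.weekday() >= 5 else 0.0,
        "is_rain": 1.0 if weather == "rain" else 0.0,
        "attendance_k": attendance / 1000.0,
    }


def load_model(path: Path) -> dict:
    """Step 2: load previously fitted parameters instead of retraining."""
    return json.loads(path.read_text())


def predict(model: dict, features: dict) -> float:
    """Step 3: apply the loaded model to the scenario features."""
    return model["intercept"] + sum(
        model["weights"][name] * value for name, value in features.items()
    )


# Hypothetical "pre-trained" parameters, persisted to disk for the demo only.
stored = {
    "intercept": 100.0,
    "weights": {"is_weekend": 50.0, "is_rain": -30.0, "attendance_k": 10.0},
}
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "beverage_model.json"
    model_path.write_text(json.dumps(stored))
    model = load_model(model_path)

features = engineer_features(date(2026, 3, 14), "rain", 20000)
print(predict(model, features))
```

Separating loading from prediction means the model is read once and can then be applied to many scenario configurations without any training cost at simulation time.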
2.4. Conversational AI Layer
2.4.1. Large Language Model Integration
- Instruction following: Ability to understand and execute complex instructions that require multiple actions.
- Contextual understanding: Capacity to maintain multi-turn dialogues.
- Structured output generation: Support for generating formatted data (e.g., YAML) alongside natural language.
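To make the structured output capability more concrete, the sketch below separates a model reply into its natural-language part and an embedded YAML fragment. The reply text is invented for the example, the parameter names follow the sample dialogue reported in the validation tables, and the parser handles only flat `key: value` lines; a real system would presumably rely on a full YAML library and schema validation.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable

# Hypothetical model reply mixing prose with a fenced YAML fragment.
reply = (
    "I have prepared the configuration you asked for.\n"
    f"{FENCE}yaml\n"
    "attendance: 12300\n"
    "risk_level: high\n"
    "weather: sunny\n"
    f"{FENCE}\n"
    "Shall I run the simulation?"
)


def split_reply(text: str) -> tuple[str, dict]:
    """Separate the natural-language part from the structured part."""
    pattern = re.compile(FENCE + r"yaml\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    if match is None:
        return text.strip(), {}
    config = {}
    # Minimal parser for flat "key: value" lines only (not general YAML).
    for line in match.group(1).splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            value = value.strip()
            config[key.strip()] = int(value) if value.isdigit() else value
    # Remove the fenced block and collapse the remaining whitespace.
    prose = " ".join(pattern.sub("", text).split())
    return prose, config


prose, config = split_reply(reply)
print(prose)
print(config)
```

Keeping the machine-readable fragment in a clearly delimited block lets the system forward the configuration to the simulation engine while showing only the prose to the user.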
2.4.2. Natural Language Understanding Pipeline
2.5. System Integration Architecture
2.5.1. Configuration File Management
- Event metadata: Date, time, location, event type.
- Attendance projections: Expected visitor counts, demographic distributions.
- Environmental conditions: Weather forecasts, external events.
- Operational parameters: Resource availability, staff schedules, pricing policies.
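The four parameter groups above map naturally onto nested data classes. The following sketch assumes one class per group and shows how user-provided values could override defaults while untouched fields keep theirs; all field names and default values are hypothetical and are not taken from the configuration file in Appendix A.2.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class EventMetadata:
    date: str = "2026-03-14"
    time: str = "20:00"
    location: str = "Gran Canaria Stadium"
    event_type: str = "football"


@dataclass
class AttendanceProjection:
    expected_visitors: int = 20000
    tourist_share: float = 0.15


@dataclass
class EnvironmentalConditions:
    weather: str = "sunny"
    external_events: list = field(default_factory=list)


@dataclass
class OperationalParameters:
    staff_on_duty: int = 120
    beverage_price_eur: float = 3.0


@dataclass
class SimulationConfig:
    event: EventMetadata = field(default_factory=EventMetadata)
    attendance: AttendanceProjection = field(default_factory=AttendanceProjection)
    environment: EnvironmentalConditions = field(default_factory=EnvironmentalConditions)
    operations: OperationalParameters = field(default_factory=OperationalParameters)


def apply_overrides(config: SimulationConfig, overrides: dict) -> SimulationConfig:
    """Set user-provided values; fields that are not mentioned keep their defaults."""
    for dotted_key, value in overrides.items():
        section_name, field_name = dotted_key.split(".")
        section = getattr(config, section_name)
        if not hasattr(section, field_name):
            raise KeyError(f"Unknown parameter: {dotted_key}")
        setattr(section, field_name, value)
    return config


config = apply_overrides(
    SimulationConfig(),
    {"attendance.expected_visitors": 32400, "environment.weather": "rain"},
)
print(asdict(config)["attendance"])
```

The nested dictionary returned by `asdict` could then be serialized to a configuration file, so the conversational layer only needs to produce the overrides rather than a complete file.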
2.5.2. RESTful API Communication
2.6. Visualization Layer
2.6.1. Multi-Modal Result Presentation and Dashboard Design
2.7. Technology Readiness Level 4 (TRL-4): Implementation Details
2.7.1. Validation Mechanisms
- Input validation: Pydantic schemas enforce type checking, range constraints, and structural requirements on all API inputs, preventing injection attacks and malformed requests [34].
- Database access control: PostgreSQL roles and permissions restrict write access to authorized services, with read-only credentials used for visualization layers.
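The kind of type, range, and structural checking described for input validation can be sketched as follows. To stay dependency-free, this example uses a standard-library data class as a stand-in for a Pydantic schema, so it illustrates the checks rather than the Pydantic API itself; the field names, allowed values, and the `MAX_CAPACITY` bound are assumptions made for the example.

```python
from dataclasses import dataclass

ALLOWED_WEATHER = {"sunny", "cloudy", "rain"}
ALLOWED_RISK = {"low", "medium", "high"}
MAX_CAPACITY = 32400  # hypothetical upper bound on attendance


class RequestValidationError(ValueError):
    """Raised when a request violates the schema (hypothetical error type)."""


@dataclass(frozen=True)
class ScenarioRequest:
    attendance: int
    weather: str
    risk_level: str = "low"

    def __post_init__(self) -> None:
        errors = []
        # Type check (bool is excluded because it is a subclass of int).
        if isinstance(self.attendance, bool) or not isinstance(self.attendance, int):
            errors.append("attendance must be an integer")
        # Range constraint, evaluated only when the type is correct.
        elif not 0 <= self.attendance <= MAX_CAPACITY:
            errors.append(f"attendance must be between 0 and {MAX_CAPACITY}")
        # Enumerated values.
        if not isinstance(self.weather, str) or self.weather not in ALLOWED_WEATHER:
            errors.append("weather must be one of " + ", ".join(sorted(ALLOWED_WEATHER)))
        if not isinstance(self.risk_level, str) or self.risk_level not in ALLOWED_RISK:
            errors.append("risk_level must be one of " + ", ".join(sorted(ALLOWED_RISK)))
        if errors:
            raise RequestValidationError("; ".join(errors))


def parse_request(payload: dict) -> ScenarioRequest:
    """Structural check: reject unknown fields and missing required fields."""
    unknown = set(payload) - {"attendance", "weather", "risk_level"}
    if unknown:
        raise RequestValidationError(f"unexpected fields: {sorted(unknown)}")
    try:
        return ScenarioRequest(**payload)
    except TypeError as exc:  # raised when a required field is missing
        raise RequestValidationError(str(exc)) from exc


print(parse_request({"attendance": 12300, "weather": "sunny", "risk_level": "high"}))
```

Rejecting malformed requests before they reach the simulation engine keeps invalid values out of stored configurations and gives the conversational agent a precise message to relay to the user.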
2.7.2. Framework Extensibility and Generalization
- Data source integration: New ETL connectors can be added to the data layer without modifying downstream components, enabling connection to domain-specific databases or APIs.
- Model diversity: The Simulation Engine’s standardized interface allows substitution of Random Forest models with alternative algorithms (e.g., neural networks, agent-based models) appropriate for different prediction tasks.
- Conversational capabilities: The prompt engineering approach can be customized for specific domains by updating system instructions and few-shot examples, without changing the underlying LLM integration.
- Visualization templates: Power BI dashboards can be adapted with domain-specific metrics and visual designs.
3. Validation Case Study: Sporting Event Simulation
3.1. Case Study Context
3.2. Conversational Agent Validation
3.3. Scope of the Validation
3.4. Simulation Outputs and Predictions
- Attendance distribution: Total spectators, demographic breakdown (local vs. tourist), arrival time patterns.
- Resource consumption: Food items (sandwiches, snacks), beverages (soft drinks, beer), merchandising (jerseys, scarves).
- Service utilization: Queue lengths at concession stands, restroom occupancy, parking lot capacity.
- Operational costs: Staffing requirements for cleaning, security, and customer service.
3.4.1. Performance Analysis
3.5. Scenario Exploration and Operational Plausibility
3.6. User Feedback and Qualitative Assessment
- Accessibility: Non-technical users were able to configure and execute simulations without directly editing configuration files or calling API endpoints.
- Transparency: The agent provided explicit feedback on interpreted parameters and default value imputation, helping users understand how their requests were translated into simulation inputs.
- Actionability: The generated outputs were perceived as sufficiently granular to support discussions about procurement, staffing, parking, and security planning.
- Iterative exploration: Users were able to modify assumptions across turns, enabling comparison of alternative operational scenarios within the same conversational session.
4. Discussion
4.1. Interpretation of Validation Findings
4.2. The Democratization of Simulation Through Conversation
4.3. Framework Generalizability and Transfer Potential
- Venue and event management: concerts, festivals, trade fairs, and conference centres, where planners must estimate attendance, queues, staffing, security, cleaning, and resource consumption.
- Tourism and destination management: planning services around seasonal demand, weather conditions, accommodation occupancy, visitor flows, and concurrent events.
- Transport and mobility hubs: airports, ports, railway stations, or bus terminals, where passenger flow, parking, staffing, and congestion management are central planning problems [10].
- Healthcare capacity planning: hospital departments or emergency units, where managers need to explore patient flow, staff allocation, bed availability, and waiting times [11].
- Logistics and warehousing: facilities where demand variability, stock levels, workforce planning, and service times can be analysed through scenario-based simulation [15].
4.4. Limitations and Boundary Conditions
4.4.1. Technology Readiness Level
- Access to real operational data: integration with point-of-sale systems, access-control logs, parking sensors, staffing systems, cleaning records, and incident management platforms.
- Field validation: comparison of simulated outputs with observations collected during real matches or events.
- Production infrastructure: deployment in a scalable and monitored environment with logging, backup mechanisms, and failure recovery.
- Real-time integration: connectivity to live data streams rather than periodic batch updates.
- Multi-user operation: support for different stakeholder roles, permissions, and simultaneous planning sessions.
- Data governance and compliance: alignment with GDPR requirements, especially if future versions process personal, behavioural, or location-related data.
4.4.2. Use of Synthetic Operational Data
4.4.3. Economic Indicators and Cost Modelling
4.4.4. Conversational Understanding Limits and Tool Use
4.4.5. Scope of Automation
4.4.6. Computational and Deployment Requirements
4.5. Implications for Digital Twin Evolution
4.6. Limitations and Future Work
5. Conclusions
5.1. Summary of Contributions
5.2. Impact and Implications
5.3. Concluding Remarks
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. System Prompt and Configuration
Appendix A.1. Agent System Prompt
Listing A1: System Prompt of the Conversational Agent
Appendix A.2. Simulation Engine Configuration File
Listing A2: Structure of the Configuration File in YAML
References
1. Pott, C.; Spiekermann, C.; Breuer, C.; et al. Managing logistics in sport: A comprehensive systematic literature review. Manag. Rev. Q. 2024, 74, 2341–2400.
2. Rabadi, G.; Khallouli, W.; Salem, M.; Ghoniem, A. Planning and management of major sporting events: A survey. Int. J. Plan. Sched. 2015, 2, 154.
3. Fuller, A.; Fan, Z.; Day, C.; Barlow, C. Digital twin: Enabling technologies, challenges and open research. IEEE Access 2020, 8, 108952–108971.
4. Singh, M.; Fuenmayor, E.; Hinchy, E.P.; Qiao, Y.; Murray, N.; Devine, D. Digital twin: Origin to future. Appl. Syst. Innov. 2021, 4, 36.
5. Liu, M.; Fang, S.; Dong, H.; Xu, C. Review of digital twin about concepts, technologies, and industrial applications. J. Manuf. Syst. 2021, 58, 346–361.
6. Madni, A.M.; Madni, C.C.; Lucero, S.D. Leveraging digital twin technology in model-based systems engineering. Systems 2019, 7(1), 7.
7. Kunath, M.; Winkler, H. Integrating the digital twin of the manufacturing system into a decision support system for improving the order management process. Procedia CIRP 2018, 72, 225–231.
8. Barricelli, B.R.; Casiraghi, E.; Fogli, D. A survey on digital twin: Definitions, characteristics, applications, and design implications. IEEE Access 2019, 7, 167653–167671.
9. Tao, F.; Cheng, J.; Qi, Q.; Zhang, M.; Zhang, H.; Sui, F. Digital twin-driven product design, manufacturing and service with big data. Int. J. Adv. Manuf. Technol. 2018, 94.
10. Deng, T.; Zhang, K.; Shen, Z.-J. A systematic review of a digital twin city: A new pattern of urban governance toward smart cities. J. Manag. Sci. Eng. 2021, 6(2), 125–134.
11. Croatti, A.; Gabellini, M.; Montagna, S.; Ricci, A. On the integration of agents and digital twins in healthcare. J. Med. Syst. 2020, 44(9), 161.
12. Rasheed, A.; San, O.; Kvamsdal, T. Digital twin: Values, challenges and enablers from a modeling perspective. IEEE Access 2020, 8, 21980–22012.
13. Jones, D.; Snider, C.; Nassehi, A.; Yon, J.; Hicks, B. Characterising the Digital Twin: A systematic literature review. CIRP J. Manuf. Sci. Technol. 2020, 29 Pt A, 36–52.
14. Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in manufacturing: A categorical literature review and classification. IFAC-PapersOnLine 2018, 51(11), 1016–1022.
15. Lu, Y.; Liu, C.; Wang, K.-I.K.; Huang, H.; Xu, X. Digital Twin-driven smart manufacturing: Connotation, reference model, applications and research issues. Robot. Comput.-Integr. Manuf. 2020, 61, 101837.
16. Sharma, A.; Kosasih, E.; Zhang, J.; Brintrup, A.; Calinescu, A. Digital twins: State of the art theory and practice, challenges, and open research questions. J. Ind. Inf. Integr. 2022, 30, 100383.
17. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; et al. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems; Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H., Eds.; 2020; Volume 33, pp. 1877–1901. Available online: https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
18. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the Opportunities and Risks of Foundation Models. arXiv 2021. Available online: https://crfm.stanford.edu/assets/report.pdf.
19. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. OpenAI Technical Report 2019. Available online: https://api.semanticscholar.org/CorpusID:160025533.
20. Zhou, Z.; Lin, Y.; Jin, D.; Li, Y. Large Language Model for Participatory Urban Planning. arXiv 2024.
21. Elbasheer, M.; Laili, Y.; Longo, F.; et al. Natural language-driven production planning: Integrating large language models with automatic simulation model generation in manufacturing systems. J. Intell. Manuf. 2025. https://doi.org/10.1007/s10845-025-02732-z.
22. Jauhiainen, J.S.; Hakanpää, S.; et al. Generative AI in participatory urban planning: Synthetic inhabitants and experts. Land 2026, 15(3), 407.
23. Narechania, A.; Srinivasan, A.; Stasko, J.T. NL4DV: A Toolkit for Generating Analytic Specifications for Data Visualization from Natural Language Queries. CoRR 2020, abs/2008.10723. Available online: https://arxiv.org/abs/2008.10723.
24. Chen, M.; Tworek, J.; Jun, H.; Yuan, Q.; Pinto, H.P.d.O.; Kaplan, J.; Edwards, H.; et al. Evaluating Large Language Models Trained on Code. CoRR 2021, abs/2107.03374. Available online: https://arxiv.org/abs/2107.03374.
25. Menzel, T.; Bagschik, G.; Isensee, L.; Schomburg, A.; Maurer, M. From functional to logical scenarios: Detailing a keyword-based scenario description for execution in a simulation environment. In 2019 IEEE Intelligent Vehicles Symposium (IV); 2019; pp. 2383–2390.
26. Dengler, G.; Bazan, P.; German, R.; Lalbakhsh, P.; Liebmann, A. A conversational human-computer interface for smart energy system simulation environments. In 2023 Winter Simulation Conference (WSC); 2023; pp. 2978–2989.
27. Tur, G.; De Mori, R., Eds. Spoken Language Understanding: Systems for Extracting Semantic Information from Speech; John Wiley & Sons: Hoboken, NJ, USA, 2011.
28. Young, S.; Gašić, M.; Thomson, B.; Williams, J.D. POMDP-based statistical spoken dialog systems: A review. Proc. IEEE 2013, 101(5), 1160–1179.
29. Liu, X.; David, I. AI simulation by digital twins: Systematic survey, reference framework, and mapping to a standardized architecture. Softw. Syst. Model. 2025.
30. Grieves, M.; Vickers, J. Digital Twin: Mitigating Unpredictable, Undesirable Emergent Behavior in Complex Systems. In Transdisciplinary Perspectives on Complex Systems: New Findings and Approaches; Kahlen, F.J., Flumerfelt, S., Alves, A., Eds.; Springer: Cham, Switzerland, 2017; pp. 85–113.
31. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
32. FastAPI. FastAPI Framework, High Performance, Easy to Learn, Fast to Code, Ready for Production. Available online: https://fastapi.tiangolo.com/ (accessed on 20 December 2025).
33. Microsoft Power BI Documentation. Microsoft. Available online: https://docs.microsoft.com/en-us/power-bi/ (accessed on 29 April 2026).
34. Pydantic: Data Validation Using Python Type Hints. Pydantic. Available online: https://docs.pydantic.dev/ (accessed on 29 April 2026).
35. Sculley, D.; Holt, G.; Golovin, D.; Davydov, E.; Phillips, T.; Ebner, D.; Chaudhary, V.; Young, M.; Crespo, J.F.; Dennison, D. Hidden Technical Debt in Machine Learning Systems. In Advances in Neural Information Processing Systems; Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2015; Volume 28, pp. 2503–2511.
36. McMillan-Major, A.; Aschenbrenner, S.; Bhattacharjee, A.; Dash, S.; Ovalle, A.; Judd, N.; Mitchell, M. Whose Opinions Do Language Models Reflect? In Proceedings of the 40th International Conference on Machine Learning; PMLR: Honolulu, HI, USA, 2023; Volume 202, pp. 24094–24111.
37. Wang, X.; Li, J.; Liu, Y.; Chen, H. Conversational AI for Industrial Automation: A Survey. IEEE Trans. Autom. Sci. Eng. 2023, 20, 1821–1837.


| User input | System response |
|---|---|
| "Create configuration for Las Palmas vs Sevilla match with 32,400 attendance" | Configuration created with simulation_id=12. Attendance set to 32,400. Other parameters initialized to historical averages. Simulation ready to execute. |
| "Simulate 10% more tourists than usual with rainy weather" | Configuration updated: tourist_attendance increased by 10%, weather_condition set to "rain" (probability 0.8). simulation_id=13 assigned. |
| "What happens if it’s a high-risk match on a sunny day with 12,300 spectators?" | Configuration created: attendance=12300, risk_level=high, weather=sunny. Note: Other parameters will be imputed from historical data. Proceed with simulation? |
| "Show me the predictions for simulation 13" | Simulation 13 status: finished. Downloading Excel report with predictions. Dashboard link: [PowerBI URL]. |

| Metric | Value | Evaluation setup |
|---|---|---|
| Intent recognition accuracy | 93.3% | 30 user prompts |
| Parameter extraction accuracy | 100% | 50 extracted parameters |
| Invalid input detection rate | 100% | 20 cases asking for invalid values |
| Full simulation workflow completion rate | 100% | 10 scenarios |

| Metric | Value | Evaluation setup |
|---|---|---|
| Average conversational response time | <6 s | 50 cases |
| Average parameter modification time | 5 s | 30 cases |
| Average complete simulation workflow time | 1 min 7 s | 10 scenarios |
| Excel report generation time | 2 s | 10 scenarios |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).




