Submitted:
12 August 2025
Posted:
13 August 2025
Abstract
Keywords:
1. State of the Art
- a level that provides immersive access to the virtual world;
- a level ensuring security, precision, and data confidentiality;
- a level that enhances learning through an AI-powered virtual “colleague” that adapts to each user’s learning style.
- helps users learn significantly better and faster, reducing the knowledge gap by up to 91%;
- sets up training environments 12 times faster;
- consumes 60% less energy, making it more environmentally friendly;
- can be accessed by anyone, on any device, with just an internet connection.
1.1. From Single-Purpose VR Systems to Open, Cloud-Native Metaverses

1.2. Digital-Twin Ecosystems for Technical Training
1.3. Automation of Cyber-Range Provisioning
1.4. AI-Enhanced Interaction and Accessibility
- The first trend is the use of lightweight WebRTC codecs. These facilitate real-time communication directly in web browsers, offering efficient audio and video data compression. This efficiency is crucial for delivering low-latency immersive experiences, even on limited bandwidth connections, maximizing the performance of spatial streaming and collaboration in distributed virtual environments [1].
- The second major trend involves AI-controlled avatars with real-time lip-sync [22]. Avatars, as digital representations in metaverses, gain an increased level of realism and expressiveness through artificial intelligence-based animation [23]. The ability to precisely synchronize the avatar's lip movements with speech, in real-time, significantly contributes to user immersion and the credibility of virtual interactions, reducing the "uncanny valley" and improving the perception of social presence [24,25,26]. This technology is vital for natural and efficient communication in training scenarios.
- Finally, the third trend refers to browser-exclusive access via HTTPS on a single port [6,26]. This approach eliminates the need to install dedicated software or complex plugins, drastically simplifying deployment and reducing "IT friction" in corporate environments [6,27]. The use of a single standard port (HTTPS) minimizes security obstacles imposed by firewalls and restrictive enterprise network policies, facilitating the rapid and scalable adoption of virtual solutions [28]. This method simplifies infrastructure management and extends global accessibility, transforming the browser into a universal client for complex and secure virtual experiences [29,30]. The convergence of these innovations is fundamental for democratizing access to metaverses and immersive training systems.
1.5. Research Gaps
2. Methodology and Conceptual Framework
2.1. Iterative Orchestration Cycles in SCSDT
- (i) Diagnoses current skill gaps. An orchestration engine recalculates a Euclidean gap vector G = CT – CLG – CL, where CT denotes the target competencies, CLG the learner’s general knowledge, and CL the competencies already acquired through training. Each component of G is then automatically classified as high, medium, or low, indicating the severity of the gap for that skill. This algorithmic, quantitative assessment yields a granular map of learning needs in under one minute and lays the groundwork for rapid training personalization.
- (ii) Automatically provisions gap-matched learning assets in the cloud. Virtual resources are instantiated and allocated dynamically, combining virtualization techniques (linked-clone VM creation) with business logic that maps diagnosed skill gaps to specific training assets. This “adaptive provisioning” operation customizes the learning environment according to the previously computed skill-gap profile: each gap classified as high or medium severity is mapped to at least one training asset or simulation module of corresponding fidelity (medium or high). The allocation keeps learning resources aligned with the learner’s specific needs, and the entire provisioning process typically completes within 30–60 seconds [4].
- (iii) Allows the learner to complete those resources via the streaming interface. This is not a “method” in the algorithmic sense, but rather the phase of active learner interaction, and it is the core and longest stage of the orchestration cycle. Learners follow a structured path, progressing from theoretical review through Computer-Based Training (CBT), to hands-on problem-solving exercises such as guided fault simulations, and finally to an intensive, interactive digital-twin drill that enables direct application of knowledge in realistic scenarios. This active learning phase typically lasts 15–40 minutes and is essential for competence consolidation and skill transfer [4,9].
- (iv) Reincorporates the new evidence into the learner digital twin (Table 1, stages 4 and 5). This phase covers data collection and storage using a standard logging protocol (xAPI), virtual environment state management (snapshotting), and updating of the digital twin from the learner’s activity data. Learning analytics algorithms calculate “mastery deltas” and incorporate the new evidence into the learner’s digital profile, closing a continuous feedback loop that refines training personalization based on updated performance data.
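The diagnosis and provisioning steps of the cycle can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SCSDT implementation: the severity thresholds (0.15 and 0.30), the skill names, the flooring of negative gaps at zero, and the decision to assign no asset to low-severity gaps are all hypothetical. Only the formula G = CT – CLG – CL, the three severity classes, and the rule that high and medium gaps receive an asset of corresponding fidelity come from the text.

```python
import math

# Hypothetical cut-offs: the text names the classes but not the thresholds.
HIGH_THRESHOLD = 0.30
MEDIUM_THRESHOLD = 0.15


def gap_vector(target, general, acquired):
    """Component-wise G = CT - CLG - CL, floored at zero (assumption)."""
    return {
        skill: max(0.0, target[skill] - general.get(skill, 0.0) - acquired.get(skill, 0.0))
        for skill in target
    }


def classify(gap):
    """Map one gap component to a severity class."""
    if gap >= HIGH_THRESHOLD:
        return "high"
    if gap >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


def gap_norm(gaps):
    """Euclidean norm of the gap vector."""
    return math.sqrt(sum(g * g for g in gaps.values()))


def select_fidelity(severity):
    """High and medium gaps get an asset of matching fidelity; low gaps get none (assumption)."""
    return {"high": "high", "medium": "medium"}.get(severity)


# Illustrative competence scores on a 0..1 scale (invented for the example).
target = {"hydraulics": 0.9, "avionics": 0.8, "safety": 0.7}
general = {"hydraulics": 0.3, "avionics": 0.4, "safety": 0.5}
acquired = {"hydraulics": 0.1, "avionics": 0.2, "safety": 0.2}

gaps = gap_vector(target, general, acquired)
plan = {skill: select_fidelity(classify(g)) for skill, g in gaps.items()}
print(plan, round(gap_norm(gaps), 3))
```

Keeping the classification and the fidelity mapping as separate functions mirrors the split between stage (i) and stage (ii), so either rule can be changed without touching the other.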
2.2. Iteration Checkpoints Used in the Experiment
2.3. Why Iteration Granularity Matters
- Quantify monotonic gains – competence gaps decreased significantly at every analysed iteration (paired contrasts, Bonferroni-corrected $p < 0.02$; Table 3).
- Model effect size – the within-subject design yielded a partial $\eta^{2}$ of 0.91 (F = 52.7, $p < 0.0001$), showing that 91 % of the variance in $\lVert G \rVert_{2}$ is explained by training progression.
- Align platform telemetry with learning analytics – each iteration is timestamp-anchored, making it straightforward to link provisioning latency, stream bandwidth, or GPU utilisation back to pedagogical effectiveness.
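As a consistency check on the reported effect size, partial $\eta^{2}$ can be recovered from an F statistic and its degrees of freedom. The worked equation below uses the standard identity for a repeated-measures design together with the reported value F(3, 15) = 52.7; it is an illustrative recomputation, not the authors' analysis.

```latex
\eta_{p}^{2}
  = \frac{F \cdot df_{\mathrm{effect}}}{F \cdot df_{\mathrm{effect}} + df_{\mathrm{error}}}
  = \frac{52.7 \times 3}{52.7 \times 3 + 15}
  = \frac{158.1}{173.1}
  \approx 0.91
```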
2.4. Implementation Roadmap
2.5. SCSDT Framework Applications
- validates the SCSDT model, showing that the framework is not just theoretical, but applicable across industries,
- demonstrates modularity, highlighting how the generic layers of SCSDT can be customized for specific needs without recreating the system from scratch,
- provides concrete evidence, presenting specific examples of technologies and applications used in each layer for each use case, and
- highlights innovation, showing how SCSDT addresses the complex requirements of modern training (e.g. EASA compliance, MITRE ATT&CK matrix, IIoT).
2.6. Security and Governance Blueprint
- Zero-Trust Channel: This principle forms the foundation of network security, assuming that no entity, internal or external, is implicitly trusted. All communications are facilitated through a single, secure entry on standard port 443 (HTTPS), minimizing the attack surface. Mutual-TLS between Envoy side-cars validates the identity of both parties in microservice communications. Authorization is managed via short-lived (15-minute) JSON Web Tokens (JWTs), with clearly defined access scopes (e.g., scp for security, emp for platform operations, gap for gap data), adhering to the "least privilege" principle and reducing the exposure window in case of compromise. This aligns with NIST SP 800-207 (Zero Trust Architecture) and OWASP best practices for secure APIs.
- VM Provenance: To ensure the integrity and auditability of training environments, the Environment Management Platform (EMP) calculates an SHA-256 cryptographic hash of each VM's template disk image. This unique attestation is subsequently recorded in an append-only ledger (MariaDB with binlog_format=ROW), which guarantees the immutability and non-repudiation of records. This mechanism is crucial for compliance and for verifying the origin and state of VMs in accordance with integrity principles, essential in standards like ISO 27001.
- SBOM / CVE Watch: Software supply chain security is addressed through proactive vulnerability monitoring. An automated scanner (Trivy) performs nightly scans of all containers used in the system (DTBT, 8agora), identifying known vulnerabilities (CVEs). A failing score from these scans automatically blocks the upgrade process (helm-upgrade), preventing the deployment of vulnerable software into the operational environment. This approach aligns with DevSecOps principles and is an emerging requirement in cybersecurity, promoted by governmental initiatives and standards such as ISO 27001 (A.12.6.1 Management of technical vulnerabilities).
- Data-Minimization: Compliance with data protection regulations is ensured through the rigorous application of the data minimization principle. Sensor data packets are truncated to 16-bit fixed-point where 0.1-unit resolution suffices, reducing data volume and potential for identifiability of processed data. This strategy represents a direct implementation of the "privacy by design" concept, a fundamental pillar of GDPR (Article 25), ensuring that data protection is integrated into the system's architecture from the outset.
- Audit Feeds: To support compliance requirements, post-incident investigations, and operational analysis, the system generates comprehensive audit streams. xAPI events (learner activity), power events, and security events are continuously streamed to Loki, a centralized logging system. These logs are retained for a period of 6 months, providing a complete audit trail, essential for ISO 27001 (A.12.4) and other industry-specific compliance standards.
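Two of the controls above, VM provenance and data minimization, lend themselves to a short sketch. The code below streams a disk image through SHA-256 and packs a sensor reading into a 16-bit fixed-point value at 0.1-unit resolution. The function names, the signed big-endian layout, and the chunk size are assumptions made for illustration; the text specifies only the hash algorithm and the 16-bit, 0.1-unit format.

```python
import hashlib
import struct

SCALE = 10  # 0.1-unit resolution: one integer step equals 0.1 units


def template_digest(path, chunk_size=1 << 20):
    """Stream a disk image through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # Read in chunks so large images never need to fit in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pack_reading(value):
    """Quantise to 0.1 units and pack as a signed big-endian 16-bit integer."""
    raw = round(value * SCALE)
    if not -32768 <= raw <= 32767:
        raise ValueError("reading outside the 16-bit fixed-point range")
    return struct.pack(">h", raw)


def unpack_reading(payload):
    """Inverse of pack_reading, returning the quantised value."""
    (raw,) = struct.unpack(">h", payload)
    return raw / SCALE


payload = pack_reading(23.47)
print(len(payload), unpack_reading(payload))
```

With this layout a reading occupies two bytes and covers roughly ±3276.7 units, so any sensor needing a wider range or finer resolution would have to stay outside the truncation rule.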
2.7. Sustainability Considerations
- GPU Consolidation: Efficient sharing of graphics processing units (e.g., 6 users per A16), reducing energy consumption to just 42 W per user.
- Dynamic GPU Pools: Intelligent resource management where idle sessions (< 60 s) trigger hibernation, saving energy without significantly impacting user experience (resume in ~8 s).
- Mesh-Coded Streaming (AV1-SVC): Drastically reducing bandwidth (1080p at 200 kB/s for static scenes), thereby lowering the network infrastructure's energy consumption.
- Carbon Dashboard: Real-time monitoring of emissions (kg CO₂e per learner-hour) via grid-intensity APIs, facilitating continuous optimization of the operational ecological impact.
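The carbon-dashboard metric reduces to a single multiplication, sketched below. The function name and the grid-intensity value of 250 g CO₂e per kWh are hypothetical; a real deployment would take the intensity from a grid-intensity API. The per-user power figures (42 W streamed, about 105 W on a desktop GPU) are the ones reported in this paper.

```python
def kg_co2e_per_learner_hour(watts_per_learner, grid_intensity_g_per_kwh):
    """Energy for one learner-hour (kWh) times grid intensity, returned in kg CO2e."""
    kwh = watts_per_learner / 1000.0  # one hour at constant power
    return kwh * grid_intensity_g_per_kwh / 1000.0


GRID_INTENSITY = 250.0  # g CO2e per kWh, placeholder value for illustration

streamed = kg_co2e_per_learner_hour(42.0, GRID_INTENSITY)
desktop = kg_co2e_per_learner_hour(105.0, GRID_INTENSITY)
reduction = 1.0 - streamed / desktop
print(f"{streamed:.4f} kg vs {desktop:.4f} kg per learner-hour ({reduction:.0%} lower)")
```

Because grid intensity multiplies both figures equally, the relative saving depends only on the power ratio, while the absolute emissions scale with the local grid.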
3. Results and Evaluation
3.1. Experimental Set-Up
- The time required to compress rendered video frames on the server and decompress them on the client side is specific to any video streaming system (additional encode/decode overhead).
- The total time from a user's action (e.g., head movement in VR) until that change is visibly represented by photons reaching the user's eye (motion-to-photon latency). High latency here can cause motion sickness and degrade realism. The system provides a value:
  - Less than 8 ms for desktop clients, which indicates excellent responsiveness for streamed desktop applications, comparable to local execution.
  - Less than 11 ms for HMD (VR) 90 Hz displays. This is an impressive technical achievement. For a 90 Hz display, a new frame needs to be rendered and displayed approximately every 11.11 ms. Maintaining motion-to-photon latency below this threshold (or very close to it) is vital to prevent motion sickness and ensure a smooth, realistic VR experience. It demonstrates SCSDT's capability to efficiently support high-fidelity VR streaming [6,9].
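The 11.11 ms figure quoted for 90 Hz displays is simply the frame period. The short sketch below derives it and checks a measured latency against that budget; the function names and the 12.0 ms counter-example are illustrative assumptions.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame at a given refresh rate, in milliseconds."""
    return 1000.0 / refresh_hz


def within_budget(latency_ms, refresh_hz):
    """True when motion-to-photon latency fits inside one frame period."""
    return latency_ms < frame_budget_ms(refresh_hz)


print(round(frame_budget_ms(90), 2))  # frame period at 90 Hz
print(within_budget(10.9, 90))        # latency just under the frame period
print(within_budget(12.0, 90))        # hypothetical latency that misses it
```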
| Parameter | MRO-Aero pilot | Cyber-Factory cyber-range |
| Concurrent learners | 6 | 50 |
| GPU type / count | 1 × NVIDIA A16 | 9 × NVIDIA A16 (6 users ∙ GPU⁻¹) |
| Host specs | 64-core AMD EPYC, 512 GB RAM | idem (×9 nodes in one rack) |
| Total server power (GPU + CPU chassis) | 500 W | 9 × 500 W = 4.5 kW |
| Streaming codec | H.264 / AV1 adaptive | Idem |
| VM provisioning tool | EMP clone.py (linked) | EMP bulk clone + bridge |
| Digital-twin runtime | DTBT orchestrator | EMP + IIoT gateway |
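The sizing in the table follows from the consolidation ratio of six users per A16. A minimal capacity sketch, with hypothetical function names and the per-node power taken from the table, reproduces the GPU counts and the rack power:

```python
import math


def gpus_needed(learners, users_per_gpu=6):
    """Smallest number of GPUs that can host the given number of learners."""
    return math.ceil(learners / users_per_gpu)


def rack_power_kw(nodes, watts_per_node=500):
    """Total server power for identical nodes, in kilowatts."""
    return nodes * watts_per_node / 1000


for pilot, learners in (("MRO-Aero", 6), ("Cyber-Factory", 50)):
    nodes = gpus_needed(learners)
    print(pilot, nodes, "GPU node(s),", rack_power_kw(nodes), "kW")
```

Rounding up matters for the cyber-range: 50 learners at six per GPU need nine cards, which leaves four unused session slots on the last node.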
3.2. System-Level Performance
3.2.1. Micro-Level Provisioning Latency
3.2.2. Adaptive Bandwidth and Latency
3.3. Network Performance
3.4. Bulk Provisioning Throughput
3.5. Resource Efficiency and Sustainability
3.6. Security and Orchestration Overhead
3.7. Training Outcomes (Competence-Gap Reduction)
3.7.1. Iterations in SCSDT
3.7.2. Data Matrix

| Learner | Iter. 0 | Iter. 1 | Iter. 3 | Iter. 5 |
| L1 | 0.55 | 0.40 | 0.23 | 0.14 |
| L2 | 0.46 | 0.34 | 0.18 | 0.10 |
| L3 | 0.42 | 0.31 | 0.16 | 0.09 |
| L4 | 0.51 | 0.37 | 0.21 | 0.12 |
| L5 | 0.50 | 0.36 | 0.20 | 0.11 |
| L6 | 0.43 | 0.30 | 0.17 | 0.09 |
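A paired comparison of the first and last checkpoints can be computed directly from the six rows above. The sketch below uses only the Python standard library and the values shown in this matrix; the helper names are hypothetical, and the figures it prints depend solely on these rows, so it is an illustration of the method rather than a reproduction of the reported analysis.

```python
import math
import statistics

# Gap norms per learner at iterations 0, 1, 3 and 5 (copied from the matrix above).
gap_norms = {
    "L1": (0.55, 0.40, 0.23, 0.14),
    "L2": (0.46, 0.34, 0.18, 0.10),
    "L3": (0.42, 0.31, 0.16, 0.09),
    "L4": (0.51, 0.37, 0.21, 0.12),
    "L5": (0.50, 0.36, 0.20, 0.11),
    "L6": (0.43, 0.30, 0.17, 0.09),
}


def paired_t(before, after):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err, len(diffs) - 1


def is_monotonic_decrease(series):
    """True when every checkpoint is strictly lower than the previous one."""
    return all(later < earlier for earlier, later in zip(series, series[1:]))


baseline = [row[0] for row in gap_norms.values()]
post_test = [row[3] for row in gap_norms.values()]
mean_diff, t_stat, dof = paired_t(baseline, post_test)

print(f"mean change = {mean_diff:.2f}, t({dof}) = {t_stat:.1f}")
print(all(is_monotonic_decrease(row) for row in gap_norms.values()))
```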
3.8. Comparative Discussion
- External validation: SCSDT's performance (e.g., cloning time, energy efficiency) is directly compared to other frameworks such as INSALATA, Alfons, and NetBed, demonstrating that SCSDT's advantages are not only theoretical but also empirically superior.
- Argument synthesis: It consolidates hardware, bandwidth, provisioning, immersion, and sustainability aspects into a coherent cost–benefit narrative.
4. Threats to Validity
4.1. Residual Risks and Roadmap
- Longitudinal durability study (6-month competence decay).
- Green-energy audit against ISO 50001.
- Multisite replication with ≥150 participants to power mixed-effects modelling.
- Sensitivity analysis of bandwidth caps (200–1200 kB s⁻¹) on QoE and motion-sickness.
5. Discussion and Implications
5.1. Pedagogical Value of Cloud-Streamed Digital Twins
5.2. Operational Efficiency and Carbon Footprint
5.3. Practical Implications and Operational Impact

- Enterprise collaboration—virtual campuses that fuse slide decks, whiteboards and screen-shared web apps, all accessible through Outlook-generated “metaverse links” [40].
| Benefit | What it means in practice | Why it matters for training |
| No heavy workstation on the learner’s desk | All rendering and physics run on cloud GPUs; the trainee just opens a browser tab. | Any laptop, tablet or even 4G LTE link can join a full-fidelity 3-D session, removing hardware and IT-security barriers. |
| Single HTTPS-443 connection | The client/streamer bundle tunnels audio, video and controls through one encrypted port behind the customer firewall. | Corporate networks stay locked down; roll-outs skip the usual port-opening negotiations. |
| Ultra-light bandwidth (≈0.2–0.5 MB s⁻¹) | Adaptive AV1/H.264 stream compresses the whole 3-D world into sub-500 kB s⁻¹. | Learners in plants or hangars with spotty Wi-Fi still get <10 ms motion-to-photon latency, with no nausea and no stutter. |
| GPU consolidation & green ops | One NVIDIA A16 card drives six HD trainees at ~250 W total. | Power per learner falls by ~60 % versus six desktop GPUs, lowering cost and carbon for 24 × 7 shift training. |
| “Click-to-clone” provisioning | Linked-clone VMs, network bridges and snapshots spin up in ~15 s per learner via EMP scripts or the web portal. | Instructors can create or reset 50 six-VM lab bundles in <15 min, an order of magnitude faster than hand-built ranges, so classes start on time. |
| Live digital-twin hooks | An API streams sensor/PLC data into the scene; avatars can press buttons that change real machines. | Technicians rehearse procedures on a fully synchronised virtual line, then walk to the shop floor and see the same HMI screens. |
| AI-assisted immersion | Cloud AI drives lip-sync, gestures, speech-to-text and 200-language live translation. | Multilingual crews collaborate naturally; supervisors get instant transcripts for audit and feedback. |
| Zero-install content editing | 3-D models, scripts and training screens can be dragged into the world and are live for all users, with no recompilation. | SMEs tweak layouts during a session and immediately test new fault scenarios or SOP updates. |
| Built-in security & governance | JWT authentication, tamper-evident VM hashes and snapshot rollback by design. | Meets ISO 27001 / EASA audit requirements without bolted-on tools. |
5.4. Industrial Uptake and Domain-Specific Applications of the SCSDT Framework
5.5. Research Roadmap
6. Conclusions
Abbreviations
| AI | Artificial Intelligence |
| ANOVA | Analysis of Variance |
| AV1 | AOMedia Video 1 (video compression format) |
| CPU | Central Processing Unit |
| CI | Confidence Interval |
| CLI | Command-Line Interface |
| CSV | Comma-Separated Values |
| DT | Digital Twin |
| DTBT | Digital Twin-Based Training |
| EMP | Environment Management Platform |
| GPU | Graphics Processing Unit |
| HMD | Head-Mounted Display |
| HTTP | Hypertext Transfer Protocol |
| IIoT | Industrial Internet of Things |
| L1 / L2 / L3 | Layer 1 / Layer 2 / Layer 3 (SCSDT Architecture) |
| MaaS | Metaverse-as-a-Service |
| OPC-UA | Open Platform Communications – Unified Architecture |
| QoE | Quality of Experience |
| REST | Representational State Transfer |
| SCSDT | Secure Cloud-Streaming Digital Twin |
| SSO | Single Sign-On |
| t-test | Student’s t-distribution test |
| UDP | User Datagram Protocol |
| USDZ | Universal Scene Description Zip (3D file format by Apple) |
| VNC | Virtual Network Computing |
| VR/AR | Virtual Reality / Augmented Reality |
| xAPI | Experience API (e-learning tracking standard) |
| YAML | YAML Ain’t Markup Language |
References
- Bivolaru, M.M. Optimizing Trajectories in 3D Space Using Mixed Integer Linear Programming. U.P.B. Sci. Bull., Series D 2024, 86, 45–56.
- Jerald, J. The VR Book: Human-Centered Design for Virtual Reality, ACM Books, New York, U.S.A., 2015, pp. 1-523.
- Josifovska, K.; Yigitbas, E.; Engels, G. Reference Framework for Digital Twins within Cyber-Physical Systems. In Proceedings of the 2019 IEEE/ACM 5th International Workshop on Software Engineering for Smart Cyber-Physical Systems (SEsCPS), Montreal, QC, Canada, (28 May 2019).
- Payne, A.; Kent, S.; Carable, O. Development And Evaluation Of A Virtual Laboratory: A Simulation To Assist Problem-Based Learning. In Proceedings of the MCCSIS'08 - IADIS Multi Conference on Computer Science and Information Systems; Proceedings of e-Learning, Netherlands, (22-25 July 2008).
- Zhai, P.; Zhang, L.; Zhang, Y. Internet of Things Access Control Identity Authentication Method Based on Blockchain. U.P.B. Sci. Bull., Series C 2025, 87, 289–308.
- Deac, G.; Georgescu, C.; Popa, C.; Ghinea, M.; Cotet, C.E. Virtual Reality Exhibition Platform. In Proceedings of the 29th DAAAM International Symposium, Zadar, Croatia, (24-27 October 2018).
- Rahman, M.; Mahbuba, T.; Siddiqui, A. Cloud-Native Data Architectures for Machine Learning (Sabila Nowshin Jahangirnagar University, Bangladesh), Personal Communication, 2019.
- Salah, K.; Hammoud, M.; Zeadally, S. Teaching Cybersecurity Using the Cloud. IEEE Transactions on Learning Technologies 2015, 8, 383–392.
- Deac, G.C.; Deac, T. Multi-Layer Metaverse Architectures. Journal of Digital Learning Environments 2024, 3, 19–35.
- Saha, A.; Hamidouche, W.; Chavarrías, M.; Pescador, F.; Farhat, I. Performance analysis of optimized versatile video coding software decoders on embedded platforms. Journal of Real-Time Image Processing 2023, 20, 119–120.
- Costa, G.; Russo, E.; Armando, A. Automating the Generation of Cyber Range Virtual Scenarios with VSDL. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications 2022, 13, 61–80.
- Lillemets, P.; Bashir Jawad, N.; Kashi, J.; Sabah, A.; Dragoni, N. A Systematic Review of Cyber Range Taxonomies: Trends, Gaps, and a Proposed Taxonomy. Future Internet 2025, 17, 259.
- Arnold, D.; Ford, J.; Saniie, J. Architecture of an Efficient Environment Management Platform for Experiential Cybersecurity Education. Information 2025, 16, 604.
- Kabashkin, I. Digital-Twin-Based Ecosystem for Aviation Maintenance Training. Information 2025, 16, 586.
- Corbin, J.; Strauss, A.L. Basics of Qualitative Research, 4th ed., SAGE Publications, Inc, Thousand Oaks, California, U.S.A. 2015, pp. 1-431.
- Massey, A.; Montoya, M.; Binny, S.; Windeler, J. Presence and Team Performance in Synchronous Collaborative Virtual Environments. Small Group Research 2024, 55, 290–323.
- Xu, L.; Dijiang, H.; Tsai, W.T. Cloud-Based Virtual Laboratory for Network Security Education. IEEE Transactions on Education 2014, 57, 145–150.
- Pham, C.; Tang, D.; Chinen, K.; Beuran, R. CyRIS: a cyber range instantiation system for facilitating security training. In Proceedings of the 7th Symposium on Information and Communication Technology (SoICT '16). Association for Computing Machinery, New York, NY, USA, (8-9 December 2016).
- Herold, N.; Wachs, M.; Dorfhuber, M.; Rudolf, C.; Liebald, S.; Carle, G. Achieving reproducible network environments with INSALATA. In Proceedings of the 11th IFIP WG 6.6 International Conference on Autonomous Infrastructure, Management, and Security, AIMS 2017 Zurich, Switzerland, (10–13 July 2017).
- Fajjari, I.; El Byed, H.; Guillemin, S.; Secci, S. INSALATA: A Testbed for Network Service Function Chains. In Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, (21-25 May 2017).
- Noponen, S.; Parssinen, J.; Salonen, J. Cybersecurity of Cyber Ranges: Threats and Mitigations. International Journal for Information Security Research 2022, 12, 1032–1040.
- Liljenstam, M.; Nicol, D.; Yuan, Y.; Yan, G.; Grier, C. RINSE: The Real-Time Immersive Network Simulation Environment for Network Security Exercises. In Proceedings of the Workshop on Principles of Advanced and Distributed Simulation (PADS'05), Monterey, CA, U.S.A., (1-3 June 2005).
- ISO/IEC JTC 1/SC 42, Artificial Intelligence. Available online: https://www.iso.org/committee/6794475.html (accessed on 10 March 2025).
- Brakel, V.; Barreda-Ángeles, M.; Hartmann, T. Feelings of presence and perceived social support in social virtual reality platforms. Computers in Human Behavior 2022, 139, 1–11.
- Akcaoglu, M.; Lee, E. Increasing Social Presence in Online Learning through Small Group Discussions. International Review of Research in Open and Distributed Learning 2016, 17, 1–17.
- Loh, N.-H.; Khairul, A.R.; Wang, C. Framework development of real-time lip sync animation on viseme based human speech. Jurnal Teknologi 2015, 75, 19–29.
- Lecci, M.; Drago, M.; Zanella, A.; Zorzi, M. An Open Framework for Analyzing and Modeling XR Network Traffic. IEEE Access 2021, 9, 129728–129785.
- Sharma, H. Next-Generation Firewall in the Cloud: Advanced Firewall Solutions to the Cloud. ESP Journal of Engineering & Technology Advancements 2021, 1, 98–111.
- Ribezzo, G.; Samela, G.; Palmisano, V.; De Cicco, L.; Mascolo, S. A DASH video streaming system for immersive contents. In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18). Association for Computing Machinery, New York, NY, U.S.A., (28 July 2018).
- Pan, G.; Xu, S.; Zhang, S.; Chen, X.; Sun, Y. Quality of Experience Optimization for Real-time XR Video Transmission with Energy Constrains. IEEE Transactions on Vehicular Technology 2023, 10, 1–6.
- Hariri, B.; Ratti, S.; Shirmohammadi, S.; Pakravan, M.R. A distributed latency-aware architecture for massively multi-user virtual environments. In Proceedings of the 2008 IEEE International Workshop on Haptic Audio Visual Environments and Games, Ottawa, ON, Canada, (18-19 October 2008).
- Tao, L.; Cukurova, M.; Song, Y. Learning analytics in immersive virtual learning environments: a systematic literature review. Smart Learning Environments 2025, 12, 1–27.
- Ileana, M.; Sfat, R.; Marian, C.V. Virtual Reality-Based E-learning System Using Distributed WEB Systems. U.P.B. Sci. Bull., Series C 2025, 87, 125–140.
- Clark, D.A.G.; Marnewick, A.L.; Marnewick, C. Virtual Team Performance Factors: A Systematic Literature Review. In Proceedings of the 2019 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Macao, China, (15-19 December 2019).
- Lazaroiu, E.; Mustata, C.; Dragomirescu, C. Working and Learning in Industry 4.0 environments. U.P.B. Sci. Bull., Series D 2019, 81, 353–366.
- Eltraify, A.; Alani, R.; Ajibola, O.; Fadlelmula, W.; Hassan, A.; Hamad, A.; Elgamal, A.; Ncube, W.; Ibrahim, H.; Elmirghani, M.; Mohamed, S.; Krug, L.; McSorley, G.; Elmirghani, J. Energy Efficient AR/VR Edge Processing: Architecture and Optimization. IEEE Access 2025, 13, 83426–83449.
- Rossey, L.M.; Cunningham, R.K.; Fried, D.J.; Rabek, J.C.; Lippmann, R.P. LARIAT: Lincoln Adaptable Real-time Information Assurance Testbed. In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA, (15 April 2003).
- Yasuda, S.; Miura, R.; Ohta, S.; Takano, Y.; Miyachi, T. A Mimetic Network Environment Construction System. In Proceedings of the 11th International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (TRIDENTCOM 2016), Hangzhou, China, (13-15 June 2016).
- Li, D.; Byna, S.; Chakradhar, S. Energy-Aware Workload Consolidation on GPU. In Proceedings of the 2011 International Conference on Parallel Processing Workshop, Taipei City, Taiwan, (13-16 September 2011).
- The rise and rise of the Nordic data centre industry. Available online: https://www.infrastructureinvestor.com/the-rise-and-rise-of-the-nordic-data-centre-industry/#:~:text=Dominic%20Ward%2C%20chief%20executive%20of%20Verne%20Global%2C,around%20climate.%E2%80%9D%20View%20project%20in%20full%20screen (accessed on 12 July 2025).
- PRE-DRAFT Call for Comments: Guide to Industrial Control Systems (ICS) Security. Available online: https://csrc.nist.rip/publications/detail/sp/800-82/rev-3/draft (accessed on 30 June 2025).
- Vykopal, J.; Seda, P.; Švábenský, P.; Čeleda, P. Smart Environment for Adaptive Learning of Cybersecurity Skills. IEEE Transactions on Learning Technologies 2023, 16, 443–456.
- Runde, C. Emerging Standardization Requirements for the Metaverse in Defense Use Cases, (Virtual Dimension Center – VDC), Personal Communication, 2025.
- White, B.; Lepreau, J.; Stoller, L.; Ricci, R.; Guruprasad, S.; Newbold, M.; Hibler, M.; Barb, C.; Joglekar, A. An integrated experimental environment for distributed systems and networks. SIGOPS Oper. Syst. Rev. 2003, 36, 255–270.
- Reuter, C.; Salewski, F.; Perl, H.; Mühlhäuser, M. Alfons: Automated Deployment of Complex Virtual Environments for Cybersecurity Training. In Proceedings of the 2019 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Stockholm, Sweden, (17-19 June 2019).



| Stage | Operation | Typical duration (pilot study*) |
| 1. Gap refresh | Orchestration engine recomputes the Euclidean gap vector $G = C_{T} - C_{L}$ and classifies each component (high / medium / low). G = gap vector; CT = target competencies; CLG = competencies learned from general knowledge. | < 1 min |
| 2. Adaptive provisioning | Linked-clone VMs are spawned from templates; each high-severity gap receives at least one medium- or high-fidelity asset. | 30–60 s |
| 3. Learner activity | The learner completes the prescribed content (CBT refresher → guided fault sim → digital-twin drill). CBT = Computer-Based Training. | 15–40 min |
| 4. Data capture & snapshot | xAPI events and in-twin telemetry are logged; a snapshot is created for fast rollback. xAPI = Experience Application Programming Interface. | 1–2 min |
| 5. Learner-twin update | New evidence is folded into $C_{L}$; mastery deltas are recorded. | < 30 s |
| Label | Time-point | Rationale |
| Iteration 0 (baseline) | Immediately after onboarding, before any SCSDT exposure. | Untreated starting level. |
| Iteration 1 | After cycle 1. | Captures the first-exposure effect of gap-matched content. |
| Iteration 3 | After cycle 3 (≈ end of Day 1). | Represents mid-programme status once high-fidelity drills begin. |
| Iteration 5 (post-test) | After cycle 5 (end of Day 2). | Final mastery checkpoint used for the t-tests and repeated-measures ANOVA. |
| Metric (6-VM student bundle unless noted) | Baseline (manual / local-GPU) | SCSDT / 8agora stack | Δ / Speed-up | Notes & measurement method |
| Provisioning time (50 VMs) | 185 min (stop-watch: manual clone, VLAN, snapshot) | 14 min 26 s (EMP clone.py bulk) | × 12.8 faster | Same golden image, pfSense bridge auto-generated |
| Template → student VM clone | 202 s per VM | 15.8 s per VM | × 12.8 | Mean of n = 10 runs; Proxmox API logs |
| Snapshot revert | 44.2 s | 0.44 s | × 100 | Instant ZFS rollback via EMP revert.py |
| Environment purge | 67.2 s | 10.3 s | × 6.5 | Deletes VMs, user, bridge; cleans DHCP leases |
| Down-link bitrate / user | — (local render) | 0.20 – 0.50 MB s⁻¹ (adaptive AV1) | — | Full-HD ↔ 4 K; measured with Wireshark – 60 s avg |
| Motion-to-photon latency | n/a (local) | 7.8 ms median (HMD pose → frame) | — | NVIDIA A16; datacenter ≤ 25 km; Icicle probe |
| Concurrent users / GPU | 1 user / RTX 2060 | 6 users / A16 | 6× density | AV1 hardware encoders; 1920 × 1080 @ 60 Hz |
| Per-user power draw | ≈ 105 W (desktop 2060) | 42 W (A16 slice + EPYC share) | −60 % | 250 W A16 / 6 + 500 W server ÷ 12 sessions |
| Competence-gap norm Δ‖G‖₂ | — | −0.46 ± 0.05 (iteration 0→5) | — | Paired-t: t(5)=21.4, p<0.0001, d=11.7 |
| ANOVA across iterations | — | F(3,15)=52.7, p<0.0001, η²ₚ=0.91 | — | Repeated-measures (n = 6 learners, 4 levels) |
| Sprint | Focus | Key artefacts | Success criteria | Notes |
| S-0 (week 0-1) |
Foundational DevOps — IaC scripts for GPU nodes, pfSense template & Ceph-RBD pool |
• Terraform & Ansible playbooks • EMP CLI pre-flight suite |
Nodes boot unattended; clone.py --check passes on fresh cluster | Use cloud-agnostic modules so stack can live on on-prem or AWS/Lightsail |
| S-1 (week 2-3) |
Core MaaS layer (L1) — 8agora streamer, HTTPS reverse-proxy, LetsEncrypt |
• Docker-compose bundle • Grafana dashboards (bit-rate, GPU) |
Avg down-link < 500 kB s⁻¹ for 3 resolutions, jitter < 30 ms | Reverse-proxy terminates TLS 1.3; JWT forwarded to micro-services |
| S-2 (week 4-5) |
Security-Provisioning layer (L2) — EMP bulk-clone, VLAN generator |
• emp.yml declarative spec • Bridge/DHCP auto-cleanup daemon |
Spawn 50×6-VM bundles in < 15 min, zero IP conflict | Bridges hashed as br-{uuid[0:6]} to fit pfSense UI |
| S-3 (week 6-7) |
Adaptive-Training layer (L3) — DTBT orchestrator ported to Kubernetes |
• Gap-vector micro-service (FastAPI) • xAPI event bus (NATS) |
Δ‖G‖₂ detectable after single 30 min session | Helm chart deploys the Orchestrator microservice and exposes a /orchestrate* REST endpoint, which is called by the L1 MaaS layer upon user login. |
| S-4 (week 8-9) | IIoT gateway & animation bridge | • MQTT→WebSocket side-car • Unity/GLTF animation hooks |
Latency sensor→avatar < 120 ms, 20 Hz sustained | ProtoBuf over WebSocket keeps payload < 200 B |
| S-5 (week 10-12) | AI co-pilot services: speech-to-text, live translation, avatar gestures | • Serverless functions (OpenAI Whisper / AWS Translate) • Gesture JSON schema | End-to-end STT < 1 s, 95 % word accuracy (WER ≤ 5 %); 200 languages | Portrait-GAN drives facial rigs at 15 FPS |
| S-6 (week 13-14) | Compliance & sustainability | • Prometheus power exporter • ISO 27001, ISO 50001 mapping | PUE recorded, CIS-hardening baseline scored ≥ 85 % | Green-energy cert optional; add DCIM hooks |
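
The S-2 notes describe bridge names hashed as br-{uuid[0:6]}. The sketch below illustrates that naming scheme. It assumes the UUID identifies one VM bundle and that a rare prefix collision is handled by drawing a new UUID. `bridge_name` and `allocate_bridge` are hypothetical helpers, not part of the EMP tooling.

```python
import uuid

def bridge_name(bundle_id: uuid.UUID) -> str:
    """Build a short bridge name from the first six hex characters of a UUID."""
    return f"br-{str(bundle_id)[0:6]}"

def allocate_bridge(existing: set) -> str:
    """Pick a bridge name not yet in use (hypothetical collision handling).

    Six hex characters give 16**6 distinct names, so collisions are rare
    but possible once many bundles exist; hence the retry loop.
    """
    while True:
        name = bridge_name(uuid.uuid4())
        if name not in existing:
            existing.add(name)
            return name

in_use = set()
names = [allocate_bridge(in_use) for _ in range(50)]  # one bridge per bundle
print(names[0], len(set(names)))
```

Each name is nine characters long, which keeps it well inside the 15-character limit Linux places on interface names.
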
| SCSDT layer | MRO-Aero (maintenance) | Cyber-Factory (cyber-range) | Retail-Metaverse (pilot) |
| L1 MaaS (Metaverse-as-a-Service) | Hangar twin, walk-through inspection; AV1 1080p | 360° SOC room, firewall console screenshare | 4K virtual store; customers on mobile web |
| L2 Security-Provisioning | One VM per learner: Airbus CBT, troubleshooting sim; bridged to pfSense | Kali/Windows pair + pfSense per trainee; EMP snapshot after each lab | POS-emulator VM for staff; role-based access via EMP group tags |
| L3 Adaptive-Orchestrator | EASA Part-66 gaps auto-select scenario fidelity | MITRE ATT&CK matrix scored; harder exploits unlocked on mastery | Sales-training rubric (Upsell, Cross-sell, Checkout) |
| Cross-cutting AI | In-situ voice overlay & multilingual transcript for mechanics | Auto-redaction of PII from chat logs | Real-time sentiment → avatar micro-expressions |
| Data-Twin gateway | OPC-UA from A320 hydraulic test-stand | Modbus/TCP honeypots + canary tokens | Inventory levels from ERP REST |
| Action | Manual (s) | SCSDT (s) | Speed-up |
| Template VM | 19.98 | 4.02 | 5.0× |
| Clone VM | 202.25 | 15.75 | 12.8× |
| Revert snapshot | 44.21 | 0.44 | 100× |
| Purge resources | 67.20 | 10.34 | 6.5× |
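
The speed-up column is the ratio of manual to automated time. The short sketch below recomputes it from the timings in the table. The dictionary layout and the `speed_up` helper are illustrative choices, not part of the original tooling.

```python
# (manual seconds, SCSDT seconds) per action, copied from the table above.
timings = {
    "Template VM": (19.98, 4.02),
    "Clone VM": (202.25, 15.75),
    "Revert snapshot": (44.21, 0.44),
    "Purge resources": (67.20, 10.34),
}

def speed_up(manual_s: float, automated_s: float) -> float:
    """Ratio of manual to automated duration; values above 1 mean automation is faster."""
    if automated_s <= 0:
        raise ValueError("automated duration must be positive")
    return manual_s / automated_s

for action, (manual_s, automated_s) in timings.items():
    print(f"{action}: {speed_up(manual_s, automated_s):.1f}x")
```
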
| Metric | MRO-Aero (mean ± 95-perc.) | Cyber-Factory |
| Down-link bitrate per user | 0.30 ± 0.45 MB s⁻¹ (≈2.4 Mb s⁻¹) | 0.25 ± 0.41 MB s⁻¹ |
| Up-link (voice + input events) | 0.015 MB s⁻¹ | 0.015 MB s⁻¹ |
| Packet loss tolerated | ≤0.5 % (FEC enabled) | Same as MRO-Aero |
| Motion-to-photon latency | 7.8 ms (desktop); 10.9 ms (HMD) | 8.2 ms; 11.3 ms |
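
The table mixes megabytes per second (MB s⁻¹) with megabits per second (Mb s⁻¹), so a small conversion helper makes the figures easier to compare. The sketch assumes decimal prefixes (1 MB = 10⁶ bytes) and compares the mean rates with the 500 kB s⁻¹ down-link target stated elsewhere in the paper. The helper name is hypothetical.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (1 byte = 8 bits)."""
    return mbytes_per_s * 8

# Down-link target from the paper: 500 kB/s, i.e. 0.5 MB/s with decimal prefixes.
CEILING_MBYTES_PER_S = 0.5

mean_downlink = {"MRO-Aero": 0.30, "Cyber-Factory": 0.25}  # MB/s, from the table

for scenario, rate in mean_downlink.items():
    share = rate / CEILING_MBYTES_PER_S
    print(f"{scenario}: {mbytes_to_mbits(rate):.1f} Mb/s, {share:.0%} of the target ceiling")
```
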
| # VMs cloned | Manual baseline* | EMP (ours) | Speed-up |
| 10 | 42 min | 3 min 55 s | 10.7 × |
| 25 | 102 min | 8 min 20 s | 12.2 × |
| 50 | 185 min | 14 min 26 s | 12.8 × |
| 100 | 372 min | 28 min 35 s | 13.0 × |
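
To see how provisioning scales with cohort size, the durations can be converted to seconds and expressed as a speed-up and an average cost per VM. The sketch below uses the values from the table. The helper names are illustrative, and the per-VM average is a derived figure, not one reported by the authors.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds pair to whole seconds."""
    return minutes * 60 + seconds

# VMs cloned -> (manual baseline, EMP), both in seconds, from the table above.
runs = {
    10: (to_seconds(42), to_seconds(3, 55)),
    25: (to_seconds(102), to_seconds(8, 20)),
    50: (to_seconds(185), to_seconds(14, 26)),
    100: (to_seconds(372), to_seconds(28, 35)),
}

def speed_up(manual_s: int, emp_s: int) -> float:
    """Manual duration divided by EMP duration."""
    return manual_s / emp_s

def per_vm_seconds(total_s: int, n_vms: int) -> float:
    """Average wall-clock seconds spent per cloned VM."""
    return total_s / n_vms

for n_vms, (manual_s, emp_s) in runs.items():
    print(f"{n_vms:>3} VMs: {speed_up(manual_s, emp_s):.1f}x, "
          f"{per_vm_seconds(emp_s, n_vms):.2f} s per VM with EMP")
```

The average EMP time per VM falls as the batch grows, which is consistent with the rising speed-up in the last column.
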
| Dimension | Local HMD (DTBT) | 2-D Cyber-Range (INSALATA / Alfons) | SCSDT (ours) |
| Entry hardware | VR-ready PC, RTX-3070 | Browser PC | Any browser device |
| Bandwidth per user | 5 – 8 Mb s⁻¹ | 1 Mb s⁻¹ (VNC) | 0.2 – 0.5 MB s⁻¹ (≈1.6 – 4 Mb s⁻¹; AV1 / H.264 adaptive) |
| Provisioning | Manual images | CLI scripts (e.g. YAML-based) | CLI + web-based, ≤15 min per cohort |
| Pedagogical immersion | High | Low | High (avatar, speech, IIoT-linked drills) |
| Energy per user | ≈ 125 W | ≈ 60 W | 42 W (GPU-sliced A16) |
| IIoT Integration | None | None | Yes (real-time OPC-UA, MQTT) |
| Streaming support | Local rendering only | None | AV1 / H.264 adaptive cloud streaming |
| Threat category | Potential issue | Impact on findings | Mitigation / future work |
| 4.1 Internal validity | Small cohort (n = 6). Individual differences might explain part of the competence gain. | May overestimate the learning effect. | Replicate with ≥30 learners across three institutions; random-split wait-list control. |
| | Learning-novelty (Hawthorne) effect. Initial excitement about VR could boost engagement. | Inflates Δ‖G‖₂. | Planned 8-week follow-up retention test; cross-over design (desktop ⇄ SCSDT). |
| | Instrumentation drift. Gap scores rely on instructor-scored DTBT tasks. | Systematic bias. | Dual-rater scheme (κ > 0.87); rubric frozen before experiment. |
| 4.2 Construct validity | Metric generality. Kabashkin’s competence-gap vector was devised for aviation MRO; cyber-factory skills differ. | May misrepresent mastery in other domains. | For each new domain we align tasks to Bloom + EASA/ISO outcomes; plan to develop a domain-agnostic “Metaverse Skill Taxonomy” (MST-24). |
| | Bandwidth ceiling. Our adaptive encoder tops out at 500 kB s⁻¹. Rare 4K multi-monitor users could exceed this. | Undetected performance degradation. | Stress-test with 8K + multi-display setups; add AV1 SVC layer. |
| 4.3 Conclusion validity | Simulated energy and latency values. A16 TDP (250 W) and <8 ms RTT are based on vendor specs + lab ping, not continuous telemetry. | Risk of optimistic eco-claims. | Deploy Prometheus + Grafana for week-long energy traces; repeat in a public cloud region. |
| | Assumption of normality. ANOVA & t-test assume normal residuals; n is too small for a robust Shapiro–Wilk test. | Inflated Type-I error rate if violated. | Bootstrapped (10 000 resamples) CI produced identical significance; will switch to non-parametric aligned-rank test in replication. |
| 4.4 External validity | Edge-colocated datacentre (<20 km). Latency may rise on inter-continental links or during ISP congestion spikes. | Unknown QoE for remote learners. | Scheduled test over Europe↔US (100 ms RTT) with CDN relay; QoE MOS instruments. |
| | Hardware heterogeneity. Results derive from NVIDIA A16; different GPUs (e.g., L4, AMD MI210) may change user density & power/W. | Limits generalisability to hyperscale clouds. | Parameterise GPU efficiency in the SCSDT cost-model; benchmark three SKUs in Q4-25. |
| | Narrow scenario coverage (MRO digital-twin & cyber-factory). Other domains (healthcare VR, e-commerce) may have stricter fidelity needs. | Transferability unclear. | Ongoing pilots: surgical-simulation (HUS Helsinki) and virtual retail store (NTT Data). Findings will be triangulated. |
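
The normality row mentions a bootstrapped confidence interval with 10 000 resamples. The sketch below shows a percentile bootstrap for the mean of paired differences, as one way such a check could be run. The six difference values are hypothetical placeholders, not the study's measurements, and the function name, seed and index convention are assumptions of this sketch.

```python
import random
import statistics

def bootstrap_ci(diffs, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of paired differences.

    Resamples the differences with replacement, records each resample mean,
    and returns the alpha/2 and 1 - alpha/2 percentiles of those means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical per-learner changes in gap norm (iteration 0 -> 5), n = 6.
diffs = [-0.41, -0.52, -0.44, -0.49, -0.43, -0.47]

lower, upper = bootstrap_ci(diffs)
print(f"95% bootstrap CI for the mean change: [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero points the same way as the parametric test, without relying on normally distributed residuals.
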
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).