Submitted: 08 October 2025
Posted: 09 October 2025
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Learning and Representation for Multimodal Systems
2.2. Causal and Reinforcement Learning Foundations
2.3. Transformer and Deep Learning Advances
2.4. Explainable and Robust AI Systems
2.5. System Service and Logging Optimization in AR/VR
3. Problem Definition
3.1. Multimodal Data Space
- Log sequence: an ordered sequence of structured log events, each a template ID or semantic embedding generated by a system module.
- Performance metrics: a time series in which each sample holds K numerical indicators such as CPU load, GPU temperature, FPS, or latency.
- Sensor/state data: time-stamped IMU readings, accelerometer and gyroscope signals, or environment tracking signals.
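The three modalities listed above can be pictured as one container per observation window. The following is a minimal sketch, not the paper's data format: the class name `MultimodalWindow`, its field names, the module names, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MultimodalWindow:
    """One observation window over the three modalities (hypothetical container)."""

    log_events: List[int]           # log template IDs, in order of emission
    log_modules: List[str]          # module that emitted each log event
    metrics: List[Sequence[float]]  # one row of K indicators per time step
    sensors: List[Sequence[float]]  # one row of sensor/state readings per time step

    def __post_init__(self) -> None:
        # Every log event must be attributed to exactly one module.
        if len(self.log_events) != len(self.log_modules):
            raise ValueError("each log event needs exactly one source module")
        # All metric rows must share the same number of indicators K.
        if len({len(row) for row in self.metrics}) > 1:
            raise ValueError("every metrics row must have the same K indicators")

    @property
    def num_indicators(self) -> int:
        """K, the number of numerical indicators per time step (0 if empty)."""
        return len(self.metrics[0]) if self.metrics else 0


# Illustrative values only: CPU load, GPU temperature, FPS, latency per row.
window = MultimodalWindow(
    log_events=[12, 12, 47],
    log_modules=["tracking", "tracking", "renderer"],
    metrics=[[0.62, 71.0, 72.0, 11.5], [0.91, 78.5, 48.0, 24.0]],
    sensors=[[0.01, -0.02, 9.81], [0.40, 0.10, 9.60]],
)
```

Keeping the modalities as separate fields, rather than one flat feature vector, reflects that they differ in sampling rate and semantics and are encoded separately later in the pipeline.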
3.2. Root Cause Prediction Objective
3.3. Challenges
- High heterogeneity: Logs, metrics, and sensors differ in scale, sampling frequency, and semantics.
- Weak supervision: Failure annotations are sparse and costly to obtain.
- Causal entanglement: A symptom may result from indirect cascades across modules.
- Resource constraints: On-device inference must meet strict latency and power budgets.
4. Methodology
4.1. System Overview
- Preprocessing and Temporal Alignment
- Modality-specific Encoding
- Reliability-Aware Multimodal Fusion
- Causal Graph Learning
- Root Cause Propagation and Prediction
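To make the first stage in the list above concrete, the sketch below aligns streams sampled at different rates onto a shared time grid. It assumes a zero-order hold (carry the latest sample forward), which is one common alignment choice and not necessarily the one used in this work; the function name `align_to_grid` and the sample values are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def align_to_grid(
    timestamps: Sequence[float],
    values: Sequence[float],
    grid: Sequence[float],
) -> List[Optional[float]]:
    """Zero-order hold: for each grid time, take the latest sample at or before it.

    `timestamps` must be sorted ascending. Grid points earlier than the
    first sample map to None.
    """
    aligned: List[Optional[float]] = []
    for t in grid:
        # Index of the last sample whose timestamp is <= t, or -1 if none.
        idx = bisect_right(timestamps, t) - 1
        aligned.append(values[idx] if idx >= 0 else None)
    return aligned


# Illustrative streams: FPS sampled every 0.5 s, GPU temperature sampled irregularly.
grid = [0.0, 0.5, 1.0, 1.5]
fps = align_to_grid([0.0, 0.5, 1.0, 1.5], [72.0, 71.0, 48.0, 45.0], grid)
gpu_temp = align_to_grid([0.2, 1.2], [70.0, 78.5], grid)

print(fps)       # [72.0, 71.0, 48.0, 45.0]
print(gpu_temp)  # [None, 70.0, 70.0, 78.5]
```

Carrying values forward never uses a sample from the future, which matters when the aligned windows feed a model that must run online on the device.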
4.2. Step 1: Data Preprocessing and Alignment
4.3. Step 2: Modality-Specific Encoding
4.4. Step 3: Reliability-Aware Multimodal Fusion
4.5. Step 4: Causal Graph Learning
4.6. Step 5: Propagation-Based Root Cause Estimation
4.7. Algorithm
Algorithm 1: Learning-Based Multimodal Root Cause Prediction
4.8. Learning Objective

5. Experiments
5.1. Setup
5.2. Baselines
6. Discussion
6.1. Generalization and Adaptation
6.2. Industrial Relevance
7. Conclusions

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).