Submitted: 19 April 2025
Posted: 21 April 2025
Abstract
Keywords:
1. Introduction
1.1. Background and Motivation
1.2. Research Challenges in Video Anomaly Detection
1.3. Contribution and Innovation
2. Related Work
2.1. Traditional Video Anomaly Detection Methods
2.2. Deep Learning-Based Methods
2.3. Video Analysis with Attention Mechanisms
2.4. Spatio-Temporal Feature Learning
3. Enhanced Spatio-Temporal Attention Framework
3.1. Framework Overview
3.2. Spatial Attention Module Design
3.3. Temporal Attention Module Design
3.4. Feature Fusion and Enhancement Strategy
3.5. Loss Function Design

4. Experiments and Analysis
4.1. Evaluation Metrics

4.2. Comparison with State-of-the-art Methods

4.3. Ablation Studies

4.4. Visual Analysis and Case Studies
5. Conclusions
5.1. Summary of Contributions
5.2. Limitations and Future Directions
6. Acknowledgment

| Architecture | Model Size (M) | AUC Score | Training Time (h) | Inference Speed |
|---|---|---|---|---|
| ConvLSTM-AE | 42.1 | 77.0% | 24 | 45 fps |
| C3D | 78.4 | 85.2% | 36 | 38 fps |
| I3D | 28.0 | 87.3% | 28 | 42 fps |
| RTFM | 24.7 | 84.3% | 30 | 40 fps |

| Attention Type | Parameter Count | Memory Footprint | Attention Score | Detection Accuracy |
|---|---|---|---|---|
| Self-Attention | 1.2M | 2.4 GB | 0.85 | 88.6% |
| Channel Attention | 0.8M | 1.8 GB | 0.82 | 86.4% |
| Spatial Attention | 1.0M | 2.1 GB | 0.84 | 87.8% |
| Temporal Attention | 1.4M | 2.6 GB | 0.87 | 89.2% |

| Layer | Attention Heads | Feature Channels | Receptive Field | Memory Usage |
|---|---|---|---|---|
| SA-1 | 8 | 256 | 7×7 | 0.4 GB |
| SA-2 | 16 | 512 | 5×5 | 0.8 GB |
| SA-3 | 32 | 1024 | 3×3 | 1.6 GB |
| SA-Fusion | 4 | 2048 | Global | 0.2 GB |
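
As a quick reading aid for the spatial attention configuration above, the sketch below derives the per-head channel width of each layer. It assumes, as in standard multi-head attention, that the feature channels are split evenly across the heads; the table does not state this, and the names `SPATIAL_LAYERS` and `head_dim` are illustrative only.

```python
# (attention heads, feature channels) copied from the spatial attention table.
SPATIAL_LAYERS = {
    "SA-1": (8, 256),
    "SA-2": (16, 512),
    "SA-3": (32, 1024),
    "SA-Fusion": (4, 2048),
}


def head_dim(heads: int, channels: int) -> int:
    """Channels per head, ASSUMING an even split across heads."""
    if channels % heads != 0:
        raise ValueError("channels must be divisible by the number of heads")
    return channels // heads


HEAD_DIMS = {name: head_dim(h, c) for name, (h, c) in SPATIAL_LAYERS.items()}

if __name__ == "__main__":
    for name, dim in HEAD_DIMS.items():
        print(f"{name}: {dim} channels per head")
```

Under that assumption, SA-1 through SA-3 all operate with 32 channels per head, while SA-Fusion uses 512 channels per head.
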

| Component | Temporal Range | Sampling Rate | Attention Type | Computation Cost |
|---|---|---|---|---|
| Short-term | 8 frames | 1 | Local | 0.3 GFLOPS |
| Mid-term | 16 frames | 2 | Regional | 0.6 GFLOPS |
| Long-term | 32 frames | 4 | Global | 1.2 GFLOPS |
| Temporal Fusion | Full sequence | Adaptive | Hierarchical | 0.4 GFLOPS |
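
One possible reading of the temporal configuration above is that each branch covers its temporal range ending at the current frame and keeps every `sampling_rate`-th frame. The sketch below builds the frame indices under that assumption. The class `TemporalBranch`, the function `branch_indices`, and the clamping of early windows to frame 0 are hypothetical choices, not details taken from the paper, and the adaptive temporal fusion row is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalBranch:
    name: str
    temporal_range: int  # frames spanned by the window (from the table)
    sampling_rate: int   # keep every sampling_rate-th frame (from the table)


BRANCHES = (
    TemporalBranch("short-term", 8, 1),
    TemporalBranch("mid-term", 16, 2),
    TemporalBranch("long-term", 32, 4),
)


def branch_indices(last_frame: int, branch: TemporalBranch) -> list:
    """Frame indices of a window that ends at last_frame (inclusive).

    Frames are picked backwards from last_frame in steps of sampling_rate,
    then returned in chronological order. Indices before the start of the
    video are clamped to 0 (an assumed padding strategy).
    """
    n_samples = branch.temporal_range // branch.sampling_rate
    offsets = range(n_samples - 1, -1, -1)
    return [max(last_frame - k * branch.sampling_rate, 0) for k in offsets]


if __name__ == "__main__":
    for branch in BRANCHES:
        print(branch.name, branch_indices(31, branch))
```

With this reading, all three branches yield 8 sampled frames (8/1, 16/2, 32/4), so their outputs would have matching sequence lengths before fusion.
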

| Method | UCF-Crime AUC | CUHK Avenue AUC | ShanghaiTech AUC | UCSD Ped2 AUC |
|---|---|---|---|---|
| MPPCA | 50.0% | N/A | N/A | 69.3% |
| RTFM | 84.3% | 85.8% | 85.3% | 88.1% |
| Proposed | 89.7% | 87.6% | 88.2% | 92.4% |
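
The comparison above can be summarized as absolute AUC gains of the proposed method over RTFM, the only baseline reported on all four datasets. The short script below recomputes those gains in percentage points from the table values; the variable and function names are illustrative.

```python
# AUC values (%) copied from the comparison table.
AUC = {
    "RTFM": {
        "UCF-Crime": 84.3,
        "CUHK Avenue": 85.8,
        "ShanghaiTech": 85.3,
        "UCSD Ped2": 88.1,
    },
    "Proposed": {
        "UCF-Crime": 89.7,
        "CUHK Avenue": 87.6,
        "ShanghaiTech": 88.2,
        "UCSD Ped2": 92.4,
    },
}


def gains_over(baseline: str, method: str = "Proposed") -> dict:
    """Absolute AUC gain (percentage points) of method over baseline."""
    return {
        dataset: round(AUC[method][dataset] - AUC[baseline][dataset], 1)
        for dataset in AUC[method]
    }


GAINS = gains_over("RTFM")
MEAN_GAIN = round(sum(GAINS.values()) / len(GAINS), 1)

if __name__ == "__main__":
    for dataset, gain in GAINS.items():
        print(f"{dataset}: +{gain} points")
    print(f"mean: +{MEAN_GAIN} points")
```

The gains range from 1.8 points on CUHK Avenue to 5.4 points on UCF-Crime, with a mean of 3.6 points across the four datasets.
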
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).