Version 1: Received: 11 December 2021 / Approved: 14 December 2021 / Online: 14 December 2021 (15:09:15 CET)
How to cite:
ZHANG, Y.; XIAO, Q.; CHU, C.; XING, H. Multi-modal Data Fusion Method for Human Behavior Recognition Based on Two IA-Net and CHMM. Preprints 2021, 2021120244. https://doi.org/10.20944/preprints202112.0244.v1
APA Style
ZHANG, Y., XIAO, Q., CHU, C., & XING, H. (2021). Multi-modal Data Fusion Method for Human Behavior Recognition Based on Two IA-Net and CHMM. Preprints. https://doi.org/10.20944/preprints202112.0244.v1
Chicago/Turabian Style
ZHANG, Y., Q. XIAO, Chaoqin CHU, and Heng XING. 2021. "Multi-modal Data Fusion Method for Human Behavior Recognition Based on Two IA-Net and CHMM." Preprints. https://doi.org/10.20944/preprints202112.0244.v1
Abstract
The proposed multi-modal data fusion method based on IA-Net and CHMM is designed to address the problem that incomplete target behavior information in complex family environments leads to low accuracy in human behavior recognition. Two improved neural networks (STA-ResNet50, STA-GoogleNet) are each combined with an LSTM to form two IA-Nets, which extract RGB and skeleton modal behavior features from video. The two modal feature sequences are then input to a CHMM to construct a probability fusion model for multi-modal behavior recognition. Experimental results show that the human behavior recognition model proposed in this paper achieves higher accuracy than previous fusion methods on the HMDB51 and UCF101 datasets. New contributions: an attention mechanism is introduced to improve the efficiency of video target feature extraction and utilization; a skeleton-based feature extraction framework is proposed that can be used for human behavior recognition in complex environments; and probability theory and neural networks are combined in the field of human behavior recognition, providing a new method for multi-modal information fusion.
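The CHMM fusion step described in the abstract can be sketched as a forward recursion over the joint state space of two coupled hidden Markov chains, one per modality, where each chain's transition conditions on both chains' previous states. This is a minimal illustrative sketch, not the authors' implementation: all parameters are random stand-ins, and the per-frame observation likelihoods `B1` and `B2` play the role of the RGB and skeleton feature streams produced by the two IA-Nets.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(a, axis=None):
    """Scale an array so it sums to 1 (optionally along one axis)."""
    return a / a.sum(axis=axis, keepdims=axis is not None)

N = 3   # hidden states per chain (illustrative)
T = 5   # number of frames (illustrative)

# Coupled transitions: each chain's next state depends on BOTH chains'
# previous states. A1[i, j, k] = P(s1_t = k | s1_{t-1} = i, s2_{t-1} = j).
A1 = normalize(rng.random((N, N, N)), axis=2)
A2 = normalize(rng.random((N, N, N)), axis=2)
pi = normalize(rng.random((N, N)))   # joint initial state distribution

# Per-frame observation likelihoods for the two modal streams
# (stand-ins for the IA-Net feature sequences).
B1 = rng.random((T, N))   # RGB stream
B2 = rng.random((T, N))   # skeleton stream

def chmm_forward(pi, A1, A2, B1, B2):
    """Forward algorithm over the joint state space of a two-chain CHMM;
    returns the total likelihood of both observation streams."""
    alpha = pi * np.outer(B1[0], B2[0])   # alpha[i, j] at t = 0
    for t in range(1, len(B1)):
        # Sum over all previous joint states (i, j) for each next pair (k, l).
        trans = np.einsum('ij,ijk,ijl->kl', alpha, A1, A2)
        alpha = trans * np.outer(B1[t], B2[t])
    return alpha.sum()

likelihood = chmm_forward(pi, A1, A2, B1, B2)
print(likelihood > 0)  # True: a valid (positive) joint likelihood
```

Classification with such a model would fit one CHMM per behavior class and pick the class whose model assigns the highest joint likelihood to the two feature streams.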
Keywords
Attention mechanism; CHMM; LSTM; Multi-modal fusion; Human behavior recognition
Subject
Computer Science and Mathematics, Information Systems
Copyright:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.