Gionfrida, L.; Rusli, W.M.R.; Kedgley, A.E.; Bharath, A.A. A 3DCNN-LSTM Multi-Class Temporal Segmentation for Hand Gesture Recognition. Electronics 2022, 11, 2427.
Abstract
This paper introduces a multi-class hand gesture recognition model developed to identify a defined set of hand gesture sequences in two-dimensional RGB video recordings. The work presents an action detection classifier that considers both the appearance and the spatiotemporal parameters of consecutive frames. The classifier combines a convolutional network with a long short-term memory (LSTM) unit. To mitigate the need for a large-scale dataset, the model is first trained on an available dataset and then fine-tuned on the hand gestures of relevance through transfer learning. Validation curves computed with a batch size of 64 indicate an accuracy of 93.95% (± 0.37) and a mean Jaccard index of 0.812 (± 0.105) for 22 participants. The presented model illustrates the possibility of training a model with a small set of data (113,410 fully labelled frames). The proposed pipeline embraces a compact architecture that could facilitate its adoption.
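The 3DCNN-LSTM pairing summarized above, in which 3D convolutions extract short-range spatiotemporal features that an LSTM then integrates over the frame sequence, can be sketched as follows. This is a minimal illustrative sketch in PyTorch; all layer sizes, class counts, and hyperparameters here are assumptions for demonstration, not the architecture actually used in the paper.

```python
import torch
import torch.nn as nn


class CNN3DLSTM(nn.Module):
    """Illustrative 3D-CNN + LSTM gesture classifier (hypothetical sizes)."""

    def __init__(self, num_classes: int = 4, hidden: int = 64):
        super().__init__()
        # 3D convolution captures appearance plus short-range motion
        # across neighbouring frames of an RGB clip.
        self.conv = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            # Pool spatial dims to 4x4 while keeping the temporal dim intact.
            nn.AdaptiveAvgPool3d((None, 4, 4)),
        )
        # LSTM models longer-range temporal structure over the frame sequence.
        self.lstm = nn.LSTM(16 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        feats = self.conv(clip)                       # (B, 16, T, 4, 4)
        b, c, t, h, w = feats.shape
        # Flatten per-frame features into a sequence for the LSTM.
        seq = feats.permute(0, 2, 1, 3, 4).reshape(b, t, c * h * w)
        out, _ = self.lstm(seq)                       # (B, T, hidden)
        return self.head(out[:, -1])                  # classify final state


model = CNN3DLSTM()
logits = model(torch.randn(2, 3, 8, 32, 32))  # two 8-frame RGB clips
```

In this sketch the transfer-learning step described in the abstract would amount to loading pretrained weights into `conv` and fine-tuning `lstm` and `head` on the gesture classes of interest.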
Keywords
hand gesture classification; transfer learning; three-dimensional convolutional; LSTM network
Subject
Computer Science and Mathematics, Computer Science
Copyright:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.