Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Sign Language Recognition with Multimodal Sensors and Deep Learning Methods
Version 1: Received: 18 September 2023 / Approved: 20 September 2023 / Online: 21 September 2023 (08:33:08 CEST)
A peer-reviewed article of this Preprint also exists.
Lu, C.; Kozakai, M.; Jing, L. Sign Language Recognition with Multimodal Sensors and Deep Learning Methods. Electronics 2023, 12, 4827.
Abstract
Sign language recognition is essential for communication with hearing-impaired people, and it has become an important concern in computer vision, developing rapidly with progress in image recognition technology. However, recognition based on a general monocular camera suffers from occlusion and limited accuracy. In this research, we aim to improve accuracy by using a 2-axis bending sensor as an aid in addition to image recognition: we acquire hand keypoint information from sign language motions captured by a monocular RGB camera and add sensor data as an assist. Improving sign language recognition in this way requires proposing a new AI model. In addition, the amount of training data is small because we use our laboratory's original dataset. To learn from both sensor data and image data, we perform sign language recognition with MediaPipe, a CNN, and a BiLSTM. MediaPipe, provided by Google, estimates the skeleton of the hand and face; the CNN learns spatial information, and the BiLSTM learns time-series information, so combining the two yields higher recognition accuracy. We use these techniques to learn from hand skeleton information and sensor data, with the 2-axis bending-sensor glove data supporting the training of the AI model. By combining sensor data with hand skeleton data, our method outperforms recognition from skeletal information alone, achieving 96.5% Top-1 accuracy.
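The abstract describes a pipeline that extracts per-frame hand keypoints (e.g. via MediaPipe), fuses them with bending-sensor readings, and classifies the sequence with a CNN followed by a BiLSTM. The following is a minimal PyTorch sketch of that architecture, not the authors' actual model; the layer sizes, the 21-landmark hand layout, and the sensor dimension are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SignRecognizer(nn.Module):
    """Hypothetical sketch: a per-frame 1-D CNN over hand keypoints,
    fused with bending-sensor readings, followed by a BiLSTM."""

    def __init__(self, num_keypoints=21, sensor_dim=10, hidden=64, num_classes=20):
        super().__init__()
        # CNN extracts spatial features from each frame's keypoints
        self.cnn = nn.Sequential(
            nn.Conv1d(3, 32, kernel_size=3, padding=1),  # (x, y, z) channels
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                     # pool over keypoints
        )
        # BiLSTM learns temporal structure of the fused feature sequence
        self.bilstm = nn.LSTM(32 + sensor_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, keypoints, sensors):
        # keypoints: (batch, time, num_keypoints, 3); sensors: (batch, time, sensor_dim)
        b, t, k, c = keypoints.shape
        x = keypoints.reshape(b * t, k, c).transpose(1, 2)  # (b*t, 3, k)
        x = self.cnn(x).squeeze(-1).reshape(b, t, -1)       # (b, t, 32)
        x = torch.cat([x, sensors], dim=-1)                 # fuse sensor data
        out, _ = self.bilstm(x)
        return self.head(out[:, -1])                        # logits per class

model = SignRecognizer()
kp = torch.randn(2, 30, 21, 3)   # 2 clips, 30 frames, 21 hand landmarks
sn = torch.randn(2, 30, 10)      # matching bending-sensor sequence
logits = model(kp, sn)
print(logits.shape)              # (2, 20): one score per sign class
```

The key design point mirrored from the abstract is the division of labor: the CNN handles spatial information within a frame, while the BiLSTM handles the time-series dimension across frames, and sensor features are concatenated before the temporal stage.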
Keywords
sign language recognition; sensor fusion
Subject
Computer Science and Mathematics, Computer Vision and Graphics
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.