Submitted:
14 November 2024
Posted:
15 November 2024
Abstract
The exponential growth of the e-commerce and logistics industries in recent years has underscored the need for more efficient and intelligent warehouse management systems (WMSs). Prevailing WMSs depend on manual input and handheld devices, which leads to inefficiency and human error. In this work, we propose a speech-recognition-based human-computer interaction model optimized for the warehouse environment. The model integrates adaptive-filter-based noise suppression, which dynamically detects and filters background noise such as forklift operation, background speech, and machinery. To address variability in pronunciation, speech rate, and accent across users, the system also incorporates a speech enhancement model based on a variational autoencoder (VAE). This model adaptively adjusts the input speech features and thereby improves the robustness of recognition. For natural language processing, we develop a natural language understanding module based on bidirectional encoder representations from transformers (BERT). The module semantically parses the user's spoken instructions and converts them into executable operation commands. Semantic slot filling lets the system automatically identify the key entities in a task and link them to operations on the backend WMS database. The experimental analysis shows that the proposed system is effective in a real warehouse scenario: compared with the traditional method, task completion speed and accuracy improve significantly.
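The slot-filling step described in the abstract can be illustrated with a minimal sketch. It assumes the BERT-based tagger has already produced one BIO label per token; the slot names (`quantity`, `item`, `location`), the `move_stock` intent, and the command dictionary layout are hypothetical choices made for this illustration and are not taken from the paper.

```python
# Hypothetical mapping from an intent to the slots it needs before it can run.
REQUIRED_SLOTS = {"move_stock": ["quantity", "item", "location"]}


def decode_bio_slots(tokens, tags):
    """Group BIO-tagged tokens into a {slot_name: value} dictionary."""
    slots = {}
    name, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new slot begins; store the one that was open, if any.
            if name is not None:
                slots[name] = " ".join(words)
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            # Continuation of the currently open slot.
            words.append(token)
        else:
            # "O" or an inconsistent I- tag closes the open slot.
            if name is not None:
                slots[name] = " ".join(words)
            name, words = None, []
    if name is not None:
        slots[name] = " ".join(words)
    return slots


def build_command(intent, slots):
    """Turn an intent and its slots into a command for a WMS backend."""
    required = REQUIRED_SLOTS[intent]
    missing = [s for s in required if s not in slots]
    if missing:
        # Not enough information: the system would ask the user again.
        return {"status": "needs_clarification", "missing": missing}
    return {
        "status": "ready",
        "action": intent,
        "params": {s: slots[s] for s in required},
    }


# Example utterance with labels as a tagger might emit them (assumed values).
tokens = ["move", "twelve", "boxes", "of", "item", "A17", "to", "shelf", "B3"]
tags = ["O", "B-quantity", "O", "O", "O", "B-item", "O", "B-location", "I-location"]

command = build_command("move_stock", decode_bio_slots(tokens, tags))
print(command)
```

Keeping the decoding step separate from command construction means an utterance with a missing entity yields a clarification request rather than an incomplete database operation.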
Keywords:
Introduction
Related Work
Methodologies
Experimental
Experimental Setup
Experimental Analysis
Conclusions
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).