Preprint, Brief Report, Version 2. Preserved in Portico. This version is not peer-reviewed.

Precision Location Keyword Detection Using Offline Speech Recognition Technique

Version 1 : Received: 6 October 2023 / Approved: 11 October 2023 / Online: 11 October 2023 (06:53:39 CEST)
Version 2 : Received: 6 November 2023 / Approved: 6 November 2023 / Online: 7 November 2023 (02:34:57 CET)

How to cite: Imam, M.; Gupta, G. Precision Location Keyword Detection Using Offline Speech Recognition Technique. Preprints 2023, 2023100690. https://doi.org/10.20944/preprints202310.0690.v2

Abstract

This study introduces an original, comprehensive system for identifying specific spoken terms that indicate a user's position, particularly the discrete values representing latitude and longitude. The system not only detects these terms but also retrieves the corresponding numerical data, so that locations can be determined accurately and efficiently. The work is applicable to various fields, notably in aiding the offline operations of military personnel, who often lack internet access. In such scenarios, precise awareness of location is vital for strategic manoeuvres, rescue operations, and navigating unfamiliar landscapes. The system enables these personnel to extract exact location coordinates from spoken terms, thereby enhancing their awareness even in challenging surroundings. Apart from its military utility, the project holds broader significance. Emergency response teams, disaster management personnel, and exploratory missions can all benefit from this technology during disruptions to communication infrastructure. Furthermore, travelers, adventurers, and outdoor enthusiasts can use the system to accurately determine their positions in remote areas without relying on online maps. We used offline speech recognition techniques to precisely transcribe spoken terms, achieving an accuracy of over 91.3% and a word error rate of 4.2%. For speech recognition, the OpenAI Whisper model was used. A conversion process from SpeechRecognition to AudioSegmentation was implemented, followed by transforming the audio into .wav format, to ensure seamless compatibility with the Whisper model and uninterrupted audio input. We also developed an interface for the app using Streamlit so that it can be used efficiently. Because the system is trained to identify specific linguistic patterns linked to location, it achieves robust detection and extraction of relevant terms.
This approach eliminates the need for constant internet connectivity, making it exceptionally useful in remote, offline, and resource-limited situations.
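
The keyword detection and value retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_coordinates`, the regular expressions, and the range limits are all assumptions made for illustration. The sketch assumes the transcript is already available as text (for example, from an offline Whisper transcription), that numbers are transcribed as digits, and that each keyword is followed by its own value before the other keyword appears.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for a signed decimal number such as 28.6139 or -77.2
_NUMBER = r"(-?\d+(?:\.\d+)?)"

# Assumed keyword patterns: the keyword, any non-digit filler words,
# then the first number that follows.
_PATTERNS = {
    "latitude": re.compile(r"\blatitude\b\D*?" + _NUMBER, re.IGNORECASE),
    "longitude": re.compile(r"\blongitude\b\D*?" + _NUMBER, re.IGNORECASE),
}

# Valid absolute ranges for geographic coordinates in decimal degrees.
_LIMITS = {"latitude": 90.0, "longitude": 180.0}


def extract_coordinates(transcript: str) -> Dict[str, Optional[float]]:
    """Return the first latitude/longitude values found in a transcript.

    A value is None when its keyword is missing or the number is out of range.
    """
    result: Dict[str, Optional[float]] = {}
    for name, pattern in _PATTERNS.items():
        match = pattern.search(transcript)
        value = float(match.group(1)) if match else None
        # Discard values that cannot be valid coordinates.
        if value is not None and abs(value) > _LIMITS[name]:
            value = None
        result[name] = value
    return result


if __name__ == "__main__":
    text = "My latitude is 28.6139 and longitude is 77.2090"
    print(extract_coordinates(text))
```

A pattern-based extractor like this one is only a stand-in for whatever trained detection the system uses. It would need extending to handle spoken-number transcripts ("twenty eight point six") or hemisphere words such as "north" and "west".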

Keywords

Keyword Detection; Audio Models; Speech Processing

Subject

Computer Science and Mathematics, Other

Comments (1)

Comment 1
Received: 7 November 2023
Commenter: Mohsin Imam
Commenter's Conflict of Interests: Author
Comment: Updated abstract