Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Enhancing Amharic Speech Recognition in Noisy Conditions through End-to-End Deep Learning

Version 1 : Received: 12 February 2024 / Approved: 13 February 2024 / Online: 13 February 2024 (14:26:24 CET)

How to cite: Ejigu, Y.A.; Asfaw, T.T. Enhancing Amharic Speech Recognition in Noisy Conditions through End-to-End Deep Learning. Preprints 2024, 2024020754. https://doi.org/10.20944/preprints202402.0754.v1

Abstract

Speech recognition, also known as automatic speech recognition (ASR), is a technology that enables software to transcribe spoken language into text. Existing Amharic ASR methods, however, rely on multiple separate components, such as language, acoustic, and pronunciation models with dictionaries, which are time-consuming to build and can limit performance. This study proposes an approach that replaces much of the speech pipeline with a single end-to-end neural network architecture. The proposed architecture is a hybrid that combines a convolutional neural network (CNN) with a recurrent neural network (RNN) and is trained with a connectionist temporal classification (CTC) loss function. We conducted several experiments on noisy audio data comprising 20,000 valid sentences. The model was evaluated using the word error rate (WER) metric and achieved a WER of 7% on noisy data. This approach reduces the human effort required to create pronunciation dictionaries and improves the efficiency and accuracy of ASR systems, making them more practical for real-world applications.
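For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words; a WER of 7% therefore corresponds to roughly 7 word errors per 100 reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: tokenization is a plain whitespace split,
    which is an assumption and may differ from the paper's setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For instance, comparing the reference "a b c d" with the hypothesis "a x c" involves one substitution and one deletion over four reference words, so the function returns 0.5. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.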

Keywords

Automatic speech recognition, Convolutional Neural Network, Connectionist Temporal Classification, End-to-End, Neural network, Noisy, Recurrent Neural Network, Subspace filtering, Spectral Subtraction

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
