Preprint
Article

Bilingual Asr Model With Language Identification for Brazilian Portuguese and South-American Spanish

This version is not peer-reviewed.

Submitted:

28 October 2022

Posted:

31 October 2022

You are already at the latest version

A peer-reviewed article of this preprint also exists.

Abstract
This paper documents the development of a special case of multilingual Automatic Speech Recognition model, specifically tailored to attend two languages spoken by the majority of Latin America, Portuguese and Spanish. The bilingual model combines Language Identification and Speech Recognition developed with the Wav2Vec2.0 architecture and trained on several open and private speech datasets. In this model, the feature encoder is trained jointly for all tasks and different context encoders are trained for each task. The model is evaluated separately on two tasks: language identification and speech recognition. The results indicate that this model achieves good performance on speech recognition and average performance on language identification, training on a low quantity of speech material. The average accuracy of the language identification module on the MLS dataset is 66.75%. The average Word Error Rate in the same scenario is 13.89%, which is better than average 22.58% achieved by the commercial speech recognizer developed by Google.
Keywords: 
;  ;  ;  ;  
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Downloads

211

Views

126

Comments

0

Subscription

Notify me about updates to this article or when a peer-reviewed version is published.

Email

Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

© 2025 MDPI (Basel, Switzerland) unless otherwise stated