Submitted: 13 February 2024
Posted: 15 February 2024
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
- Sentence 1: Hey, have you heard the latest gist about the party next weekend? It’s gonna be lit!
- Sentence 2: Let’s schedule the meeting for October 10th at 2:30 PM.
- Sentence 3: I’ll meet you at the gas station; we can take the freeway to the shopping mall.
- Sentence 4: The project deadline is tomorrow, and I need to submit my resume to the recruiter.
3.1. Corpora
3.2. Toxicity
3.3. Automatic Speech Recognition
4. Results
5. Conclusions
6. Future Work

| Tool | Deepcut | Acrimony | ICE Spoken | ICE Written |
|---|---|---|---|---|
| ETOX | 2.08% | 3.35% | <1% | <1% |
| Evaluate | 1.30% | 2.16% | <1% | <1% |

| Model | Deepcut | ICE |
|---|---|---|
| Whisper Small | 123.5 | 93.8 |
| XLS-R | 230.8 | 39.9 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).