Metadata Analysis of the Generative AI Usefulness for African Languages

Yohanna Joseph Waliya; Margaret Mary Okon

doi:10.20944/preprints202604.0360.v1

Submitted: 03 April 2026

Posted: 06 April 2026


Abstract

Natural language processing (NLP) enables computers to comprehend and engage with human languages, and it now underpins much of human-computer interaction (HCI) through large language models (LLMs) and multilingual pre-trained language models (mPLMs). These LLMs have been adopted widely across the globe. However, numerous researchers have observed a significant gap in their capacity to recognize some low-resource African languages. This paper contributes to that discourse by conducting a comprehensive metadata analysis of existing African language models, with the aim of outlining their importance, strengths, and weaknesses. In doing so, it seeks both to underscore current limitations and to provide insights and recommendations for future research in language recognition, particularly for African languages, and thereby to catalyse advancements that promote inclusivity and a more nuanced understanding of linguistic diversity within NLP. Multilingual testing shall be used on Cheetah to evaluate the model's proficiency in multiple languages, including less widely spoken ones, and to identify any language-specific weaknesses or limitations of the LLMs, especially in recognizing and understanding Margi, spoken in the North-East geo-political zone of Nigeria, and Ibibio, spoken in the South-South geo-political zone of Nigeria.

Keywords: natural language processing (NLP); large language models (LLMs); Cheetah; Ibibio; Margi; Nigeria; Africa

Subject: Arts and Humanities - Other

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
