Submitted: 03 September 2024
Posted: 03 September 2024
Abstract
Keywords:
1. Introduction
- We propose a novel hybrid approach that integrates Africa-centric multilingual PLMs with XAI techniques. This integration allows us to apply the sentiment analysis capabilities of Afro-centric PLMs while simultaneously incorporating XAI methods to explain their decision-making processes for improved transparency and trust.
- Our approach utilises fine-tuned benchmark Africa-centric PLMs specifically designed for African languages. This choice capitalises on their understanding of linguistic nuances in these languages, potentially leading to superior sentiment analysis performance compared to mainstream PLMs.
- By incorporating attention mechanisms and visualization techniques, we enhance the transparency of the Africa-centric sentiment analysis model. This allows users to understand which parts of the input text the model focuses on when making sentiment predictions, fostering trust in its decision-making process.
- We demonstrate that incorporating LIME and SHAP techniques into the sentiment classifier’s output enhances the model’s interpretability and explainability.
- We also show that, by leveraging XAI strategies, the study ensures that the model’s sentiment predictions are interpretable and understandable. Furthermore, the feedback survey shows that most participants agreed with the models’ outputs and the XAI explanations.
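As a simplified illustration of the perturbation idea behind LIME, the sketch below attributes a weight to each word by removing it and measuring the change in the classifier’s output (a leave-one-out simplification of LIME’s sampled perturbations). The lexicon-based `predict_positive` function is a toy stand-in for the fine-tuned PLM, not part of the study:

```python
# Toy stand-in for a fine-tuned PLM classifier: returns P(positive).
def predict_positive(text: str) -> float:
    lexicon = {"good": 0.4, "happy": 0.3, "bad": -0.4, "sad": -0.3}
    score = sum(lexicon.get(w, 0.0) for w in text.lower().split())
    return min(max(0.5 + score, 0.0), 1.0)

def perturbation_attributions(text: str) -> dict:
    """LIME-style local explanation, simplified to leave-one-out:
    a word's weight is the drop in P(positive) when it is removed.
    (Duplicate words keep the last occurrence's weight.)"""
    words = text.split()
    base = predict_positive(text)
    attributions = {}
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        attributions[w] = base - predict_positive(perturbed)
    return attributions

attr = perturbation_attributions("good service but sad ending")
# Positive weights push the prediction toward positive sentiment.
```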
2. Related Studies
2.1. Africa-Centric Pre-Trained Language Models
2.2. Explainable AI for Sentiment Analysis
3. Datasets
- SAfriSenti Corpus—a Twitter-based sentiment corpus developed for South African low-resourced languages in a multilingual context [19]. It is, to date, the largest sentiment corpus (over 40,000 tweets) for South African languages such as Sepedi, Setswana, and Sesotho, alongside English. The SAfriSenti corpus was manually annotated by experts and native speakers. Each tweet in the corpus underwent data cleaning and preprocessing steps in which noise, URLs, and meaningless tweets were removed [35]. We replaced all @mentions with @user for data protection purposes [36]. The corpus consists of 64% monolingual tweets and 36% multilingual tweets, including code-switched tweets. We used Krippendorff’s Alpha to measure inter-annotator reliability, which reached an acceptable score.
- Additional Dataset—We used the Twitter API to collect this dataset, following the approach presented in [35]. It contains over 50,000 tweets in the isiZulu and isiXhosa languages. To automate sentiment labelling, we leveraged a semi-automatic distant supervision approach that uses sentiment-bearing emojis and word-based sentiment lexicons, as detailed in [19]. In addition, three native speakers manually double-checked and annotated less than 24% of the dataset in each language, following [19]. isiXhosa tweets make up 35.74% of the corpus, while isiZulu tweets account for 64.25%. The dataset also includes code-switched English words.
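The cleaning steps described above (URL removal and @mention anonymisation) can be sketched as follows. The regular expressions are illustrative, not the exact patterns used in the study:

```python
import re

def preprocess_tweet(tweet: str) -> str:
    """Cleaning steps described for the corpora: strip URLs and
    replace @mentions with the placeholder @user for data protection."""
    tweet = re.sub(r"https?://\S+|www\.\S+", "", tweet)   # remove URLs
    tweet = re.sub(r"@\w+", "@user", tweet)               # anonymise mentions
    return re.sub(r"\s+", " ", tweet).strip()             # collapse whitespace

cleaned = preprocess_tweet("@Thabo ke leboga https://t.co/xyz")
# → "@user ke leboga"
```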
4. Proposed Methods for XAI for Sentiment Analysis
4.1. Overview of XAI for Sentiment Analysis
4.2. Model Descriptions
- mBERT: Multilingual BERT is a multilingual version of BERT pre-trained on the top 104 languages with the largest Wikipedias, using a masked language modelling (MLM) objective and a next-sentence prediction task [21]. We fine-tune the bert-base-multilingual-cased model (172M parameters) by adding a linear classification layer on top of the pre-trained transformer model.
- XLM-R (XLM-RoBERTa): XLM-R is a powerful multilingual model trained with the MLM objective on a massive corpus of filtered CommonCrawl text covering roughly 100 languages [23].
- AfroLM: AfroLM is a multilingual language model pre-trained from scratch on African languages using a novel self-active learning framework [18]. It is the only available SOTA Transformer model pre-trained on 23 African languages, including Setswana, isiXhosa, and isiZulu, which we target in this work. AfroLM stands out for its efficiency: although trained on a considerably smaller dataset than existing models, it still surpasses many multilingual PLMs on various NLP tasks.
- Afro-XLMR: Researchers created Afro-XLMR by adapting the XLM-R large model via MLM (masked language modelling) on 17 African languages, covering languages such as Sesotho, isiXhosa, and isiZulu, along with 3 high-resource languages, including English [17]. Afro-XLMR uses multilingual adaptive fine-tuning, which enables multilingual adaptation while preserving downstream performance on both high- and low-resourced languages. We are motivated to use this model because the authors report that it can be easily adapted to a wide range of other African languages, including those with limited linguistic resources.
- SERENGETI: SERENGETI is the largest African MPLM, pre-trained on 42 GB of multi-domain data drawn from religious texts, news, government documents, health documents, and existing Wikipedia corpora [16]. The pretraining data covers 517 African languages plus the 10 most widely spoken languages globally. The model was pre-trained with both an ELECTRA-style [37] and an XLM-R-style architecture; the ELECTRA variant uses the multilingual replaced token detection (MRTD) objective during training. The model has 12 layers and 12 attention heads. SERENGETI has significantly outperformed AfriBERTa, XLM-R, mBERT, and Afro-XLMR on several NLP tasks.
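Across all five models, fine-tuning amounts to adding a linear classification layer over the pooled transformer representation, followed by a softmax over the sentiment labels. A minimal sketch of that head is below; the 4-dimensional vector and hand-picked weights are toy stand-ins for the 768-dimensional hidden state and the parameters learned during fine-tuning:

```python
import math

# Sentiment labels used throughout the paper's datasets.
LABELS = ["negative", "neutral", "positive"]

def softmax(logits):
    # Numerically stable softmax over the label logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(pooled, weights, bias):
    """Linear classification head: logits[k] = w_k · pooled + b_k."""
    logits = [sum(w * x for w, x in zip(weights[k], pooled)) + bias[k]
              for k in range(len(LABELS))]
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs

# Illustrative parameters (in practice learned during fine-tuning).
W = [[-0.9, 0.1, 0.0, 0.2],   # negative
     [0.0, 0.5, -0.1, 0.0],   # neutral
     [0.8, -0.2, 0.3, 0.1]]   # positive
b = [0.0, 0.0, 0.0]

label, probs = classify([0.7, -0.1, 0.4, 0.2], W, b)
```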
4.3. Explainability Methods
5. Experimental Results and Explanations
5.1. Experimental Setup
5.2. Performance Results
- True Positives (TP): the number of tweets correctly labelled as having positive sentiment.
- True Negatives (TN): the number of tweets correctly labelled as not having positive sentiment.
- False Positives (FP): the number of tweets incorrectly labelled as having positive sentiment.
- False Negatives (FN): the number of tweets incorrectly labelled as not having positive sentiment.
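The standard evaluation metrics follow directly from these counts. A small helper, with illustrative counts:

```python
def precision_recall_f1(tp, tn, fp, fn):
    """Derive precision, recall, F1, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

# Illustrative counts, not results from the paper.
p, r, f1, acc = precision_recall_f1(tp=80, tn=70, fp=20, fn=30)
# precision = 0.80, recall ≈ 0.727, accuracy = 0.75
```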
5.3. Attention Explanations
5.4. Sentiment Decision Explanations
5.5. Human Assessment
5.5.1. Trustworthiness
- How confident do you feel in the accuracy of the sentiment analysis results provided by the XAI system?
- How valuable do you find the explanations in helping you trust and interpret the sentiment analysis results?
- Did the XAI system’s explanations positively influence any changes in your decision-making based on the sentiment analysis results?
- How likely are you to use the XAI system for future sentiment analysis tasks, considering the explanations it provides?
5.5.2. Transparency
- Were there any challenges or difficulties you encountered while trying to interpret the XAI system’s explanations of sentiment?
- Did the explanations provided by the XAI system help you understand why certain sentiments were identified in the text?
5.5.3. Reliability
- How confident do you feel in the accuracy of the sentiment analysis results provided by the XAI system?
- Were there any specific instances where you disagreed with the sentiment assigned by the XAI system, despite its explanations?
5.5.4. Interpretability
- Did the explanations provided by the XAI system help you understand why certain sentiments were identified in the text?
- Were there any instances where the XAI system’s explanation of sentiment did not align with your interpretation?
- Were there any challenges or difficulties you encountered while trying to interpret the XAI system’s explanations of sentiment?
6. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Vinuesa, R.; Azizpour, H.; Leite, I.; Balaam, M.; Dignum, V.; Domisch, S.; Felländer, A.; Langhans, S.D.; Tegmark, M.; Fuso Nerini, F. The Role of Artificial Intelligence in Achieving the Sustainable Development Goals. Nature Communications 2020, 11, 1–10.
- Sharma, H.D.; Goyal, P. An Analysis of Sentiment: Methods, Applications, and Challenges. Engineering Proceedings 2023, 59. [CrossRef]
- Kokalj, E.; Škrlj, B.; Lavrač, N.; Pollak, S.; Robnik-Šikonja, M. BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers. Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation; Association for Computational Linguistics, 2021; pp. 16–21.
- Fantozzi, P.; Naldi, M. The Explainability of Transformers: Current Status and Directions. Computers 2024, 13. [CrossRef]
- Alejandro Barredo Arrieta and Natalia Díaz-Rodríguez and Javier Del Ser and Adrien Bennetot and Siham Tabik and Alberto Barbado and Salvador Garcia and Sergio Gil-Lopez and Daniel Molina and Richard Benjamins and Raja Chatila and Francisco Herrera. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI. Information Fusion 2020, 58, 82–115. [CrossRef]
- Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Information Fusion 2023, 99, 101805. [CrossRef]
- Omnia Amin and Blair Brown and Bruce Stephen and Stephen McArthur. A case-study led investigation of explainable AI (XAI) to support deployment of prognostics in industry. Proceedings of the European Conference Of The PHM Society 2022; Do, P.; Michau, G.; Ezhilarasu, C., Eds., 2022, pp. 9–20.
- United Nations. Sustainable Development Goals: 17 Goals to Transform our World. https://www.un.org/sustainabledevelopment/sustainabledevelopment-goals, 2022. Accessed: 2023-08.
- Loh, H.W.; Ooi, C.P.; Seoni, S.; Barua, P.D.; Molinari, F.; Acharya, U.R. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011–2022). Computer Methods and Programs in Biomedicine 2022, 226, 107161. [CrossRef]
- Kumar, P.; Hota, L.; Tikkiwal, V.A.; Kumar, A. Analysing Forecasting of Stock Prices: An Explainable AI Approach. Procedia Computer Science 2024, 235, 2009–2016. International Conference on Machine Learning and Data Engineering (ICMLDE 2023). [CrossRef]
- Schoonderwoerd, Tjeerd A.J. and Wiard Jorritsma and Neerincx, Mark A. and Van Den Bosch, Karel. Human-centered XAI: Developing Design Patterns for Explanations of Clinical Decision Support Systems. International Journal of Human-Computer Studies 2021, 154, 1–25.
- Minchae Song. A Study on Explainable Artificial Intelligence-based Sentimental Analysis System Model. International Journal of Internet, Broadcasting and Communication, 2022, pp. 142–151.
- Wankhade, Mayur and Rao, Annavarapu and Kulkarni, Chaitanya. A Survey on Sentiment Analysis Methods, Applications, and Challenges. Artificial Intelligence Review 2022, pp. 1–50.
- Mabokela, Koena Ronny and Celik, Turgay and Raborife, Mpho. Multilingual Sentiment Analysis for Under-Resourced Languages: A Systematic Review of the Landscape. IEEE Access 2023, 11, 15996–16020. [CrossRef]
- Le, Tuan Anh and Moeljadi, David and Miura, Yasuhide and Ohkuma, Tomoko. Sentiment Analysis for Low Resource Languages: A Study on Informal Indonesian Tweets. Proceedings of the 12th Workshop on Asian Language Resources (ALR12). The COLING 2016 Organizing Committee, 2016, pp. 123–131.
- Adebara, Ife and Elmadany, AbdelRahim and Abdul-Mageed, Muhammad and Alcoba Inciarte, Alcides. SERENGETI: Massively Multilingual Language Models for Africa. Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics: Toronto, Canada, 2023; pp. 1498–1537.
- Alabi, Jesujoba O. and Adelani, David Ifeoluwa and Mosbach, Marius and Klakow, Dietrich. Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning. Proceedings of the 29th International Conference on Computational Linguistics; International Committee on Computational Linguistics: Gyeongju, Republic of Korea, 2022; pp. 4336–4349.
- Dossou, B.F.P.; Tonja, A.L.; Yousuf, O.; Osei, S.; Oppong, A.; Shode, I.; Awoyomi, O.O.; Emezue, C. AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages. Proceedings of The Third Workshop on Simple and Efficient Natural Language Processing (SustaiNLP); Association for Computational Linguistics: Abu Dhabi, United Arab Emirates (Hybrid), 2022; pp. 52–64.
- Mabokela, K.R.; Raborife, M.; Celik, T. Investigating Sentiment-Bearing Words- and Emoji-based Distant Supervision Approaches for Sentiment Analysis. Proceedings of the Fourth Workshop on Resources for African Indigenous Languages (RAIL 2023); Association for Computational Linguistics: Dubrovnik, Croatia, 2023; pp. 115–125.
- Aguero-Torales, M.M.; Abreu Salas, J.I.; Lopez-Herrera, A.G. Deep learning and multilingual sentiment analysis on social media data: An overview. Applied Soft Computing 2021, 107, 107373.
- Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers); Association for Computational Linguistics: Minneapolis, Minnesota, 2019; pp. 4171–4186.
- Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov. RoBERTa: A Robustly Optimized BERT Pretraining Approach. ArXiv 2019, abs/1907.11692.
- Conneau, Alexis and Khandelwal, Kartikay and Goyal, Naman and Chaudhary, Vishrav and Wenzek, Guillaume and Guzmán, Francisco and Grave, Edouard and Ott, Myle and Zettlemoyer, Luke and Stoyanov, Veselin. Unsupervised Cross-lingual Representation Learning at Scale. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2020, pp. 8440–8451.
- Ogueji, Kelechi and Zhu, Yuxin and Lin, Jimmy. Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages. Proceedings of the 1st Workshop on Multilingual Representation Learning; Association for Computational Linguistics: Punta Cana, Dominican Republic, 2021; pp. 116–126.
- Bacco, L.; Cimino, A.; Dell’Orletta, F.; Merone, M. Explainable Sentiment Analysis: A Hierarchical Transformer-Based Extractive Summarization Approach. Electronics 2021, 10. [CrossRef]
- Sage Kelly and Sherrie-Anne Kaye and Oscar Oviedo-Trespalacios. What factors contribute to the acceptance of artificial intelligence? A systematic review. Telematics and Informatics 2023, 77, 101925. [CrossRef]
- Saeed, W.; Omlin, C. Explainable AI (XAI): A systematic meta-survey of current challenges and future opportunities. Knowledge-Based Systems 2023, 263, 110273. [CrossRef]
- Qian, Kun and Danilevsky, Marina and Katsis, Yannis and Kawas, Ban and Oduor, Erick and Popa, Lucian and Li, Yunyao. XNLP: A Living Survey for XAI Research in Natural Language Processing. 26th International Conference on Intelligent User Interfaces - Companion; Association for Computing Machinery: New York, NY, USA, 2021; IUI ’21 Companion, p. 78–80.
- Liu, Shengzhong and Le, Franck and Chakraborty, Supriyo and Abdelzaher, Tarek. On Exploring Attention-based Explanation for Transformer Models in Text Classification. IEEE International Conference on Big Data, 2021, pp. 1193–1203.
- Park, S.; Lee, J. LIME: Weakly-Supervised Text Classification without Seeds. Proceedings of the 29th International Conference on Computational Linguistics; International Committee on Computational Linguistics: Gyeongju, Republic of Korea, 2022; pp. 1083–1088.
- Bodria, F.; Panisson, A.; Perotti, A.; Piaggesi, S. Explainability Methods for Natural Language Processing: Applications to Sentiment Analysis. Sistemi Evoluti per Basi di Dati, 2020.
- Marco Tulio Ribeiro and Sameer Singh and Carlos Guestrin. Why Should I Trust You?: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016.
- Scott M. Lundberg and Su-In Lee. A Unified Approach to Interpreting Model Predictions. ArXiv 2017, abs/1705.07874.
- Diwali, Arwa and Saeedi, Kawther and Dashtipour, Kia and Gogate, Mandar and Cambria, Erik and Hussain, Amir. Sentiment Analysis Meets Explainable Artificial Intelligence: A Survey on Explainable Sentiment Analysis. IEEE Transactions on Affective Computing 2023, pp. 1–12.
- Mabokela, Ronny and Schlippe, Tim. A Sentiment Corpus for South African Under-Resourced Languages in a Multilingual Context. Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages; European Language Resources Association: Marseille, France, 2022; pp. 70–77.
- Muhammad, S.; Abdulmumin, I.; Ayele, A.; Ousidhoum, N.; Adelani, D.; Yimam, S.; Ahmad, I.; Beloucif, M.; Mohammad, S.; Ruder, S.; Hourrane, O.; Jorge, A.; Brazdil, P.; Ali, F.; David, D.; Osei, S.; Shehu-Bello, B.; Lawan, F.; Gwadabe, T.; Rutunda, S.; Belay, T.; Messelle, W.; Balcha, H.; Chala, S.; Gebremichael, H.; Opoku, B.; Arthur, S. AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing; Bouamor, H.; Pino, J.; Bali, K., Eds.; Association for Computational Linguistics: Singapore, 2023; pp. 13968–13981. [CrossRef]
- Clark, K.; Luong, M.T.; Le, Q.V.; Manning, C.D. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ICLR, 2020.
- Arras, L.; Montavon, G.; Müller, K.R.; Samek, W. Explaining Recurrent Neural Network Predictions in Sentiment Analysis. Proceedings of the EMNLP 2017 Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics, 2017, pp. 159–168.
- Vig, J. A Multiscale Visualization of Attention in the Transformer Model. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations; Association for Computational Linguistics: Florence, Italy, 2019; pp. 37–42.
- Saleem, R.; Yuan, B.; Kurugollu, F.; Anjum, A.; Liu, L. Explaining deep neural networks: A survey on the global interpretation methods. Neurocomputing 2022, 513, 165–180. [CrossRef]
- Mabokela, Koena Ronny and Schlippe, Tim. AI for Social Good: Sentiment Analysis to Detect Social Challenges in South Africa. Artificial Intelligence Research. Springer Nature Switzerland, 2022, pp. 309–322.
Notes
1. X is the new name of the social media company and platform formerly known as Twitter.
4. The survey questions are found here: https://forms.gle/oHHUw5ECwqjK1E66A
Sentiment class distribution per language:

| Language (ISO 639) | Positive | % | Negative | % | Neutral | % | Total |
|---|---|---|---|---|---|---|---|
| Sepedi (nso) | 5,153 | 48% | 3,270 | 30% | 2,355 | 22% | 10,778 |
| Setswana (tsn) | 3,932 | 51% | 2,150 | 28% | 1,590 | 21% | 7,672 |
| Sesotho (sot) | 3,050 | 48% | 2,024 | 32% | 1,241 | 20% | 6,314 |
| isiXhosa (xho) | 6,657 | 25.79% | 12,125 | 48.10% | 6,421 | 25.47% | 25,203 |
| isiZulu (zul) | 19,252 | 42.49% | 22,400 | 49.44% | 3,378 | 7.45% | 45,303 |
Performance (%) of each model per language:

| Language | mBERT | XLM-R | AfroLM | Afro-XLMR | SERENGETI |
|---|---|---|---|---|---|
| nso | 57.69% | 48.20% | 76.69% | 54.54% | 51.75% |
| tsn | 61.54% | 54.44% | 64.23% | 55.23% | 58.01% |
| sot | 65.21% | 55.31% | 58.32% | 65.49% | 61.20% |
| xho | 85.78% | 72.31% | 75.32% | 89.17% | 80.01% |
| zul | 86.46% | 74.60% | 82.77% | 90.74% | 79.39% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).