Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Transformer Text Classification Model for Arabic Dialects that Utilizes Inductive Transfer

Version 1 : Received: 10 November 2023 / Approved: 13 November 2023 / Online: 13 November 2023 (12:11:04 CET)

A peer-reviewed article of this Preprint also exists.

Baniata, L.H.; Kang, S. Transformer Text Classification Model for Arabic Dialects That Utilizes Inductive Transfer. Mathematics 2023, 11, 4960. Baniata, L.H.; Kang, S. Transformer Text Classification Model for Arabic Dialects That Utilizes Inductive Transfer. Mathematics 2023, 11, 4960.

Abstract

In the realm of the five-category classification endeavor, there has been limited exploration into applied techniques for classifying Arabic text. These methods have primarily leaned on singletask learning, incorporating manually crafted features that lack robust sentence representations. Recently, the Transformer paradigm has emerged as a highly promising alternative. However, when these models are trained using single-task learning, they often face challenges in achieving outstanding performance and generating robust latent feature representations, especially when dealing with small datasets. This issue is particularly pronounced in the context of the Arabic dialect, which has a scarcity of available resources. Given these constraints, this study introduces an innovative approach to dissecting sentiment in Arabic text. This approach combines Inductive Transfer (INT) with the Transformer paradigm to augment the adaptability of the model and refine the representation of sentences. By employing self-attention SE-A and feed-forward sub-layers as a shared Transformer encoder for both the five-category and three-category Arabic text classification tasks, this proposed model adeptly discerns sentiment in Arabic dialect sentences. The empirical findings underscore the commendable performance of the proposed model, as demonstrated in assessments of the Hotel Arabic-Reviews Dataset, the Book Reviews Arabic Dataset, and the LARB dataset.

Keywords

transformer; inductive transfer; text classification; Arabic dialects; positional encoding; 5-polarity

Subject

Computer Science and Mathematics, Computer Science

Comments (0)

We encourage comments and feedback from a broad range of readers. See criteria for comments and our Diversity statement.

Leave a public comment
Send a private comment to the author(s)
* All users must log in before leaving a comment
Views 0
Downloads 0
Comments 0


×
Alerts
Notify me about updates to this article or when a peer-reviewed version is published.
We use cookies on our website to ensure you get the best experience.
Read more about our cookies here.