Preprint
Article

This version is not peer-reviewed.

SiAraSent: From Features to Deep Transformers for Large-Scale Arabic Sentiment Analysis

Submitted:

04 December 2025

Posted:

05 December 2025

You are already at the latest version

Abstract
Sentiment analysis of Arabic text, particularly on social media platforms, presents a for-midable set of unique challenges that stem from the language’s complex morphology, its numerous dialectal variations, and the frequent and nuanced use of emojis to convey emotional context. This paper presents SiAraSent, a hybrid framework that integrates tra-ditional text representations, emoji-aware features, and deep contextual embeddings based on Arabic transformers. Starting from a strong and fully interpretable baseline built on TF–IDF-weighted character and word N-grams combined with emoji embeddings, we progressively incorporate SinaTools for linguistically informed preprocessing and Ara-BERT for contextualized encodings. The framework is evaluated on a large-scale dataset of 58,751 Arabic tweets labeled for sentiment polarity. Our design works within four experi-mental configurations: (1) a baseline traditional machine learning architecture that em-ploys TF-IDF, N-grams, and emoji features with an SVM classifier; (2) an LLM feature ex-traction approach that leverages deep contextual embeddings from the pre-trained Ara-BERT model; (3) a novel hybrid fusion model that concatenates traditional morphological features, AraBERT embeddings, and emoji-based features into a high-dimensional vector; and (4) a fully fine-tuned AraBERT model specifically adapted for the sentiment classifi-cation task. Our experiments demonstrate the remarkable efficacy of our proposed framework, with the fine-tuned AraBERT architecture achieving an accuracy of 93.45%, a significant 10.89% improvement over the best traditional baseline.
Keywords: 
;  ;  ;  ;  ;  ;  ;  ;  ;  ;  
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2025 MDPI (Basel, Switzerland) unless otherwise stated