Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Leveraging Sentiment Lexicon in Sentiment Detection
Received: 2 April 2024 / Approved: 3 April 2024 / Online: 3 April 2024 (11:06:37 CEST)
How to cite: Johnson, A.; Davis, E.; Nasir, W.; Brown, M. Leveraging Sentiment Lexicon in Sentiment Detection. Preprints 2024, 2024040254. https://doi.org/10.20944/preprints202404.0254.v1
Abstract
In the rapidly evolving field of sentiment analysis, Transformer-based architectures, particularly the BERT model, have markedly improved accuracy and set new standards for analyzing textual sentiment. These advances improve a model's ability to capture the nuances and complexities of human language, and thereby provide deeper insight into the sentiment expressed in text. However, the performance of such deep learning models comes at the cost of increased computational demands and a lack of transparency in their decision-making. These challenges have renewed interest in rule-based sentiment analysis methods, which use sentiment lexicons for a more straightforward and computationally economical approach to determining text sentiment. Although overshadowed in recent years by machine learning models, lexicon-based methods have distinct advantages, including ease of interpretation and lower resource requirements, which make them particularly appealing for certain applications. This paper re-evaluates the relevance and effectiveness of two prominent lexicon-based sentiment analysis methods, SO-CAL and SentiStrength, which have been specifically adapted for the language, in light of the advances represented by Transformer-based models like SentBERT. Through a comparative analysis, we examine the performance of these methods against SentBERT across a collection of 16 text corpora spanning a variety of genres and contexts. Our findings reveal a nuanced picture: SentBERT's capabilities typically give it a significant advantage in accurately capturing sentiment. Surprisingly, however, the SO-CAL method performs exceptionally well on a substantial portion of the datasets, underscoring the continuing value and potential of lexicon-based approaches in sentiment analysis. This study highlights the strengths and weaknesses of both deep learning and lexicon-based methods and opens the door to future hybrid approaches that could combine the two to achieve greater accuracy and efficiency in sentiment analysis tasks.
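To make the lexicon-based idea concrete, the following is a minimal sketch of rule-based scoring with a sentiment lexicon. It is not the SO-CAL or SentiStrength implementation: the word lists, weights, and the names `LEXICON`, `INTENSIFIERS`, `NEGATORS`, `score`, and `classify` are all illustrative assumptions, and the negation and intensifier rules are deliberately simplified.

```python
# Toy resources: every word and weight here is a made-up assumption,
# not taken from SO-CAL, SentiStrength, or any published lexicon.
LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "terrible": -3.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATORS = {"not", "never"}


def score(text: str) -> float:
    """Sum lexicon weights, applying pending intensifiers and negation."""
    total = 0.0
    multiplier = 1.0   # product of intensifiers seen since the last sentiment word
    negate = False     # True if a negator was seen since the last sentiment word
    for raw in text.lower().split():
        token = raw.strip(".,!?;:")
        if token in NEGATORS:
            negate = True
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * multiplier
            total += -value if negate else value
            # Modifiers are consumed by the sentiment word they precede.
            negate, multiplier = False, 1.0
    return total


def classify(text: str) -> str:
    """Map the numeric score to a coarse polarity label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["The film was very good.", "The plot was not good.", "A table."]:
        print(sentence, score(sentence), classify(sentence))
```

Every step of such a scorer can be traced back to a specific lexicon entry or rule, which is the interpretability advantage the abstract refers to, and it needs no model inference at prediction time.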
Keywords
Sentiment Detection; Lexicon; Enhanced SentiStrength
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.