Submitted: 07 April 2024
Posted: 08 April 2024
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
3.1. Pre-Trained Embeddings
3.1.1. BERT Base Uncased
3.1.2. RoBERTa Base
3.1.3. EmoConLearn RoBERTa Base Supervised
3.1.4. EmoConLearn RoBERTa Base Unsupervised
3.2. Classification Head
3.3. Loss Functions
3.3.1. Cross-Entropy Loss
3.3.2. Focal Loss
4. Experiments
4.1. Configurations
4.2. Datasets
4.2.1. DynaSent Dataset Examination
4.2.2. Exploration of Yelp and Amazon Review Datasets
4.2.3. SST-3 Dataset Overview
4.3. Results and Analysis
| Pre-Trained Embedding | Classification Head | Activation Function | Loss Function | SST-3 Macro F1 | SST-3 Accuracy | Yelp Macro F1 | Yelp Accuracy | Amazon Macro F1 | Amazon Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| bert-base-uncased | Linear | tanh | CrossEntropy | 0.5835 | 0.6416 | 0.5506 | 0.6788 | 0.5116 | 0.5224 |
| | Linear | tanh | Focal | 0.5705 | 0.6330 | 0.5448 | 0.6777 | 0.5125 | 0.5261 |
| | BiGRU | relu | CrossEntropy | 0.5944 | 0.6548 | 0.5482 | 0.6795 | 0.5119 | 0.5230 |
| | BiGRU | relu | Focal | 0.5947 | 0.6615 | 0.5418 | 0.6770 | 0.5132 | 0.5258 |
| | BiLSTM | relu | CrossEntropy | 0.5849 | 0.6443 | 0.5498 | 0.6788 | 0.5117 | 0.5228 |
| | BiLSTM | relu | Focal | 0.5838 | 0.648 | 0.5444 | 0.6777 | 0.5136 | 0.5259 |
| roberta-base | Linear | tanh | CrossEntropy | 0.6198 | 0.6878 | 0.5646 | 0.6960 | 0.5210 | 0.5354 |
| | Linear | tanh | Focal | 0.6144 | 0.6828 | 0.5487 | 0.6762 | 0.5075 | 0.5178 |
| | BiGRU | relu | CrossEntropy | 0.6228 | 0.6919 | 0.5649 | 0.6959 | 0.5205 | 0.5361 |
| | BiGRU | relu | Focal | 0.6209 | 0.6946 | 0.5671 | 0.7012 | 0.5216 | 0.5368 |
| | BiLSTM | relu | CrossEntropy | 0.6178 | 0.6860 | 0.5687 | 0.7008 | 0.5220 | 0.5381 |
| | BiLSTM | relu | Focal | 0.6179 | 0.6923 | 0.5662 | 0.6995 | 0.5217 | 0.5340 |
| sup-simcse-roberta-base | Linear | tanh | CrossEntropy | 0.6104 | 0.6756 | 0.5653 | 0.7007 | 0.5248 | 0.5410 |
| | Linear | tanh | Focal | 0.6172 | 0.6805 | 0.5656 | 0.6989 | 0.5234 | 0.5393 |
| | BiGRU | relu | CrossEntropy | 0.6216 | 0.6846 | 0.5670 | 0.6990 | 0.5235 | 0.5403 |
| | BiGRU | relu | Focal | 0.6250 | 0.6932 | 0.5649 | 0.699 | 0.5231 | 0.5398 |
| | BiLSTM | relu | CrossEntropy | 0.6216 | 0.6810 | 0.5670 | 0.7010 | 0.5253 | 0.5433 |
| | BiLSTM | relu | Focal | 0.6257 | 0.6946 | 0.5661 | 0.7000 | 0.5250 | 0.5429 |
| unsup-simcse-roberta-base | Linear | tanh | CrossEntropy | 0.6132 | 0.6814 | 0.5621 | 0.6965 | 0.5217 | 0.5364 |
| | Linear | tanh | Focal | 0.6112 | 0.6787 | 0.5650 | 0.6984 | 0.5215 | 0.5362 |
| | BiGRU | relu | CrossEntropy | 0.6112 | 0.6932 | 0.5653 | 0.6986 | 0.5223 | 0.5384 |
| | BiGRU | relu | Focal | 0.6330 | 0.7000 | 0.5656 | 0.6981 | 0.5211 | 0.5355 |
| | BiLSTM | relu | CrossEntropy | 0.6143 | 0.6846 | 0.5633 | 0.6969 | 0.5221 | 0.5386 |
| | BiLSTM | relu | Focal | 0.6330 | 0.7000 | 0.5656 | 0.6981 | 0.5211 | 0.5355 |

| Pre-Trained Embedding | Classification Head | Activation Function | Loss Function | DynaSent r1 Macro F1 | DynaSent r1 Accuracy | DynaSent r2 Macro F1 | DynaSent r2 Accuracy |
|---|---|---|---|---|---|---|---|
| bert-base-uncased | Linear | tanh | CrossEntropy | 0.7902 | 0.7911 | 0.6619 | 0.6611 |
| | Linear | tanh | Focal | 0.7881 | 0.7892 | 0.6638 | 0.6653 |
| | BiGRU | relu | CrossEntropy | 0.7879 | 0.7892 | 0.6585 | 0.6583 |
| | BiGRU | relu | Focal | 0.7858 | 0.7867 | 0.6468 | 0.6486 |
| | BiLSTM | relu | CrossEntropy | 0.7851 | 0.7864 | 0.6612 | 0.6611 |
| | BiLSTM | relu | Focal | 0.7886 | 0.7897 | 0.6480 | 0.6486 |
| roberta-base | Linear | tanh | CrossEntropy | 0.8103 | 0.8111 | 0.7026 | 0.7028 |
| | Linear | tanh | Focal | 0.7988 | 0.8006 | 0.6559 | 0.6583 |
| | BiGRU | relu | CrossEntropy | 0.8078 | 0.8089 | 0.6867 | 0.6875 |
| | BiGRU | relu | Focal | 0.8063 | 0.8072 | 0.6804 | 0.6819 |
| | BiLSTM | relu | CrossEntropy | 0.8096 | 0.8103 | 0.6937 | 0.6944 |
| | BiLSTM | relu | Focal | 0.8136 | 0.8142 | 0.6828 | 0.6847 |
| sup-simcse-roberta-base | Linear | tanh | CrossEntropy | 0.8056 | 0.8064 | 0.6949 | 0.6958 |
| | BiGRU | relu | CrossEntropy | 0.8070 | 0.8075 | 0.6891 | 0.6903 |
| | BiGRU | relu | Focal | 0.8083 | 0.8089 | 0.6904 | 0.6917 |
| | BiLSTM | relu | CrossEntropy | 0.8133 | 0.8139 | 0.696 | 0.6972 |
| | BiLSTM | relu | Focal | 0.8135 | 0.8142 | 0.686 | 0.6875 |
| unsup-simcse-roberta-base | Linear | tanh | CrossEntropy | 0.8018 | 0.8028 | 0.7046 | 0.7056 |
| | Linear | tanh | Focal | 0.8143 | 0.8130 | 0.6930 | 0.6931 |
| | BiGRU | relu | CrossEntropy | 0.8076 | 0.8085 | 0.6745 | 0.6750 |
| | BiGRU | relu | Focal | 0.8050 | 0.8058 | 0.6848 | 0.6847 |
| | BiLSTM | relu | CrossEntropy | 0.8058 | 0.8067 | 0.6899 | 0.6903 |
| | BiLSTM | relu | Focal | 0.8085 | 0.8094 | 0.6895 | 0.6903 |
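
The results tables report macro F1 alongside accuracy. As a minimal sketch of how the two metrics can diverge when one class is rare, the toy example below computes both in plain Python. The labels and predictions are invented for illustration and are not drawn from any of the datasets; the helper names `accuracy` and `macro_f1` are hypothetical, and returning an F1 of 0 for a class with no true or predicted examples is an assumed convention.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Assumed convention: F1 is 0 when the class never appears at all.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy, made-up data: the neutral class is rare and always misclassified.
LABELS = ["negative", "neutral", "positive"]
y_true = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 2
y_pred = ["positive"] * 4 + ["negative"] * 4 + ["positive", "negative"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, LABELS)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

On this toy input accuracy is 0.8 while macro F1 is 16/27 (about 0.593), because the missed neutral class contributes an F1 of zero to the unweighted mean.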
4.3.1. Discrepancies in Cross-Domain Performance
4.3.2. Addressing Neutral Sentiment Classification
5. Conclusion and Future Work
| Dataset | Classes | Imbalance Issue |
|---|---|---|
| SST-3 | Neutral | Underrepresented |
| Amazon | Neutral | Ambivalence |
| Yelp | Neutral | Mixed Sentiment |

| Type | Dense Layer In | Dense Layer Out | Linear Layer In | Linear Layer Out |
|---|---|---|---|---|
| Linear | 768 | 768 | 768 | 3 |
| BiGRU | 768 | 256 | 512 | 3 |
| BiLSTM | 768 | 256 | 512 | 3 |
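
One way to read the dimensions above: the dense layer output feeds the final linear layer, and for the recurrent heads the linear input is twice the hidden size. The sketch below encodes that reading as simple shape bookkeeping. It assumes the forward and backward hidden states are concatenated (the usual convention for bidirectional RNNs, which the table does not state explicitly), and `HeadSpec` is a hypothetical helper for illustration, not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadSpec:
    """Shape bookkeeping for one classification head (illustrative only)."""
    name: str
    dense_in: int        # size of the encoder embedding fed to the head
    dense_out: int       # dense output size, or per-direction hidden size
    bidirectional: bool  # True for BiGRU / BiLSTM
    num_classes: int = 3

    @property
    def linear_in(self) -> int:
        # Assumption: a bidirectional RNN concatenates both directions,
        # so the final linear layer sees 2 * hidden_size features.
        return self.dense_out * (2 if self.bidirectional else 1)


HEADS = [
    HeadSpec("Linear", 768, 768, bidirectional=False),
    HeadSpec("BiGRU", 768, 256, bidirectional=True),
    HeadSpec("BiLSTM", 768, 256, bidirectional=True),
]

for head in HEADS:
    print(f"{head.name}: {head.dense_in} -> {head.dense_out} "
          f"-> {head.linear_in} -> {head.num_classes}")
```

Under this assumption the computed `linear_in` values reproduce the "Linear Layer In" column: 768 for the linear head and 512 for both recurrent heads.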
| Hyper-Params | Values |
|---|---|
| Dropout | 0.1 |
| Activation | Tanh, ReLU |
| Focal Loss-gamma | 3 |
| Focal Loss-reduction | mean |
| Max-Length | 64, 256 |
| Batch-Size | 32 |
| Optimizer | AdamW |
| Learning Rate | 1e-5 |
| Weight Decay | 0.01 |
| Epochs | 4 |
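
To make the focal loss settings above concrete (gamma = 3, mean reduction), here is a minimal pure-Python sketch of the standard focal loss, FL = -(1 - p_t)^gamma * log(p_t), averaged over a batch. It assumes no class-weighting term alpha, since none is listed in the table; the function names and the example logits are illustrative and are not taken from the experiments.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(batch_logits, targets, gamma=3.0):
    """Mean over the batch of -(1 - p_t)**gamma * log(p_t).

    p_t is the softmax probability assigned to the gold class.
    Assumption: no alpha class weights are applied.
    """
    losses = []
    for logits, target in zip(batch_logits, targets):
        p_t = softmax(logits)[target]
        losses.append(-((1.0 - p_t) ** gamma) * math.log(p_t))
    return sum(losses) / len(losses)


def cross_entropy(batch_logits, targets):
    """Cross-entropy is the gamma = 0 special case of focal loss."""
    return focal_loss(batch_logits, targets, gamma=0.0)


# Made-up batch: one easy example (confident, correct) and one hard example.
batch = [[4.0, 0.0, 0.0], [0.2, 0.0, 0.1]]
targets = [0, 1]

print(f"cross-entropy: {cross_entropy(batch, targets):.4f}")
print(f"focal (gamma=3): {focal_loss(batch, targets):.4f}")
```

With gamma = 0 the modulating factor is 1 and the loss reduces to cross-entropy. With gamma = 3, confidently correct examples contribute almost nothing, so the batch mean is dominated by the hard examples.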
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).