Submitted: 30 November 2023
Posted: 30 November 2023
Abstract
Keywords:
1. Introduction
- We introduce the Syntactic Enhancement Network (SEN), an approach that integrates dependency-syntactic structure into pre-trained language models.
- We evaluate SEN on two sentence-level tasks under several experimental settings.
- We construct the English Sentence Gap-Filling (ESG) dataset, built from sentence-completion questions in English examinations, as a benchmark for this task.
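The SEN(gate) and SEN(concat) variants that appear in the result tables suggest two standard ways of fusing a token's contextual vector with a syntax-derived vector. Below is a minimal, framework-free sketch of both; the elementwise gate parameterization and the names `w_ctx`, `w_syn`, `b` are illustrative assumptions, not the paper's exact architecture.

```python
import math

def gate_fuse(h_ctx, h_syn, w_ctx, w_syn, b):
    """Gated fusion (sketch of a SEN(gate)-style combiner, hypothetical
    parameterization): g = sigmoid(w_ctx*h_ctx + w_syn*h_syn + b),
    output = g * h_ctx + (1 - g) * h_syn, computed elementwise."""
    out = []
    for c, s, wc, ws, bi in zip(h_ctx, h_syn, w_ctx, w_syn, b):
        g = 1.0 / (1.0 + math.exp(-(wc * c + ws * s + bi)))
        out.append(g * c + (1.0 - g) * s)
    return out

def concat_fuse(h_ctx, h_syn):
    """Concatenation fusion (SEN(concat)-style): the downstream layer
    sees both representations side by side."""
    return list(h_ctx) + list(h_syn)
```

With a large positive bias the gate saturates toward the contextual vector; with a large negative bias it passes the syntactic vector through, which is the usual motivation for a learned gate over plain concatenation.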
2. Related Work
3. The Proposed Framework
3.1. Encoding Layer
3.2. Dependency Syntax Integration Layer
3.3. Output Layer
4. Experiments
4.1. Settings
4.2. Configurations
4.3. Results and Analysis
5. Conclusion and Future Exploration
References
- Sun, C.; Huang, L.; Qiu, X. Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence. arXiv preprint arXiv:1903.09588, 2019.
- Fei, H.; Wu, S.; Ren, Y.; Zhang, M. Matching Structure for Dual Learning. Proceedings of the International Conference on Machine Learning, ICML, 2022, pp. 6373–6391.
- Nogueira, R.; Cho, K. Passage Re-ranking with BERT. arXiv preprint arXiv:1901.04085, 2019.
- Fei, H.; Li, F.; Li, B.; Ji, D. Encoder-Decoder Based Unified Semantic Role Labeling with Label-Aware Syntax. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, pp. 12794–12802.
- Fei, H.; Wu, S.; Ren, Y.; Li, F.; Ji, D. Better Combine Them Together! Integrating Syntactic Constituency and Dependency Representations for Semantic Role Labeling. Findings of the Association for Computational Linguistics: ACL/IJCNLP, 2021, pp. 549–559.
- Tenney, I.; Xia, P.; Chen, B.; Wang, A.; Poliak, A.; McCoy, R.T.; Kim, N.; Van Durme, B.; Bowman, S.R.; Das, D.; et al. What Do You Learn from Context? Probing for Sentence Structure in Contextualized Word Representations. arXiv preprint arXiv:1905.06316, 2019.
- Fei, H.; Li, F.; Li, C.; Wu, S.; Li, J.; Ji, D. Inheriting the Wisdom of Predecessors: A Multiplex Cascade Framework for Unified Aspect-based Sentiment Analysis. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI, 2022, pp. 4096–4103.
- Zhang, Z.; Han, X.; Liu, Z.; Jiang, X.; Sun, M.; Liu, Q. ERNIE: Enhanced Language Representation with Informative Entities. arXiv preprint arXiv:1905.07129, 2019.
- Fei, H.; Ren, Y.; Zhang, Y.; Ji, D.; Liang, X. Enriching contextualized language model from knowledge graph for biomedical information extraction. Briefings in Bioinformatics 2021, 22.
- Zhang, X.; Lu, L.; Lapata, M. Top-down Tree Long Short-Term Memory Networks. Proceedings of NAACL-HLT, 2016, pp. 310–320.
- Miwa, M.; Bansal, M. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, Vol. 1, pp. 1105–1116.
- Zhang, Y.; Zheng, W.; Lin, H.; Wang, J.; Yang, Z.; Dumontier, M. Drug–drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths. Bioinformatics 2017, 34, 828–835.
- Zweig, G.; Burges, C.J. The Microsoft Research Sentence Completion Challenge. Microsoft Research, Redmond, WA, USA, Tech. Rep. MSR-TR-2011-129, 2011.
- Fei, H.; Ren, Y.; Ji, D. Boundaries and edges rethinking: An end-to-end neural model for overlapping entity relation extraction. Information Processing & Management 2020, 57, 102311.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
- Tran, K.; Bisazza, A.; Monz, C. Recurrent Memory Networks for Language Modeling. Proceedings of NAACL-HLT, 2016, pp. 321–331.
- Li, J.; Fei, H.; Liu, J.; Wu, S.; Zhang, M.; Teng, C.; Ji, D.; Li, F. Unified Named Entity Recognition as Word-Word Relation Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, pp. 10965–10973.
- Li, J.; Xu, K.; Li, F.; Fei, H.; Ren, Y.; Ji, D. MRN: A Locally and Globally Mention-Based Reasoning Network for Document-Level Relation Extraction. Findings of the Association for Computational Linguistics: ACL-IJCNLP, 2021, pp. 1359–1370.
- Wu, S.; Fei, H.; Cao, Y.; Bing, L.; Chua, T.S. Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 14734–14751.
- Wang, F.; Li, F.; Fei, H.; Li, J.; Wu, S.; Su, F.; Shi, W.; Ji, D.; Cai, B. Entity-centered Cross-document Relation Extraction. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022, pp. 9871–9881.
- Peters, M.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep Contextualized Word Representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems, 2017, pp. 5998–6008.
- Fei, H.; Wu, S.; Li, J.; Li, B.; Li, F.; Qin, L.; Zhang, M.; Zhang, M.; Chua, T.S. LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model. Proceedings of the Advances in Neural Information Processing Systems, NeurIPS, 2022, pp. 15460–15475.
- Wu, S.; Fei, H.; Ren, Y.; Ji, D.; Li, J. Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Rich Syntactic Knowledge. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021, pp. 3957–3963.
- Fei, H.; Liu, Q.; Zhang, M.; Zhang, M.; Chua, T.S. Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 5980–5994.
- Liu, N.F.; Gardner, M.; Belinkov, Y.; Peters, M.; Smith, N.A. Linguistic Knowledge and Transferability of Contextual Representations. arXiv preprint arXiv:1903.08855, 2019.
- Sun, Y.; Wang, S.; Li, Y.; Feng, S.; Chen, X.; Zhang, H.; Tian, X.; Zhu, D.; Tian, H.; Wu, H. ERNIE: Enhanced Representation through Knowledge Integration. arXiv preprint arXiv:1904.09223, 2019.
- Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C.H.; Kang, J. BioBERT: pre-trained biomedical language representation model for biomedical text mining. arXiv preprint arXiv:1901.08746, 2019.
- Beltagy, I.; Cohan, A.; Lo, K. SciBERT: Pretrained Contextualized Embeddings for Scientific Text. arXiv preprint arXiv:1903.10676, 2019.
- Zweig, G.; Platt, J.C.; Meek, C.; Burges, C.J.; Yessenalina, A.; Liu, Q. Computational approaches to sentence completion. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012, pp. 601–610.
- Wu, S.; Fei, H.; Li, F.; Zhang, M.; Liu, Y.; Teng, C.; Ji, D. Mastering the Explicit Opinion-Role Interaction: Syntax-Aided Neural Transition System for Unified Opinion Role Labeling. Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022, pp. 11513–11521.
- Shi, W.; Li, F.; Li, J.; Fei, H.; Ji, D. Effective Token Graph Modeling using a Novel Labeling Strategy for Structured Sentiment Analysis. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 4232–4241.
- Fei, H.; Zhang, Y.; Ren, Y.; Ji, D. Latent Emotion Memory for Multi-Label Emotion Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, pp. 7692–7699.
- Park, H.; Cho, S.; Park, J. Word RNN as a Baseline for Sentence Completion. 2018 IEEE 5th International Congress on Information Science and Technology (CiSt). IEEE, 2018, pp. 183–187.
- Mirowski, P.; Vlachos, A. Dependency recurrent neural language models for sentence completion. arXiv preprint arXiv:1507.01193, 2015.
- Fei, H.; Li, B.; Liu, Q.; Bing, L.; Li, F.; Chua, T.S. Reasoning Implicit Sentiment with Chain-of-Thought Prompting. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023, pp. 1171–1182.
- Wu, S.; Fei, H.; Ji, W.; Chua, T.S. Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 2593–2608.
- Li, B.; Fei, H.; Li, F.; Wu, Y.; Zhang, J.; Wu, S.; Li, J.; Liu, Y.; Liao, L.; Chua, T.S.; Ji, D. DiaASQ: A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis. Findings of the Association for Computational Linguistics: ACL, 2023, pp. 13449–13467.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Computation 1997, 9, 1735–1780.
- Quan, C.; Hua, L.; Sun, X.; Bai, W. Multichannel convolutional neural network for biological relation extraction. BioMed Research International 2016, 2016.
- Lim, S.; Lee, K.; Kang, J. Drug drug interaction extraction from the literature using a recursive neural network. PLoS ONE 2018, 13, e0190926.
| 1. To ___ his thirst, John drinks two liters of water daily. | 2. Not only is the book ___ but also ___ enlightening. | 3. The author implies that the character is ___ |
| A. quench | A. engaging, insightful | A. often misunderstood |
| B. avoid | B. long, tedious | B. not as simple as he seems |
| C. satisfy | C. useful, practical | C. more complex than he appears |
| D. alleviate | D. interesting, controversial | D. misunderstood by most |
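Gap-filling questions like the ones above reduce to scoring each candidate completion with a language model and keeping the best-scoring sentence. A minimal sketch; `score_fn` stands in for whatever sentence scorer the underlying model provides (e.g. a biLM log-probability), and that interface is an assumption here, not the paper's API.

```python
def best_option(stem, options, score_fn):
    """Fill the first '___' blank in `stem` with each option and return the
    option whose completed sentence scores highest under `score_fn`
    (a callable mapping a sentence string to a float)."""
    filled = [(opt, stem.replace("___", opt, 1)) for opt in options]
    return max(filled, key=lambda pair: score_fn(pair[1]))[0]
```

For example, with a scorer that prefers idiomatic collocations, `best_option("To ___ his thirst, ...", ["quench", "avoid", "satisfy", "alleviate"], score_fn)` would pick "quench".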
| Type | Train: DrugBank | Train: MedLine | Train: Overall | Test: DrugBank | Test: MedLine | Test: Overall |
| Positive | 3767 | 231 | 3998 | 884 | 92 | 976 |
| Negative | 14445 | 1179 | 15624 | 2819 | 243 | 3062 |
| Advice | 815 | 7 | 822 | 214 | 7 | 221 |
| Effect | 1517 | 152 | 1669 | 298 | 62 | 360 |
| Mechanism | 1257 | 62 | 1319 | 278 | 21 | 299 |
| Int | 178 | 10 | 188 | 94 | 2 | 96 |
| Model | Acc (%) |
| baseline | 53.4 |
| biLM | 73.0 |
| biLM+SEN(concat) | 73.9 |
| biLM+SEN(gate) | 75.9 |
| 90.3 | |
| +SEN(gate) | 90.7 |
| +SEN(concat) | 90.9 |
| +SEN(concat) | 91.2 |
| Model | Time (seconds per train epoch) |
| Tree-LSTM | 10431 |
| SEN | 2367 |
| SEN +Tree-LSTM | 2587 |
| Method | P | R | F1 |
| Multichannel CNN (Quan et al. [41]) | 75.9 | 62.2 | 70.2 |
| Hierarchical RNNs (Zhang et al. [12]) | 74.1 | 71.8 | 72.9 |
| One-Stage Model Ensemble (Lim et al. [42]) | 77.8 | 69.6 | 73.5 |
| | 72.6 | 66.3 | 69.3 |
| +SEN(concat) | 79.6 | 65.3 | 71.8 |
| | 75.5 | 73.2 | 74.4 |
| +SEN(concat) | 77.7 | 72.4 | 75.1 |
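For reference, F1 is the harmonic mean of precision and recall; a one-line helper for sanity-checking P/R/F1 tables like the one above:

```python
def f1(p, r):
    """F1 score: the harmonic mean of precision p and recall r
    (both in percent or both in [0, 1]); 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) else 0.0
```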
| Triple form | ESG dataset (Acc) | DDI dataset (F1) |
| | 91.2 | 75.1 |
| | 90.5 | 74.6 |
| | 90.2 | 74.1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).