Preprint. Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences

Version 1 : Received: 7 June 2023 / Approved: 7 June 2023 / Online: 7 June 2023 (09:41:14 CEST)

A peer-reviewed article of this Preprint also exists.

Fan, Y.; Li, B.; Sataer, Y.; Gao, M.; Shi, C.; Cao, S.; Gao, Z. Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences. Appl. Sci. 2023, 13, 9412.

Abstract

Most natural language processing (NLP) tasks, such as semantic parsing, syntactic parsing, machine translation, and text summarization, suffer performance degradation when encountering long, complex sentences. Previous works address the issue by decomposing complex sentences into simple ones and linking them, as in RST-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), and simple-sentence-decomposition (SSD). However, these approaches are not applicable to semantic parsing tasks such as abstract meaning representation (AMR) parsing and semantic dependency parsing, because they are misaligned with semantic relations and unable to preserve the original semantics. Following the same intuition while avoiding the deficiencies of previous works, we propose a novel framework, hierarchical clause annotation (HCA), based on linguistic research on clause hierarchy. With the HCA framework, we annotate a large HCA corpus to explore the potential of integrating HCA structural features into semantic parsing of complex sentences. Moreover, we decompose HCA into two subtasks, clause segmentation and clause parsing, and provide neural baseline models for producing more silver annotations.

Keywords

Clause Hierarchy; Hierarchical Clause Annotation; Complex Sentences; Semantic Parsing; Syntactic Parsing; RST Corpus

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
