Preprint Article, Version 1. This version is not peer-reviewed.

Self-Supervised Contextual Keyword and Keyphrase Retrieval with Self-Labelling

Version 1 : Received: 2 August 2019 / Approved: 6 August 2019 / Online: 6 August 2019 (09:17:36 CEST)

How to cite: Sharma, P.; Li, Y. Self-Supervised Contextual Keyword and Keyphrase Retrieval with Self-Labelling. Preprints 2019, 2019080073 (doi: 10.20944/preprints201908.0073.v1).

Abstract

In this paper we propose a novel self-supervised approach to keyword and keyphrase retrieval and extraction: an end-to-end deep learning model trained on a contextually self-labelled corpus. The approach is novel in using contextual and semantic features to extract keywords, and in our experiments it outperformed the state of the art, yielding keywords of better semantic meaning and quality than existing popular keyword extraction algorithms. In addition, we propose using contextual features from bidirectional transformers to automatically label a corpus of short sentences with keywords and keyphrases, which builds the ground truth. This process avoids the human time needed to label keywords and does not require any prior knowledge. To the best of our knowledge, the dataset published with this paper is a fine domain-independent corpus of short sentences with labelled keywords and keyphrases in the NLP community.
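
The abstract does not spell out the labelling procedure, so the following is only a rough sketch of what embedding-based self-labelling could look like, not the authors' pipeline. It assumes that candidate phrases are scored by cosine similarity to the whole sentence. The names toy_embed, candidate_phrases, self_label and the STOPWORDS list are hypothetical, and toy_embed is a character-trigram stand-in for a real bidirectional transformer encoder such as BERT, used here only so the example runs without external models.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset({"a", "an", "the", "of", "with", "and", "to", "in"})


def toy_embed(text):
    """Stand-in for a contextual encoder: counts of character trigrams.

    A real system would return a transformer embedding here instead.
    """
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_phrases(sentence, max_len=3):
    """Yield word n-grams (n <= max_len) that do not start or end on a stopword."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0].lower() in STOPWORDS or gram[-1].lower() in STOPWORDS:
                continue
            yield " ".join(gram)


def self_label(sentence, top_k=3):
    """Return the top_k candidates most similar to the sentence as its labels."""
    sent_vec = toy_embed(sentence)
    scored = {p: cosine(toy_embed(p), sent_vec) for p in candidate_phrases(sentence)}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


sentence = "Bidirectional transformers label short sentences with keywords"
labels = self_label(sentence)
print(labels)
```

Because the trigram stand-in rewards surface overlap, longer phrases tend to rank higher in this toy version; swapping toy_embed for a contextual encoder is what would make the scores semantic rather than lexical.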

Subject Areas

contextual keyword extraction; BERT; word embedding; LSTM; transformers; Deep Learning
