Extraction of the Relations between Significant Pharmacological Entities in Russian-Language Internet Reviews on Medications

Alexander Sboev; Anton Selivanov; Ivan Moloshnikov; Roman Rybka; Artem Gryaznov; Sanna Sboeva; Gleb Rylkov

doi:10.20944/preprints202111.0344.v1

Submitted: 18 November 2021

Posted: 19 November 2021

Abstract

Analyzing online media to predict society’s reaction to events or processes is a task of great current relevance, especially where it concerns healthcare. Internet sources contain a large amount of pharmacologically meaningful information that is useful for pharmacovigilance and drug repurposing. Analyzing information at this scale requires automated methods, and developing them in turn requires a corpus with labeled relations among entities. Until now, no such Russian-language dataset has existed. This paper presents the first Russian-language dataset in which labeled entity pairs are divided into multiple contexts within a single text (by drug used, by user, by case of use, etc.), together with a method based on the XLM-RoBERTa language model, previously trained on medical texts, which is used to estimate the state-of-the-art accuracy for identifying four types of relations among entities: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, and Diseasename–Indication. On the presented dataset, drawn from the Russian Drug Review Corpus, the method achieves an F1-score of 81.2% (obtained by cross-validation and averaged over the four relation types), which is 7.8% higher than the baseline classifiers.

Keywords: pharmacological text corpus; automatic relation extraction; natural language processing; deep learning

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
