Extraction of the Relations among Significant Pharmacological Entities in Russian-Language Reviews of Internet Users on Medications

Alexander Sboev

Anton Selivanov

Ivan Moloshnikov

Roman Rybka

Artem Gryaznov

Sanna Sboeva and Gleb Rylkov

Abstract

Nowadays, the analysis of digital media aimed at predicting society's reaction to particular events and processes is a task of great significance. Internet sources contain a large amount of meaningful information for a range of domains, such as marketing, author profiling, social situation analysis, and healthcare. In healthcare, this information is useful for pharmacovigilance purposes, including the re-profiling of medications. Analyzing these sources requires the development of automatic natural language processing methods. These methods, in turn, require text datasets with complex annotation, including information about named entities and the relations between them. As the analysis of the relevant literature shows, Russian-language datasets with annotated entity relations are scarce, and none have existed so far in the medical domain. This paper presents the first Russian-language textual corpus where entities have labels of different contexts within a single text, so that related entities share a common context. Therefore, this corpus is suitable for the task of extracting relations between entities belonging to the medical domain. Our second contribution is a method for the automated extraction of entity relations in Russian-language texts using the XLM-RoBERTa language model pre-trained on Russian drug review texts. A comparison with other machine learning methods is performed to estimate the efficiency of the proposed method. The method yields state-of-the-art accuracy in extracting the following relationship types: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, Diseasename–Indication. As shown on the presented subcorpus of the Russian Drug Review Corpus, the method achieves a mean F1-score of 80.4% (estimated with cross-validation, averaged over the four relationship types). This result is 3.6% higher than that of the existing language model RuBERT, and 21.77% higher than that of basic ML classifiers.
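The relation-extraction setup described in the abstract can be pictured as classifying pairs of entities, where a pair counts as related when both entities share a context. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Entity` class, its `contexts` field, the `candidate_pairs` helper, and the sample review are all hypothetical, and the XLM-RoBERTa classifier itself is not shown. Only the four relation types come from the abstract.

```python
from dataclasses import dataclass
from itertools import product

# The four relation types named in the abstract, as (first type, second type).
RELATION_TYPES = [
    ("ADR", "Drugname"),
    ("Drugname", "Diseasename"),
    ("Drugname", "SourceInfoDrug"),
    ("Diseasename", "Indication"),
]


@dataclass(frozen=True)
class Entity:
    """Hypothetical annotated entity; field names are assumptions."""
    text: str
    etype: str
    contexts: frozenset  # ids of the contexts this entity is labeled with


def candidate_pairs(entities):
    """Yield (first, second, related) for each pair matching a relation type.

    A pair is marked related when the two entities share at least one
    context id, following the abstract's statement that related entities
    share a common context.
    """
    for first_type, second_type in RELATION_TYPES:
        firsts = [e for e in entities if e.etype == first_type]
        seconds = [e for e in entities if e.etype == second_type]
        for a, b in product(firsts, seconds):
            yield a, b, bool(a.contexts & b.contexts)


# Illustrative (made-up) review with two drugs mentioned in separate contexts.
review = [
    Entity("drug A", "Drugname", frozenset({1})),
    Entity("drug B", "Drugname", frozenset({2})),
    Entity("headache", "ADR", frozenset({2})),
    Entity("flu", "Diseasename", frozenset({1})),
]

pairs = list(candidate_pairs(review))
for a, b, related in pairs:
    print(f"{a.etype}-{b.etype}: ({a.text}, {b.text}) related={related}")
```

In a setup like this, each candidate pair together with the review text would be passed to a binary classifier, and the context-derived flag would serve as the training label.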

Keywords

pharmacological text corpus - automatic relation extraction - natural language processing - deep learning


Issue

Volume: 6, Part: 1 (2022)

Subjects

Infrastructure


DOI

https://doi.org/10.3390/bdcc6010010
