ARTICLE
TITLE

REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets

Carmen Brando    
Francesca Frontini    
Jean-Gabriel Ganascia    

Abstract

This paper proposes REDEN, a graph-based Named Entity Linking (NEL) algorithm for disambiguating authors' names in French literary criticism and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL reported in the current state of the art: candidate retrieval and candidate selection. REDEN first draws on several Linked Data sources to select candidates for each author mention, then crawls data from other Linked Data sets by following equivalence links (e.g., owl:sameAs), and finally fuses the graphs of homologous individuals into a non-redundant graph suited to graph centrality calculation; the resulting graph is used to choose the best referent. REDEN is distributed as open source and follows current standards for digital editions (TEI) and the Semantic Web (RDF), so it can plausibly be integrated into the editorial workflow of digital editions in digital humanities and cultural heritage projects. Experiments and the corresponding error analysis test the approach and expose the algorithm's strengths and weaknesses, pointing to further improvements of REDEN.
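
The pipeline outlined in the abstract (candidate retrieval, fusion over equivalence links, centrality-based selection) can be illustrated with a minimal sketch. This is not REDEN's actual implementation: the function names, the `a:`/`b:`/`ex:` prefixes, the toy triples, and the simplified degree-style score are all assumptions made for illustration. Candidates are taken as already retrieved, `same_as` stands in for crawled owl:sameAs links, and a candidate scores one point for each node it shares with a candidate of another mention.

```python
from collections import defaultdict

def canonical(uri, same_as):
    """Follow (hypothetical) owl:sameAs links to one representative URI."""
    seen = set()
    while uri in same_as and uri not in seen:
        seen.add(uri)
        uri = same_as[uri]
    return uri

def fuse(triples, same_as):
    """Rewrite subjects and objects to canonical URIs; the set drops duplicate edges."""
    return {(canonical(s, same_as), p, canonical(o, same_as)) for s, p, o in triples}

def select_referents(candidates, triples, same_as):
    """For each mention, pick the candidate with the highest degree-style score."""
    neighbours = defaultdict(set)          # candidate URI -> nodes it links to
    for s, _, o in fuse(triples, same_as):
        neighbours[s].add(o)

    mentions_for_node = defaultdict(set)   # node -> mentions whose candidates reach it
    for mention, uris in candidates.items():
        for uri in uris:
            for node in neighbours[canonical(uri, same_as)]:
                mentions_for_node[node].add(mention)

    def score(uri):
        # Count only nodes shared with candidates of at least one other mention.
        linked = neighbours[canonical(uri, same_as)]
        return sum(1 for node in linked if len(mentions_for_node[node]) > 1)

    return {mention: max(uris, key=score) for mention, uris in candidates.items()}

# Toy data (assumed, not taken from the paper's corpus or data sets).
candidates = {
    "Hugo": ["a:Victor_Hugo", "a:Hugo_Grotius"],
    "Balzac": ["a:Honore_de_Balzac", "a:Guez_de_Balzac"],
}
triples = [
    ("a:Victor_Hugo", "ex:period", "a:19th_century"),
    ("a:Hugo_Grotius", "ex:field", "a:Law"),
    ("b:Honore_de_Balzac", "ex:period", "b:19th_century"),  # from a second data set
    ("a:Guez_de_Balzac", "ex:field", "a:Epistolary_prose"),
]
same_as = {
    "b:Honore_de_Balzac": "a:Honore_de_Balzac",
    "b:19th_century": "a:19th_century",
}

print(select_referents(candidates, triples, same_as))
```

In this toy case the two 19th-century candidates are connected only after the `b:` URIs are fused into their `a:` equivalents, which is why fusion precedes scoring. Ties are broken by candidate order here; a real system would need a more principled rule.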
