ARTICLE
TITLE

REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets

Carmen Brando    
Francesca Frontini    
Jean-Gabriel Ganascia    

Abstract

This paper proposes REDEN, a graph-based Named Entity Linking (NEL) algorithm for disambiguating authors' names in French literary criticism and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL reported in the current state of the art: candidate retrieval and candidate selection. REDEN draws on several Linked Data sources to select candidates for each author mention, then crawls data from other Linked Data sets by following equivalence links (e.g., owl:sameAs), and finally fuses the graphs of homologous individuals into a non-redundant graph suited to graph centrality calculation; the resulting graph is used to choose the best referent. REDEN is distributed as open source and follows current standards for digital editions (TEI) and the Semantic Web (RDF). Its integration into the editorial workflow of digital editions in digital humanities and cultural heritage projects is entirely plausible. Experiments, together with the corresponding error analysis, test the approach and examine the algorithm's strengths and weaknesses, with a view to further improving REDEN.
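The pipeline outlined in the abstract (candidate retrieval, fusion of equivalent individuals via owl:sameAs, centrality-based selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `ex:` and `bnf:` identifiers, the toy triples, and the scoring rule (a degree count restricted to nodes shared with candidates of another mention) are all assumptions made for illustration, and candidate retrieval is taken as already done.

```python
from collections import defaultdict


def make_canonicaliser(same_as_links):
    """Return a function mapping a URI to one representative of its
    owl:sameAs equivalence class (union-find; hypothetical helper)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Deterministic choice: the lexicographically smaller root wins.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return find


def select_referents(candidates, triples, same_as_links):
    """Pick one candidate URI per mention.

    candidates: dict mention -> list of candidate URIs (already retrieved)
    triples: iterable of (subject, predicate, object) URI tuples
    same_as_links: iterable of (uri, uri) equivalence pairs
    """
    canon = make_canonicaliser(same_as_links)

    # Fuse equivalent individuals into a single undirected graph.
    neighbours = defaultdict(set)
    for subj, _pred, obj in triples:
        s, o = canon(subj), canon(obj)
        if s != o:
            neighbours[s].add(o)
            neighbours[o].add(s)

    # Record which mentions can reach each node through a candidate.
    reached_by = defaultdict(set)
    for mention, uris in candidates.items():
        for uri in uris:
            for node in neighbours[canon(uri)]:
                reached_by[node].add(mention)

    def score(uri):
        # Assumed scoring: count neighbours shared with another mention.
        return sum(1 for n in neighbours[canon(uri)] if len(reached_by[n]) > 1)

    return {mention: max(uris, key=score) for mention, uris in candidates.items()}


# Toy data; every identifier below is hypothetical.
candidates = {
    "Hugo": ["ex:Victor_Hugo", "ex:Hugo_Grotius"],
    "Lamartine": ["ex:Alphonse_de_Lamartine"],
}
triples = [
    ("ex:Victor_Hugo", "ex:movement", "ex:Romanticism"),
    ("bnf:lamartine", "ex:movement", "ex:Romanticism"),
    ("ex:Hugo_Grotius", "ex:birthPlace", "ex:Delft"),
]
same_as_links = [("ex:Alphonse_de_Lamartine", "bnf:lamartine")]

result = select_referents(candidates, triples, same_as_links)
print(result)
```

In this toy case the equivalence link is what connects the Lamartine candidate to the shared node, which in turn favours the Victor Hugo candidate for the mention "Hugo"; without the fusion step the two candidates for "Hugo" would be indistinguishable.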
