Applied Sciences, Vol. 11, Issue 14 (2021)
ARTICLE
TITLE

Linked Data Triples Enhance Document Relevance Classification

Dinesh Nagumothu, Peter W. Eklund, Bahadorreza Ofoghi and Mohamed Reda Bouadjenek

Abstract

Standardized approaches to relevance classification in information retrieval use generative statistical models to identify the presence or absence of certain topics that might make a document relevant to the searcher. These approaches have been used to better predict relevance on the basis of what the document is "about", rather than a simple-minded analysis of the bag of words contained within the document. In more recent times, this idea has been extended by using pre-trained deep learning models and text representations, such as GloVe or BERT. These use an external corpus as a knowledge base that conditions the model to help predict what a document is about. This paper adopts a hybrid approach that leverages the structure of knowledge embedded in a corpus. In particular, the paper reports on experiments in which linked data triples (subject-predicate-object), constructed from natural language elements, are derived from deep learning. These are evaluated as additional latent semantic features for a relevant-document classifier in a customized news-feed website. The research is a synthesis of current thinking in deep learning models in NLP and information retrieval and the predicate structure used in semantic web research. Our experiments indicate that linked data triples increased the F-score of the baseline GloVe representations by 6% and show significant improvement over state-of-the-art models, like BERT. The findings are tested and empirically validated on an experimental dataset and on two standardized pre-classified news sources, namely the Reuters and 20 Newsgroups datasets.
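
To make the idea of triples as additional features concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes plain token counts as a stand-in for the GloVe or BERT representations named in the abstract, and the triples are written by hand rather than derived by a deep learning model. The names `Triple`, `bag_of_words`, `triple_features` and `combined_features`, as well as the sample document, are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A linked data triple: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def bag_of_words(text: str) -> Counter:
    """Baseline features: lowercase token counts (stand-in for GloVe/BERT)."""
    return Counter(text.lower().split())


def triple_features(triples: List[Triple]) -> Counter:
    """Extra features: one count per whole triple.

    The "triple=" prefix keeps these keys from colliding with word features.
    """
    return Counter(
        f"triple={t.subject}|{t.predicate}|{t.obj}".lower() for t in triples
    )


def combined_features(text: str, triples: List[Triple]) -> Counter:
    """Merge word features and triple features for one document."""
    return bag_of_words(text) + triple_features(triples)


# Hypothetical document with a hand-written triple. The automatic
# extraction step described in the abstract is not shown here.
doc = "Acme acquires Widget Corp"
triples = [Triple("Acme", "acquires", "Widget Corp")]
features = combined_features(doc, triples)
```

The resulting `features` mapping could be passed to any classifier that accepts sparse count features. The point of the sketch is only that a triple preserves the subject-predicate-object relation that a bag of words discards.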
