Inicio  /  Applied Sciences  /  Vol: 10 Par: 16 (2020)  /  Artículo
ARTÍCULO
TITULO

An ERNIE-Based Joint Model for Chinese Named Entity Recognition

Yu Wang    
Yining Sun    
Zuchang Ma    
Lisheng Gao and Yang Xu    

Resumen

Named Entity Recognition (NER) is the fundamental task for Natural Language Processing (NLP) and the initial step in building a Knowledge Graph (KG). Recently, BERT (Bidirectional Encoder Representations from Transformers), which is a pre-training model, has achieved state-of-the-art (SOTA) results in various NLP tasks, including the NER. However, Chinese NER is still a more challenging task for BERT because there are no physical separations between Chinese words, and BERT can only obtain the representations of Chinese characters. Nevertheless, the Chinese NER cannot be well handled with character-level representations, because the meaning of a Chinese word is quite different from that of the characters, which make up the word. ERNIE (Enhanced Representation through kNowledge IntEgration), which is an improved pre-training model of BERT, is more suitable for Chinese NER because it is designed to learn language representations enhanced by the knowledge masking strategy. However, the potential of ERNIE has not been fully explored. ERNIE only utilizes the token-level features and ignores the sentence-level feature when performing the NER task. In this paper, we propose the ERNIE-Joint, which is a joint model based on ERNIE. The ERNIE-Joint can utilize both the sentence-level and token-level features by joint training the NER and text classification tasks. In order to use the raw NER datasets for joint training and avoid additional annotations, we perform the text classification task according to the number of entities in the sentences. The experiments are conducted on two datasets: MSRA-NER and Weibo. These datasets contain Chinese news data and Chinese social media data, respectively. The results demonstrate that the ERNIE-Joint not only outperforms BERT and ERNIE but also achieves the SOTA results on both datasets.

 Artículos similares

       
 
Xiaoyu Han, Yue Zhang, Wenkai Zhang and Tinglei Huang    
Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides information contained in the sentence, additional information about the entities is verified to... ver más
Revista: Information