ARTICLE
TITLE

Extracting Spatio-Temporal Information from Chinese Archaeological Site Text

Wenjing Yuan
Lin Yang
Qing Yang
Yehua Sheng
Ziyang Wang

Abstract

Archaeological site texts are currently the main carrier of archaeological data and contain rich information. Efficiently extracting useful knowledge from this large body of unstructured text is important for mining and reusing archaeological information. Based on the site information recorded in Chinese archaeological site texts (such as name, location, cultural type, and dynasty), this paper combines deep learning and natural language processing techniques to study an information extraction method that automatically obtains the spatio-temporal information of sites. An initial corpus of Chinese archaeological site text is constructed for the first time and used to train two models: a Bidirectional Long Short-Term Memory with Conditional Random Fields (BiLSTM-CRF) model for entity recognition and a Bidirectional Gated Recurrent Units with Dual Attention (BiGRU-Dual Attention) model for relation extraction. On the test set, the BiLSTM-CRF and BiGRU-Dual Attention models reach F1 values of 87.87% and 88.05%, respectively. The study demonstrates that the proposed information extraction method is feasible for Chinese archaeological site texts; it supports the construction of knowledge graphs in archaeology and provides new methods and ideas for developing information mining technology in the field.
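
For readers unfamiliar with how an F1 value is typically obtained for entity recognition, the following is a minimal sketch and not the authors' evaluation code. It assumes BIO-style tags and exact-match scoring of entity spans, neither of which is specified in the abstract. The label names (SITE, LOC, DYN) and the helper names `bio_to_spans` and `span_f1` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    A stray I- tag that does not continue an open entity of the
    same type is ignored (an assumption of this sketch).
    """
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((etype, start, i))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Exact-match F1: a predicted span counts only if its type and
    both boundaries agree with a gold span."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical tag sequences for one sentence.
gold_tags = ["B-SITE", "I-SITE", "O", "B-LOC", "I-LOC", "O", "B-DYN"]
pred_tags = ["B-SITE", "I-SITE", "O", "B-LOC", "O", "O", "B-DYN"]

f1 = span_f1(bio_to_spans(gold_tags), bio_to_spans(pred_tags))
print(f"F1 = {f1:.4f}")
```

In this toy case the predicted location span is one token too short, so it counts as both a false positive and a false negative under exact matching. A token-level or partial-match scheme would score it differently.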
