ARTICLE
TITLE

A Semantic-Spatial Aware Data Conflation Approach for Place Knowledge Graphs

Lianlian He    
Hao Li and Rui Zhang    

Abstract

Recent advances in knowledge graphs show great promise for linking diverse data into a semantic network. Place is an important part of the knowledge graph because it acts as a glue that links any data to its georeference. A key technical challenge in constructing knowledge graphs that use location nodes as geographical references is the matching of place entities. Traditional methods typically rely on rule-based matching or machine-learning techniques to determine whether two place names refer to the same location. However, these approaches are often limited in the place features they select as matching criteria, which leads to an imbalanced consideration of spatial and semantic features. Deep feature-based methods, such as deep learning, show great promise for improving place data conflation. This paper introduces a Semantic-Spatial Aware Representation Learning Model (SSARLM) for place matching. SSARLM removes the tedious manual feature extraction step inherent in traditional methods, enabling an end-to-end place entity matching pipeline. Furthermore, we introduce an embedding fusion module designed for the unified encoding of semantic and spatial information. In the experiments, we evaluate the approach on named places from the cities of Guangzhou and Shanghai in GeoNames, OpenStreetMap (OSM), and Baidu Map. SSARLM is compared with several classical and commonly used binary classification machine learning models and with the state-of-the-art large language model GPT-4. The results demonstrate the benefit of pre-trained models in the data conflation of named places.
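
For orientation, the kind of traditional rule-based matching that the abstract contrasts with SSARLM can be sketched as follows. This is a minimal illustration, not SSARLM and not the authors' code: it combines one hand-picked semantic feature (name string similarity) with one hand-picked spatial feature (great-circle distance). The record fields (`name`, `lat`, `lon`), the weight `w_name`, the decay scale `scale_m`, and the decision threshold are all hypothetical choices made for this example.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (semantic feature)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(p, q, w_name=0.5, scale_m=500.0):
    """Weighted sum of name similarity and spatial proximity.

    w_name and scale_m are illustrative assumptions, not values
    taken from the paper.
    """
    sem = name_similarity(p["name"], q["name"])
    dist = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"])
    spa = math.exp(-dist / scale_m)  # 1.0 at zero distance, decays with distance
    return w_name * sem + (1.0 - w_name) * spa


def is_match(p, q, threshold=0.7):
    """Binary decision; the threshold is a hypothetical cut-off."""
    return match_score(p, q) >= threshold


# Hypothetical records from two sources; coordinates are approximate.
osm_place = {"name": "Canton Tower", "lat": 23.1065, "lon": 113.3245}
geonames_place = {"name": "canton tower", "lat": 23.1065, "lon": 113.3245}
other_place = {"name": "Shanghai Tower", "lat": 31.2335, "lon": 121.5055}

print(is_match(osm_place, geonames_place))
print(is_match(osm_place, other_place))
```

The fixed weight between the two features is exactly the kind of manual design decision that can leave spatial and semantic evidence unbalanced, which is the limitation the abstract attributes to traditional methods.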

Similar articles

       
 
Zhen Lei and Ting L. Lei    
Geospatial data conflation is the process of identifying and merging the corresponding features in two datasets that represent the same objects in reality. Conflation is needed in a wide range of geospatial analyses, yet it is a difficult task, often con...

 
Liufeng Tao, Kai Ma, Miao Tian, Zhenyang Hui, Shuai Zheng, Junjie Liu, Zhong Xie and Qinjun Qiu    
The efficient and precise retrieval of desired information from extensive geological databases is a prominent and pivotal focus within the realm of geological information services. Conventional information retrieval methods primarily rely on keyword matc...

 
Irina Kochetkova, Kseniia Leonteva, Ibram Ghebrial, Anastasiya Vlaskina, Sofia Burtseva, Anna Kushchazli and Konstantin Samouylov    
Fifth-generation (5G) networks provide network slicing capabilities, enabling the deployment of multiple logically isolated network slices on a single infrastructure platform to meet specific requirements of users. This paper focuses on modeling and anal...
Journal: Future Internet

 
Md Monowar Hossain, A. H. M. Faisal Anwar, Nikhil Garg, Mahesh Prakash and Mohammed Abdul Bari    
The fidelity of the decadal experiment in Coupled Model Intercomparison Project Phase-5 (CMIP5) has been examined, over different climate variables for multiple temporal and spatial scales, in many previous studies. However, most of the studies were for ...
Journal: Hydrology

 
Ying Sun, Yuefeng Lu, Ziqi Ding, Qiao Wen, Jing Li, Yanru Liu and Kaizhong Yao    
Most commonly used road-based homonymous entity matching algorithms are only applicable to the same scale, and are weak in recognizing the one-to-many and many-to-many types that are common in matching at different scales. This paper explores model match...