ARTICLE
TITLE

Analyzing Geographic Questions Using Embedding-based Topic Modeling

Jonghyeon Yang, Hanme Jang and Kiyun Yu

Abstract

Open-domain question answering has recently advanced considerably thanks to developments in large language models (LLMs), which have been applied successfully to question-answering (QA) systems, or chatbots. However, there has been little progress in open-domain question answering in the geographic domain. Existing research on geographic open-domain question answering relies heavily on rule-based semantic parsing approaches built on small amounts of data. To develop intelligent GeoQA agents, it is crucial to build QA systems on datasets that reflect real users' needs in the geographic domain. Existing studies have analyzed geographic questions in Microsoft MAchine Reading Comprehension (MS MARCO), a corpus comprising real-world user queries from Bing, in terms of structural similarity, which does not reveal users' interests. Therefore, we aimed to analyze location-related questions in MS MARCO based on semantic similarity, group similar questions into clusters, and use the results to discover users' interests in the geographic domain. Using a sentence-embedding-based topic modeling approach to cluster semantically similar questions, we obtained topic models that gather semantically similar documents into a single cluster. Furthermore, we discovered latent topics within a large collection of questions, which can guide practical GeoQA systems toward relevant questions.
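
The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the sentence encoder or the clustering algorithm, so the sketch substitutes a toy bag-of-words vector for a learned sentence embedding and a greedy similarity-threshold rule for the clustering step. The function names (`embed`, `cosine`, `cluster`, `top_terms`), the threshold value, and the sample questions are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(question):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", question.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(questions, threshold=0.5):
    """Greedy clustering: a question joins the first cluster whose seed
    vector is at least `threshold` similar, otherwise it starts a new one."""
    clusters = []  # list of (seed_vector, member_questions)
    for q in questions:
        vec = embed(q)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(q)
                break
        else:
            clusters.append((vec, [q]))
    return [members for _, members in clusters]


def top_terms(members, k=3):
    """Label a cluster with its k most frequent words (a crude topic label)."""
    counts = Counter()
    for q in members:
        counts.update(embed(q))
    return [word for word, _ in counts.most_common(k)]


# Hypothetical sample questions, not taken from MS MARCO.
questions = [
    "where is mount everest located",
    "where is mount fuji located",
    "what is the population of seoul",
    "what is the population of tokyo",
]
groups = cluster(questions)
for members in groups:
    print(top_terms(members), members)
```

With a real sentence encoder in place of `embed`, questions that share meaning but not wording could also land in the same cluster, which is the motivation for semantic rather than structural similarity.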

Keywords

 Similar articles

       
 
Dongling Ma, Chunhong Zhang, Liang Zhao, Qingji Huang and Baoze Liu    
Monitoring, analyzing, and managing public sentiment surrounding urban emergencies hold significant importance for city governments in executing effective response strategies and maintaining social stability. In this study, we present a study which was c...

 
Khaled Abuhasel    
This study compares the environmental sustainability of two cities in Saudi Arabia, Abha and Bisha, by analyzing their green spaces and applying spatial statistics tools in the Arc Map program to measure ...

 
Yanjie Sun, Mingguang Wu, Xiaoyan Liu and Liangchen Zhou    
High-precision dynamic traffic noise maps can describe the spatial and temporal distributions of noise and are necessary for actual noise prevention. Existing monitoring point-based methods suffer from limited spatial adaptability, and prediction model-b...

 
Xuehua Han and Juanle Wang    
Public behavior in cyberspace is extremely sensitive to emergency disaster events. Using appropriate methodologies to capture the semantic evolution of social media users' behaviors and discover how it varies across geographic space and time still presen...

 
Chuan Yin, Binyu Zhang, Wanzeng Liu, Mingyi Du, Nana Luo, Xi Zhai and Tu Ba    
Expansion of the entity attribute information of geographic knowledge graphs is essentially the fusion of the Internet's encyclopedic knowledge. However, it lacks structured attribute information, and synonymy and polysemy always exist. These reduce the ...