ARTICLE
TITLE

Using topic modeling for communities clusterization in the VKontakte social network

Sergey Gorshkov    
Eugene Ilyushin    
Anastasia Chernysheva    
Viacheslav Goiko    
Dmitry Namiot    

Abstract

Topic modeling is one of the most widely used methods in text analysis. It can be used to identify topics as well as to find the distribution of topics in each document of the corpus. In this article, we present a method for clustering communities in the social network VKontakte (the most popular Russian social network) using topic modeling. As a sample of communities, we selected the set of groups to which several students of Tomsk State University are subscribed; there were about 7,000 of them in this set. The article describes how the text corpus was formed, as well as mathematical modeling using two popular classical methods, LDA and ARTM. A detailed description of these models, the quality assessment criteria, and the main practical techniques used by the authors in training the models is given. The aggregated results of clustering communities by topic are also presented, along with a method for expert evaluation of community topics based on visualization of the words that make up the lexical core of each topic.
