ARTICLE
TITLE

AN EMPIRICAL ANALYSIS OF SIMILARITY MEASURES FOR UNSTRUCTURED DATA

Mausumi Goswami    
B.S Purkayastha    

Abstract

With the rapid growth of digital text documents on the internet and in digital repositories, the pool of digital documents grows larger every day. This growth calls for efficient and effective techniques to handle such an enormous amount of data. To mine documents, it is essential to understand them properly, and text similarity measurement plays a major role in finding coherence among them. The goal of similarity computation is to identify cohesion among text documents and to prepare the text for applications such as document organization, plagiarism detection, and query matching. It is one of the most fundamental tasks in information retrieval, information extraction, document organization, plagiarism detection, and text mining. The effectiveness of document clustering, in particular, depends heavily on it. In this paper, four similarity measures are implemented and their descriptive statistics are compared. The results are found to be satisfactory, and graphs are drawn to visualize them.
