ARTICLE
TITLE

Comparative Study of Clustering Algorithms using OverallSimSUX Similarity Function for XML Documents

Damny Magdaleno Guevara    
Yadriel Miranda    
Ivett Fuentes    
María García

Abstract

A huge amount of information is represented in XML format. Several tools have been developed to store and query XML data. It has become essential to develop high-performance techniques for efficiently analysing extremely large collections of XML data. One method that many researchers have focused on is clustering, which groups similar XML documents according to their content and structure. Previous work proposed the similarity function OverallSimSUX, which captures the degree of similarity among documents, together with a novel methodology for clustering XML documents using both structural and content features. Although this methodology performs well, as supported by experiments on several corpora and by statistical tests, it has so far been applied with only one clustering algorithm, K-Star, so its behaviour when that algorithm is replaced by others with different characteristics is unknown. Therefore, to validate the methodology fully, in this work we present a comparative study of the effects of applying the methodology for computing the OverallSimSUX similarity function with clustering algorithms of different types. Our analysis yields two main results: (1) the Fuzzy-SKWIC clustering algorithm performs best both with and without the methodology, although there are no significant differences with respect to the K-Star clustering algorithm; (2) for each algorithm analysed, using the methodology gives better results than not using it.
