Algorithms, Vol. 16, No. 12 (2023)
ARTICLE
TITLE

Improving Clustering Accuracy of K-Means and Random Swap by an Evolutionary Technique Based on Careful Seeding

Libero Nigro and Franco Cicirelli    

Abstract

K-Means is a "de facto" standard clustering algorithm due to its simplicity and efficiency. However, K-Means strongly depends on the initialization of the centroids (the seeding method) and often gets stuck in a sub-optimal local solution. In fact, K-Means mainly acts as a local refiner of the centroids and is unable to move centroids across the whole data space. Random Swap was defined to go beyond K-Means: its modus operandi integrates K-Means into a global strategy of centroid management, which can often generate a clustering solution close to the global optimum. This paper proposes an approach that extends both K-Means and Random Swap and improves clustering accuracy through an evolutionary technique and careful seeding. Two new algorithms are proposed: Population-Based K-Means (PB-KM) and Population-Based Random Swap (PB-RS). Both algorithms consist of two steps: first, a population of J candidate solutions is built; then, the candidate centroids are repeatedly recombined toward a final accurate solution. The paper motivates the design of PB-KM and PB-RS, outlines their current implementation in Java based on parallel streams, and demonstrates the achievable clustering accuracy using both synthetic and real-world datasets.
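
To make the relationship between K-Means and Random Swap concrete, the following is a minimal Java sketch of the Random Swap loop as it is commonly described: a randomly chosen centroid is relocated to a randomly chosen data point, a few K-Means iterations refine the trial solution, and the trial is kept only if the sum of squared errors (SSE) decreases. This is not the authors' implementation. The class and method names (`RandomSwapSketch`, `kMeansStep`, `randomSwap`), the uniform random initial seeding, and the choice of two K-Means iterations per trial are all assumptions made for illustration; parallel streams are used only because the abstract mentions them.

```java
import java.util.Random;
import java.util.stream.IntStream;

// Illustrative sketch only: names and parameters are hypothetical.
public class RandomSwapSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    // Index of the centroid nearest to point p (ties go to the lower index).
    static int nearest(double[] p, double[][] c) {
        int best = 0;
        for (int k = 1; k < c.length; k++)
            if (dist2(p, c[k]) < dist2(p, c[best])) best = k;
        return best;
    }

    // Sum of squared errors, evaluated with a parallel stream over the points.
    static double sse(double[][] x, double[][] c) {
        return IntStream.range(0, x.length).parallel()
                .mapToDouble(i -> dist2(x[i], c[nearest(x[i], c)])).sum();
    }

    // One K-Means (Lloyd) iteration: assign points, then recompute centroids.
    static double[][] kMeansStep(double[][] x, double[][] c) {
        int[] label = IntStream.range(0, x.length).parallel()
                .map(i -> nearest(x[i], c)).toArray();
        int dim = x[0].length;
        double[][] next = new double[c.length][dim];
        int[] count = new int[c.length];
        for (int i = 0; i < x.length; i++) {
            count[label[i]]++;
            for (int d = 0; d < dim; d++) next[label[i]][d] += x[i][d];
        }
        for (int k = 0; k < c.length; k++) {
            if (count[k] == 0) { next[k] = c[k].clone(); continue; } // empty cluster: keep old centroid
            for (int d = 0; d < dim; d++) next[k][d] /= count[k];
        }
        return next;
    }

    // Random Swap: trial swap, short K-Means refinement, accept only if SSE improves.
    static double[][] randomSwap(double[][] x, int k, int swaps, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        for (int j = 0; j < k; j++) best[j] = x[rnd.nextInt(x.length)].clone(); // assumed uniform seeding
        double bestSse = sse(x, best);
        for (int t = 0; t < swaps; t++) {
            double[][] trial = new double[k][];
            for (int j = 0; j < k; j++) trial[j] = best[j].clone();
            trial[rnd.nextInt(k)] = x[rnd.nextInt(x.length)].clone();     // the swap
            for (int it = 0; it < 2; it++) trial = kMeansStep(x, trial);  // assumed 2 iterations
            double s = sse(x, trial);
            if (s < bestSse) { best = trial; bestSse = s; }
        }
        return best;
    }
}
```

A call such as `RandomSwapSketch.randomSwap(data, 2, 50, 42L)` returns the best centroids found after 50 trial swaps. The accept-only-if-better rule is what lets a swap move a centroid across the data space without ever worsening the current solution; K-Means alone only refines centroids locally.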
