Applied Sciences, Vol. 12, Issue 24 (2022)

Improved SOSK-Means Automatic Clustering Algorithm with a Three-Part Mutualism Phase and Random Weighted Reflection Coefficient for High-Dimensional Datasets

Abiodun M. Ikotun and Absalom E. Ezugwu    

Abstract

Automatic clustering problems require clustering algorithms to estimate the number of clusters in a dataset automatically. The classical K-means algorithm, however, requires the number of clusters to be specified a priori. To address this problem, metaheuristic algorithms are hybridized with K-means to extend its capacity to handle automatic clustering problems. In this study, we propose an improved version of an existing hybridization of the classical symbiotic organisms search (SOS) algorithm with the classical K-means algorithm, to provide robust and optimum data clustering performance in automatic clustering problems. Because the classical K-means algorithm is sensitive to noisy data and outliers, we propose excluding outliers from the centroid update procedure: a global threshold on the point-to-centroid distance distribution is used to detect outliers automatically, and the detected outliers are then excluded from the calculation of new centroids in the K-means phase. Furthermore, a self-adaptive benefit factor with a three-part mutualism phase is incorporated into the SOS phase to enhance the performance of the hybrid algorithm. A population size of 40 + 2g was used for the SOS algorithm to obtain a well-distributed initial solution sample, based on the central limit theorem, under which selecting the right sample size produces a sample mean that approximates the true centroid of a Gaussian distribution. The effectiveness and robustness of the improved hybrid algorithm were evaluated on 42 datasets. The results were compared with those of the existing hybrid algorithm, the standard SOS and K-means algorithms, and other hybrid and non-hybrid metaheuristic algorithms. Finally, statistical and convergence analysis tests were conducted to measure the effectiveness of the improved algorithm. The results of the extensive computational experiments showed that the proposed improved hybrid algorithm outperformed the existing SOSK-means algorithm and demonstrated superior performance compared to some of the competing hybrid and non-hybrid metaheuristic algorithms.
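The outlier handling described in the abstract can be illustrated with a minimal sketch of one K-means centroid update. The abstract does not state how the global threshold is computed, so the rule used here (mean plus `k_sigma` standard deviations of all point-to-centroid distances) is an assumption made only for illustration, as are the function name `update_centroids_excluding_outliers` and the parameter `k_sigma`. This is not the authors' implementation.

```python
import numpy as np

def update_centroids_excluding_outliers(X, centroids, k_sigma=3.0):
    """One K-means update step that leaves flagged outliers out of the new centroids."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Distance from every point to every centroid, shape (n_points, n_clusters).
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    nearest = dists[np.arange(len(X)), labels]
    # ASSUMPTION: one global threshold over all point-to-centroid distances,
    # taken here as mean + k_sigma * std. The abstract does not give the rule.
    threshold = nearest.mean() + k_sigma * nearest.std()
    inliers = nearest <= threshold
    new_centroids = centroids.copy()
    for c in range(len(centroids)):
        members = X[(labels == c) & inliers]
        if len(members) > 0:  # keep the old centroid if no inlier is assigned to it
            new_centroids[c] = members.mean(axis=0)
    return new_centroids, labels, inliers

# Hypothetical toy data: two tight groups plus one distant point.
X = np.array([[0.0, 0.0]] * 10 + [[10.0, 10.0]] * 10 + [[100.0, 100.0]])
init = np.array([[0.0, 0.0], [10.0, 10.0]])
new_centroids, labels, inliers = update_centroids_excluding_outliers(X, init)
```

In this toy example the distant point is still assigned to its nearest cluster, but it does not contribute to that cluster's new centroid, so the centroid is not dragged toward it.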
