Inicio  /  Applied Sciences  /  Vol: 13 Par: 17 (2023)  /  Artículo
ARTÍCULO
TITULO

Improving Dimensionality Reduction Projections for Data Visualization

Bardia Rafieian    
Pedro Hermosilla and Pere-Pau Vázquez    

Resumen

In data science and visualization, dimensionality reduction techniques have been extensively employed for exploring large datasets. These techniques involve the transformation of high-dimensional data into reduced versions, typically in 2D, with the aim of preserving significant properties from the original data. Many dimensionality reduction algorithms exist, and nonlinear approaches such as the t-SNE (t-Distributed Stochastic Neighbor Embedding) and UMAP (Uniform Manifold Approximation and Projection) have gained popularity in the field of information visualization. In this paper, we introduce a simple yet powerful manipulation for vector datasets that modifies their values based on weight frequencies. This technique significantly improves the results of the dimensionality reduction algorithms across various scenarios. To demonstrate the efficacy of our methodology, we conduct an analysis on a collection of well-known labeled datasets. The results demonstrate improved clustering performance when attempting to classify the data in the reduced space. Our proposal presents a comprehensive and adaptable approach to enhance the outcomes of dimensionality reduction for visual data exploration.

 Artículos similares

       
 
Abdek Hassan Aden    
At the center of the Republic of Djibouti, an eroded rift called Asal is located where tectonic and magmatic activities can be observed at the surface. Multiple studies were carried out with different exploration methods, such as structural, geophysical ... ver más
Revista: Applied Sciences

 
Lexin Zhang, Ruihan Wang, Zhuoyuan Li, Jiaxun Li, Yichen Ge, Shiyun Wa, Sirui Huang and Chunli Lv    
This research introduces a novel high-accuracy time-series forecasting method, namely the Time Neural Network (TNN), which is based on a kernel filter and time attention mechanism. Taking into account the complex characteristics of time-series data, such... ver más
Revista: Information

 
Hadeel Alsolai and Marc Roper    
Various prediction models have been proposed by researchers to predict the change-proneness of classes based on source code metrics. However, some of these models suffer from low prediction accuracy because datasets exhibit high dimensionality or imbalan... ver más
Revista: Applied Sciences

 
Sifat Muin and Khalid M. Mosalam    
Machine learning (ML)-aided structural health monitoring (SHM) can rapidly evaluate the safety and integrity of the aging infrastructure following an earthquake. The conventional damage features used in ML-based SHM methodologies face the curse of dimens... ver más
Revista: Applied Sciences

 
Zeynel Cebeci and Cagatay Cebeci    
The goal of partitioning clustering analysis is to divide a dataset into a predetermined number of homogeneous clusters. The quality of final clusters from a prototype-based partitioning algorithm is highly affected by the initially chosen centroids. In ... ver más
Revista: Information