Applied Sciences, Vol. 14, Issue 3 (2024)
Title

Two-Stage Dimensionality Reduction for Social Media Engagement Classification

Jose Luis Vieira Sobrinho, Flavio Henrique Teles Vieira and Alisson Assis Cardoso

Abstract

The high dimensionality of real-life datasets is one of the biggest challenges in machine learning. The higher the dimension of the input data, the greater the need for computational resources and the more difficult the learning task, a phenomenon commonly referred to as the curse of dimensionality. Building on this premise, we propose a two-stage dimensionality reduction (TSDR) method for data classification. The first stage extracts high-quality features into a new subset by maximizing the pairwise separation probability, with the aim of avoiding overlap between nearby individuals from different classes, also known as the class masking problem. The second stage takes the resulting subset and transforms it into a reduced final space in a way that maximizes the distance between the cluster centers of different classes while minimizing the dispersion of instances within the same class. The second stage thus aims to improve the accuracy of the subsequent classifier by lowering its sensitivity to an imbalanced distribution of instances across classes. Experiments on benchmark and social media datasets show that the proposed method is promising compared with some well-established algorithms, especially for social media engagement classification.
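
The abstract does not give the exact formulation of either stage, so the following is only a rough illustration of the kind of objective the second stage describes: pushing class centers apart while keeping each class compact. It uses the classical Fisher discriminant criterion as a stand-in, which is an assumption and not the authors' TSDR method. The function name `fisher_projection`, the ridge term, and the synthetic imbalanced data are all hypothetical choices made for this sketch.

```python
import numpy as np


def fisher_projection(X, y, n_components=1):
    """Illustrative Fisher-style projection (a stand-in, not the paper's TSDR).

    Finds directions that maximize between-class scatter relative to
    within-class scatter, then projects X onto them.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # within-class scatter (dispersion inside classes)
    S_b = np.zeros((d, d))  # between-class scatter (spread of class centers)
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += Xc.shape[0] * (diff @ diff.T)
    # Small ridge term so S_w is invertible (an assumption of this sketch).
    S_w += 1e-6 * np.eye(d)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W, W


# Synthetic, deliberately imbalanced two-class data (200 vs. 20 samples).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0, 0.0], 1.0, size=(200, 3)),
    rng.normal([3.0, 3.0, 0.0], 1.0, size=(20, 3)),
])
y = np.array([0] * 200 + [1] * 20)

Z, W = fisher_projection(X, y, n_components=1)
gap = abs(Z[y == 0].mean() - Z[y == 1].mean())
spread = max(Z[y == 0].std(), Z[y == 1].std())
print(f"center gap: {gap:.2f}, largest within-class std: {spread:.2f}")
```

In this toy setting the projected class centers end up several within-class standard deviations apart even though one class is ten times larger than the other, which is the qualitative behavior the second stage is said to target.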
