Inicio  /  Applied Sciences  /  Vol: 13 Par: 6 (2023)  /  Artículo
ARTÍCULO
TITULO

Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks

Javad Hassannataj Joloudari    
Abdolreza Marefat    
Mohammad Ali Nematollahi    
Solomon Sunday Oyelere and Sadiq Hussain    

Resumen

Imbalanced Data (ID) is a problem that deters Machine Learning (ML) models from achieving satisfactory results. ID is the occurrence of a situation where the quantity of the samples belonging to one class outnumbers that of the other by a wide margin, making such models? learning process biased towards the majority class. In recent years, to address this issue, several solutions have been put forward, which opt for either synthetically generating new data for the minority class or reducing the number of majority classes to balance the data. Hence, in this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) mixed with a variety of well-known imbalanced data solutions meaning oversampling and undersampling. Then, we propose a CNN-based model in combination with SMOTE to effectively handle imbalanced data. To evaluate our methods, we have used KEEL, breast cancer, and Z-Alizadeh Sani datasets. In order to achieve reliable results, we conducted our experiments 100 times with randomly shuffled data distributions. The classification results demonstrate that the mixed Synthetic Minority Oversampling Technique (SMOTE)-Normalization-CNN outperforms different methodologies achieving 99.08% accuracy on the 24 imbalanced datasets. Therefore, the proposed mixed model can be applied to imbalanced binary classification problems on other real datasets.

 Artículos similares

       
 
Zeyang Zhang, Zhongcai Pei, Zhiyong Tang and Fei Gu    
Traditional video object segmentation often has low detection speed and inaccurate results due to the jitter caused by the pan-and-tilt or hand-held devices. Deep neural network (DNN) has been widely adopted to address these problems; however, it relies ... ver más
Revista: Applied Sciences

 
Barbara Pes    
Class imbalance and high dimensionality are two major issues in several real-life applications, e.g., in the fields of bioinformatics, text mining and image classification. However, while both issues have been extensively studied in the machine learning ... ver más
Revista: Information