Information, Vol. 13, No. 12 (2022)
ARTICLE
TITLE

DEGAIN: Generative-Adversarial-Network-Based Missing Data Imputation

Reza Shahbazian and Irina Trubitsyna    

Abstract

Insights and analysis are only as good as the available data. Data cleaning is one of the most important steps in producing quality data for decision making. Machine learning (ML) helps process data quickly and create error-free or limited-error datasets. One of the quality standards for data cleaning is the handling of missing data, also known as data imputation. This research focuses on the use of machine learning methods to deal with missing data. In particular, we propose a generative adversarial network (GAN) based model called DEGAIN to estimate the missing values in a dataset. We evaluate the performance of the presented method and compare the results with some existing methods on the publicly available Letter Recognition and SPAM datasets. The Letter dataset consists of 20,000 samples and 16 input features, and the SPAM dataset consists of 4601 samples and 57 input features. The results show that the proposed DEGAIN outperforms the existing methods in terms of root mean square error and Fréchet inception distance metrics.
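To make the root mean square error metric concrete, the following is a minimal sketch of how an imputation method is commonly scored. It does not implement DEGAIN. It uses a simple column-mean baseline on synthetic data, and the protocol (hiding entries at random, then measuring error only on the hidden entries) is an assumption, since the abstract does not describe the evaluation setup. The function names `mask_values`, `mean_impute`, and `imputation_rmse`, and the 20% missing rate, are hypothetical.

```python
import math
import random


def mask_values(data, missing_rate, rng):
    """Hide entries at random. mask[i][j] is 1 if the value is kept, 0 if hidden."""
    mask = [[0 if rng.random() < missing_rate else 1 for _ in row] for row in data]
    observed = [
        [x if m else None for x, m in zip(row, mask_row)]
        for row, mask_row in zip(data, mask)
    ]
    return mask, observed


def mean_impute(observed):
    """Baseline (not DEGAIN): fill each missing entry with its column's observed mean."""
    n_cols = len(observed[0])
    means = []
    for j in range(n_cols):
        col = [row[j] for row in observed if row[j] is not None]
        means.append(sum(col) / len(col) if col else 0.0)
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in observed
    ]


def imputation_rmse(truth, imputed, mask):
    """RMSE over the hidden entries only (mask == 0)."""
    squared_errors = [
        (t - p) ** 2
        for t_row, p_row, m_row in zip(truth, imputed, mask)
        for t, p, m in zip(t_row, p_row, m_row)
        if m == 0
    ]
    if not squared_errors:
        return 0.0
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic stand-in data: 200 samples, 4 features, values in [0, 1).
rng = random.Random(0)
truth = [[rng.random() for _ in range(4)] for _ in range(200)]
mask, observed = mask_values(truth, 0.2, rng)  # assumed 20% missing rate
imputed = mean_impute(observed)
rmse = imputation_rmse(truth, imputed, mask)
print(f"RMSE on hidden entries: {rmse:.4f}")
```

A learned imputer would replace `mean_impute` in this pipeline, and a lower value from `imputation_rmse` would indicate estimates closer to the hidden ground truth.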
