Algorithms, Vol. 16, No. 1 (2023)

Data Augmentation Methods for Enhancing Robustness in Text Classification Tasks

Huidong Tang, Sayaka Kamei and Yasuhiko Morimoto

Abstract

Text classification is widely studied in natural language processing (NLP). Deep learning models, including large pre-trained models such as BERT and DistilBERT, have achieved impressive results on text classification tasks. However, these models' robustness against adversarial attacks remains a concern. To address it, we propose three data augmentation methods that improve the robustness of such pre-trained models. We evaluated our methods on four text classification datasets by fine-tuning DistilBERT on the augmented datasets and exposing the resulting models to adversarial attacks. In addition to enhancing robustness, our proposed methods can improve accuracy and F1-score on three of the datasets. We also ran comparison experiments with two existing data augmentation methods: one of our proposed methods achieves a performance improvement similar to theirs, while all three of our methods achieve a superior robustness improvement.
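
The evaluation described in the abstract (accuracy, F1-score, and robustness under adversarial attack) can be illustrated with a minimal sketch. The abstract does not give the exact metric definitions, so the functions below are assumptions: `f1_binary` covers only the binary case, and `attack_success_rate` uses one common convention (the share of originally correct predictions that an attack flips). All function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score for a single positive class (binary case only; an assumption)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def attack_success_rate(
    y_true: Sequence[int],
    clean_pred: Sequence[int],
    adv_pred: Sequence[int],
) -> float:
    """Among examples classified correctly before the attack, the fraction
    misclassified after it. Lower values indicate a more robust model."""
    initially_correct = [
        (t, a) for t, c, a in zip(y_true, clean_pred, adv_pred) if t == c
    ]
    if not initially_correct:
        return 0.0
    return sum(t != a for t, a in initially_correct) / len(initially_correct)


# Toy labels for illustration only (not data from the paper).
y_true = [1, 0, 1, 1, 0, 1]
clean_pred = [1, 0, 1, 0, 0, 1]   # predictions on the original test inputs
adv_pred = [0, 0, 1, 0, 1, 1]     # predictions on adversarially perturbed inputs

print("clean accuracy:", accuracy(y_true, clean_pred))
print("clean F1:", f1_binary(y_true, clean_pred))
print("accuracy under attack:", accuracy(y_true, adv_pred))
print("attack success rate:", attack_success_rate(y_true, clean_pred, adv_pred))
```

Under this convention, comparing a baseline model with one fine-tuned on augmented data amounts to computing these numbers for both and checking whether the attack success rate drops while clean accuracy and F1 hold or improve.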
