
Regularized Contrastive Masked Autoencoder Model for Machinery Anomaly Detection Using Diffusion-Based Data Augmentation

Esmaeil Zahedi, Mohamad Saraee, Fatemeh Sadat Masoumi and Mohsen Yazdinejad

Abstract

Unsupervised anomalous sound detection, especially with self-supervised methods, plays a crucial role in distinguishing unknown abnormal machine sounds from normal ones. Self-supervised learning can be divided into two main categories: Generative and Contrastive methods. While Generative methods mainly focus on reconstructing data, Contrastive learning methods refine data representations by leveraging the contrast between each sample and its augmented version. However, existing Contrastive learning methods for anomalous sound detection often suffer from two main problems. The first is that they mostly rely on simple augmentation techniques, such as time or frequency masking, which may introduce biases because they do not reflect the diversity of real-world sounds and noises encountered in practical scenarios (e.g., factory noise mixed with machine sounds). The second is dimensional collapse, which leads to a feature space with limited representational capacity. To address the first shortcoming, we propose a diffusion-based data augmentation method that employs ChatGPT and AudioLDM. To address the second, we propose a two-stage self-supervised model. In the first stage, we introduce a novel approach that combines Contrastive learning and masked autoencoders to pre-train on the MIMII and ToyADMOS2 datasets. This combination allows our model to capture both global and local features, leading to a more comprehensive representation of the data. In the second stage, we refine the audio representations for each machine ID by employing supervised Contrastive learning to fine-tune the pre-trained model. This process strengthens the relationship between audio features originating from the same machine ID. Experiments show that our method outperforms most state-of-the-art self-supervised learning methods. Our proposed model achieves an average AUC of 94.39% and an average pAUC of 87.93% on the DCASE 2020 Challenge Task 2 dataset.
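To make the training objectives described above concrete, the following is a minimal, illustrative sketch in PyTorch of how a masked-autoencoder reconstruction loss, an unsupervised contrastive (NT-Xent) loss between two views, and a supervised Contrastive loss over machine IDs can be combined. It is not the authors' implementation: the patch layout, the 0.75 mask ratio, the temperatures, the module sizes, and the zero-masking simplification (a full masked autoencoder drops masked patches from the encoder and reconstructs them with a separate decoder) are assumptions made only for this example.

```python
# Illustrative sketch only (not the authors' code): masked reconstruction +
# contrastive objectives over log-mel spectrogram patches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveMaskedAE(nn.Module):
    """Tiny masked-autoencoder-style encoder over spectrogram patches.

    Simplification: masked patches are zeroed rather than dropped, and the
    reconstruction loss is computed only on the masked positions.
    """
    def __init__(self, patch_dim=64, d_model=128, max_patches=256):
        super().__init__()
        self.embed = nn.Linear(patch_dim, d_model)
        self.pos = nn.Parameter(torch.zeros(1, max_patches, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.recon_head = nn.Linear(d_model, patch_dim)          # local features
        self.proj_head = nn.Sequential(nn.Linear(d_model, 64), nn.ReLU(),
                                       nn.Linear(64, 32))        # global feature

    def forward(self, patches, mask_ratio=0.75):
        B, N, _ = patches.shape
        mask = torch.rand(B, N, device=patches.device) < mask_ratio  # True = masked
        corrupted = patches.masked_fill(mask.unsqueeze(-1), 0.0)
        z = self.encoder(self.embed(corrupted) + self.pos[:, :N])
        recon = self.recon_head(z)                                # per-patch reconstruction
        clip_emb = F.normalize(self.proj_head(z.mean(1)), dim=-1) # clip-level embedding
        return recon, mask, clip_emb

def nt_xent(z1, z2, temperature=0.1):
    """Unsupervised contrastive loss between two views of the same clips."""
    z = torch.cat([z1, z2], dim=0)
    sim = z @ z.t() / temperature
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))                     # drop self-similarity
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(B)]).to(z.device)
    return F.cross_entropy(sim, targets)

def supcon_loss(emb, machine_ids, temperature=0.1):
    """Supervised contrastive loss: clips sharing a machine ID are positives."""
    emb = F.normalize(emb, dim=-1)
    sim = emb @ emb.t() / temperature
    eye = torch.eye(emb.size(0), dtype=torch.bool, device=emb.device)
    pos = (machine_ids.unsqueeze(0) == machine_ids.unsqueeze(1)) & ~eye
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float('-inf')),
                                     dim=1, keepdim=True)
    counts = pos.sum(1)
    keep = counts > 0                                             # anchors with a positive
    return (-(log_prob * pos).sum(1)[keep] / counts[keep]).mean()

if __name__ == "__main__":
    model = ContrastiveMaskedAE()
    view_a = torch.randn(8, 32, 64)   # 8 clips, 32 patches of 64 mel bins each
    view_b = torch.randn(8, 32, 64)   # second view, e.g. a noise-augmented mixture
    ra, ma, ea = model(view_a)
    rb, mb, eb = model(view_b)
    stage1 = F.mse_loss(ra[ma], view_a[ma]) + F.mse_loss(rb[mb], view_b[mb]) \
             + nt_xent(ea, eb)
    ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])                  # toy machine-ID labels
    stage2 = supcon_loss(ea, ids)
    (stage1 + stage2).backward()
    print(float(stage1), float(stage2))
```

In the paper's pipeline the supervised Contrastive fine-tuning is a separate second stage applied after pre-training; the demo above evaluates both losses in a single pass only to show their interfaces, and the second view would in practice come from the diffusion-based augmentation (e.g., mixing AudioLDM-generated factory noise into the clip).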

Similar articles

 
Zihang Xu and Chiawei Chu    
Ensuring the sustainability of transportation infrastructure for electric vehicles (e-trans) is increasingly imperative in the pursuit of decarbonization goals and addressing the pressing energy shortage. By prioritizing the development and maintenance o... see more
Journal: Applied Sciences

 
Dawei Luo, Heng Zhou, Joonsoo Bae and Bom Yun    
Reliability and robustness are fundamental requisites for the successful integration of deep-learning models into real-world applications. Deployed models must exhibit an awareness of their limitations, necessitating the ability to discern out-of-distrib... see more
Journal: Applied Sciences

 
Yubo Zheng, Yingying Luo, Hengyi Shao, Lin Zhang and Lei Li    
Contrastive learning, as an unsupervised technique, has emerged as a prominent method in time series representation learning tasks, serving as a viable solution to the scarcity of annotated data. However, the application of data augmentation methods duri... see more
Journal: Applied Sciences

 
Somaiyeh Dehghan and Mehmet Fatih Amasyali    
BERT, the most popular deep learning language model, has yielded breakthrough results in various NLP tasks. However, the semantic representation space learned by BERT has the property of anisotropy. Therefore, BERT needs to be fine-tuned for certain down... see more
Journal: Applied Sciences

 
Ji Zhang, Xiangze Jia, Zhen Wang, Yonglong Luo, Fulong Chen, Gaoming Yang and Lihui Zhao    
Skeleton-based action recognition depends on skeleton sequences to detect categories of human actions. In skeleton-based action recognition, the recognition of action scenes with more than one subject is known as interaction recognition. Different from t... see more
Journal: Algorithms