Applied Sciences, Vol. 13, No. 18 (2023)
ARTICLE

Patch-Level Consistency Regularization in Self-Supervised Transfer Learning for Fine-Grained Image Recognition

Yejin Lee, Suho Lee and Sangheum Hwang

Abstract

Fine-grained image recognition aims to classify fine subcategories belonging to the same parent category, such as vehicle model or bird species classification. This is an inherently challenging task because a classifier must capture subtle interclass differences under large intraclass variances. Most previous approaches are based on supervised learning, which requires a large-scale labeled dataset. However, such large-scale annotated datasets for fine-grained image recognition are difficult to collect because they generally require domain expertise during the labeling process. In this study, we propose a self-supervised transfer learning method based on Vision Transformer (ViT) to learn finer representations without human annotations. Interestingly, it is observed that existing self-supervised learning methods using ViT (e.g., DINO) show poor patch-level semantic consistency, which may be detrimental to learning finer representations. Motivated by this observation, we propose a consistency loss function that encourages patch embeddings of the overlapping area between two augmented views to be similar to each other during self-supervised learning on fine-grained datasets. In addition, we explore effective transfer learning strategies to fully leverage existing self-supervised models trained on large-scale labeled datasets. Contrary to the previous literature, our findings indicate that training only the last block of ViT is effective for self-supervised transfer learning. We demonstrate the effectiveness of our proposed approach through extensive experiments using six fine-grained image classification benchmark datasets, including FGVC Aircraft, CUB-200-2011, Food-101, Oxford 102 Flowers, Stanford Cars, and Stanford Dogs. Under the linear evaluation protocol, our method achieves an average accuracy of 78.5%, outperforming the existing transfer learning method, which yields 77.2%.
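To make the patch-level consistency idea concrete, the following is a minimal sketch (PyTorch) of how such a regularizer could be implemented: patches of one view whose centers fall inside the region where the two crops overlap are matched to the patches of the other view covering the same image location, and the matched patch embeddings are pulled together with a cosine-similarity term. This is an illustration inferred from the abstract only, not the authors' released code; the function name patch_consistency_loss, the 14x14 patch grid, the crop-box convention, and the nearest-patch matching rule are all assumptions.

# Hypothetical sketch of a patch-level consistency loss between two augmented
# views of the same image; not the paper's implementation.
import torch
import torch.nn.functional as F

def patch_consistency_loss(z1, z2, box1, box2, grid=14):
    """Encourage patch embeddings of the overlapping area to agree.

    z1, z2 : (N, D) patch embeddings of view 1 / view 2, with N = grid * grid
    box1, box2 : (x0, y0, x1, y1) crop boxes of each view in original-image coordinates
    """
    device = z1.device
    # Centers of each patch within its own crop, expressed in [0, 1]
    coords = (torch.arange(grid, device=device, dtype=torch.float32) + 0.5) / grid
    ys, xs = torch.meshgrid(coords, coords, indexing="ij")

    def to_image(box):
        # Map patch centers from crop coordinates to original-image coordinates
        x0, y0, x1, y1 = box
        return x0 + xs * (x1 - x0), y0 + ys * (y1 - y0)

    px1, py1 = to_image(box1)
    # Overlapping rectangle of the two crops
    ox0, oy0 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ox1, oy1 = min(box1[2], box2[2]), min(box1[3], box2[3])
    if ox1 <= ox0 or oy1 <= oy0:
        return z1.sum() * 0.0  # no overlap: nothing to regularize

    # View-1 patches whose centers lie inside the overlap
    inside = ((px1 >= ox0) & (px1 <= ox1) & (py1 >= oy0) & (py1 <= oy1)).flatten()
    if not inside.any():
        return z1.sum() * 0.0

    # Match each selected view-1 patch center to the view-2 patch covering it
    u = (px1.flatten()[inside] - box2[0]) / (box2[2] - box2[0])
    v = (py1.flatten()[inside] - box2[1]) / (box2[3] - box2[1])
    col = (u * grid).long().clamp(0, grid - 1)
    row = (v * grid).long().clamp(0, grid - 1)
    idx2 = row * grid + col

    # Negative cosine similarity between matched patch embeddings
    return -F.cosine_similarity(z1[inside], z2[idx2], dim=-1).mean()

A term of this form only touches patches inside the overlapping area, so it could be added to a DINO-style image-level objective without changing how non-overlapping patches are handled.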

 Similar articles

       
 
Tanvir Islam and Peter Washington    
Stress is widely recognized as a major contributor to a variety of health issues. Stress prediction using biosignal data recorded by wearables is a key area of study in mobile sensing research because real-time stress prediction can enable digital interv...
Journal: Applied Sciences

 
Zhuo Wang, Haojie Chen, Hongde Qin and Qin Chen    
In the computer vision field, underwater object detection has been a challenging task. Due to the attenuation of light in a medium and the scattering of light by suspended particles in water, underwater optical images often face the problems of color dis...

 
Yubo Zheng, Yingying Luo, Hengyi Shao, Lin Zhang and Lei Li    
Contrastive learning, as an unsupervised technique, has emerged as a prominent method in time series representation learning tasks, serving as a viable solution to the scarcity of annotated data. However, the application of data augmentation methods duri...
Journal: Applied Sciences

 
Esmaeil Zahedi, Mohamad Saraee, Fatemeh Sadat Masoumi and Mohsen Yazdinejad    
Unsupervised anomalous sound detection, especially self-supervised methods, plays a crucial role in differentiating unknown abnormal sounds of machines from normal sounds. Self-supervised learning can be divided into two main categories: Generative and C...
Journal: Algorithms

 
Xueliang Wang, Nan Yang, Enjun Liu, Wencheng Gu, Jinglin Zhang, Shuo Zhao, Guijiang Sun and Jian Wang    
In order to solve the problem of manual labeling in semi-supervised tree species classification, this paper proposes a pixel-level self-supervised learning model named M-SSL (multisource self-supervised learning), which takes advantage of the informa...
Journal: Applied Sciences