Applied Sciences, Vol. 13, No. 18 (2023)
ARTICLE

Patch-Level Consistency Regularization in Self-Supervised Transfer Learning for Fine-Grained Image Recognition

Yejin Lee, Suho Lee and Sangheum Hwang

Abstract

Fine-grained image recognition aims to classify fine subcategories belonging to the same parent category, such as vehicle model or bird species classification. This is an inherently challenging task because a classifier must capture subtle interclass differences under large intraclass variances. Most previous approaches are based on supervised learning, which requires a large-scale labeled dataset. However, such large-scale annotated datasets for fine-grained image recognition are difficult to collect because they generally require domain expertise during the labeling process. In this study, we propose a self-supervised transfer learning method based on Vision Transformer (ViT) to learn finer representations without human annotations. Interestingly, it is observed that existing self-supervised learning methods using ViT (e.g., DINO) show poor patch-level semantic consistency, which may be detrimental to learning finer representations. Motivated by this observation, we propose a consistency loss function that encourages patch embeddings of the overlapping area between two augmented views to be similar to each other during self-supervised learning on fine-grained datasets. In addition, we explore effective transfer learning strategies to fully leverage existing self-supervised models trained on large-scale labeled datasets. Contrary to the previous literature, our findings indicate that training only the last block of ViT is effective for self-supervised transfer learning. We demonstrate the effectiveness of our proposed approach through extensive experiments using six fine-grained image classification benchmark datasets, including FGVC Aircraft, CUB-200-2011, Food-101, Oxford 102 Flowers, Stanford Cars, and Stanford Dogs. Under the linear evaluation protocol, our method achieves an average accuracy of 78.5%, outperforming the existing transfer learning method, which yields 77.2%.
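To make the patch-level consistency idea concrete, the following is a minimal sketch (PyTorch) of how such a regularizer could be implemented: patches of one view whose centers fall inside the region where the two crops overlap are matched to the patches of the other view covering the same image location, and the matched patch embeddings are pulled together with a cosine-similarity term. This is an illustration inferred from the abstract only, not the authors' released code; the function name patch_consistency_loss, the 14x14 patch grid, the crop-box convention, and the nearest-patch matching rule are all assumptions.

# Hypothetical sketch of a patch-level consistency loss between two augmented
# views of the same image; not the paper's implementation.
import torch
import torch.nn.functional as F

def patch_consistency_loss(z1, z2, box1, box2, grid=14):
    """Encourage patch embeddings of the overlapping area to agree.

    z1, z2 : (N, D) patch embeddings of view 1 / view 2, with N = grid * grid
    box1, box2 : (x0, y0, x1, y1) crop boxes of each view in original-image coordinates
    """
    device = z1.device
    # Centers of each patch within its own crop, expressed in [0, 1]
    coords = (torch.arange(grid, device=device, dtype=torch.float32) + 0.5) / grid
    ys, xs = torch.meshgrid(coords, coords, indexing="ij")

    def to_image(box):
        # Map patch centers from crop coordinates to original-image coordinates
        x0, y0, x1, y1 = box
        return x0 + xs * (x1 - x0), y0 + ys * (y1 - y0)

    px1, py1 = to_image(box1)
    # Overlapping rectangle of the two crops
    ox0, oy0 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ox1, oy1 = min(box1[2], box2[2]), min(box1[3], box2[3])
    if ox1 <= ox0 or oy1 <= oy0:
        return z1.sum() * 0.0  # no overlap: nothing to regularize

    # View-1 patches whose centers lie inside the overlap
    inside = ((px1 >= ox0) & (px1 <= ox1) & (py1 >= oy0) & (py1 <= oy1)).flatten()
    if not inside.any():
        return z1.sum() * 0.0

    # Match each selected view-1 patch center to the view-2 patch covering it
    u = (px1.flatten()[inside] - box2[0]) / (box2[2] - box2[0])
    v = (py1.flatten()[inside] - box2[1]) / (box2[3] - box2[1])
    col = (u * grid).long().clamp(0, grid - 1)
    row = (v * grid).long().clamp(0, grid - 1)
    idx2 = row * grid + col

    # Negative cosine similarity between matched patch embeddings
    return -F.cosine_similarity(z1[inside], z2[idx2], dim=-1).mean()

A term of this form only touches patches inside the overlapping area, so it could be added to a DINO-style image-level objective without changing how non-overlapping patches are handled.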

 Similar articles

       
 
Tanvir Islam and Peter Washington    
Stress is widely recognized as a major contributor to a variety of health issues. Stress prediction using biosignal data recorded by wearables is a key area of study in mobile sensing research because real-time stress prediction can enable digital interv...
Journal: Applied Sciences

 
Zhuo Wang, Haojie Chen, Hongde Qin and Qin Chen    
In the computer vision field, underwater object detection has been a challenging task. Due to the attenuation of light in a medium and the scattering of light by suspended particles in water, underwater optical images often face the problems of color dis...

 
Yubo Zheng, Yingying Luo, Hengyi Shao, Lin Zhang and Lei Li    
Contrastive learning, as an unsupervised technique, has emerged as a prominent method in time series representation learning tasks, serving as a viable solution to the scarcity of annotated data. However, the application of data augmentation methods duri...
Journal: Applied Sciences

 
Esmaeil Zahedi, Mohamad Saraee, Fatemeh Sadat Masoumi and Mohsen Yazdinejad    
Unsupervised anomalous sound detection, especially self-supervised methods, plays a crucial role in differentiating unknown abnormal sounds of machines from normal sounds. Self-supervised learning can be divided into two main categories: Generative and C...
Journal: Algorithms

 
Xueliang Wang, Nan Yang, Enjun Liu, Wencheng Gu, Jinglin Zhang, Shuo Zhao, Guijiang Sun and Jian Wang    
In order to solve the problem of manual labeling in semi-supervised tree species classification, this paper proposes a pixel-level self-supervised learning model named M-SSL (multisource self-supervised learning), which takes advantage of the informa...
Journal: Applied Sciences