Applied Sciences, Vol. 13, No. 22 (2023)

Vision-Language Models for Zero-Shot Classification of Remote Sensing Images

Mohamad Mahmoud Al Rahhal, Yakoub Bazi, Hebah Elgibreen and Mansour Zuair

Abstract

Zero-shot classification is challenging because it requires a model to categorize images from classes it never encountered during training. Previous research in remote sensing (RS) has approached this task by training image-based models on known RS classes and then predicting labels for unseen classes, but the results have been unsatisfactory. In this paper, we propose an alternative approach that leverages vision-language models (VLMs), which are pre-trained to learn the associations between images and text from diverse general-purpose computer vision image-text datasets. Specifically, we investigate thirteen VLMs derived from Contrastive Language-Image Pre-Training (CLIP/Open-CLIP) with varying numbers of parameters. In our experiments, we identify the most suitable prompt for querying the language capabilities of the VLM on RS images. Furthermore, we show that zero-shot classification with these models, particularly the large CLIP models, achieves higher accuracy on three widely used RS scene datasets than existing RS solutions.
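
To make the idea concrete, the following is a minimal sketch of how CLIP-style zero-shot classification generally works: each class name is turned into a text prompt, and the image is assigned to the class whose prompt embedding is most similar to the image embedding. It is not the authors' implementation. The prompt template, the `logit_scale` value, the class names, and the toy three-dimensional embeddings are all illustrative assumptions; in practice the embeddings would come from the image and text encoders of a pre-trained VLM, and the abstract does not state which prompt performed best.

```python
import math

# Hypothetical prompt template (assumption): the abstract does not say
# which prompt the authors found most suitable for RS images.
PROMPT_TEMPLATE = "a satellite photo of the {}"


def build_prompts(class_names, template=PROMPT_TEMPLATE):
    """Turn each class name into a text prompt for the text encoder."""
    return [template.format(name) for name in class_names]


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_emb, text_embs, class_names, logit_scale=100.0):
    """Pick the class whose prompt embedding is closest to the image embedding.

    logit_scale is an assumed temperature that sharpens the softmax;
    it does not change which class has the highest score.
    """
    img = l2_normalize(image_emb)
    sims = [sum(a * b for a, b in zip(img, l2_normalize(t))) for t in text_embs]
    probs = softmax([logit_scale * s for s in sims])
    best = max(range(len(class_names)), key=lambda i: probs[i])
    return class_names[best], probs


# Toy embeddings standing in for real encoder outputs (assumption).
class_names = ["airport", "forest", "harbor"]
prompts = build_prompts(class_names)
text_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
image_emb = [0.2, 0.9, 0.1]

label, probs = zero_shot_classify(image_emb, text_embs, class_names)
print(prompts[class_names.index(label)], "->", label)
```

With these toy vectors the image embedding lies closest to the second prompt embedding, so the predicted label is "forest". No RS-specific training is involved: changing the set of candidate classes only requires changing the list of class names and re-encoding the prompts.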
