Applied Sciences, Vol. 9, No. 15 (2019)
TITLE

Hierarchical Semantic Loss and Confidence Estimator for Visual-Semantic Embedding-Based Zero-Shot Learning

Sanghyun Seo and Juntae Kim    

Abstract

Traditional supervised learning depends on the labels in the training data, so a class whose label does not appear in the training data cannot be recognized properly. Zero-shot learning, which can recognize unseen classes that were not used in training, is therefore gaining research interest. One approach to zero-shot learning embeds visual data, such as images, and rich semantic data related to the text labels of that visual data into a common vector space, and then performs zero-shot cross-modal retrieval on newly input unseen-class data. This paper proposes a hierarchical semantic loss and a confidence estimator to perform zero-shot learning on visual data more efficiently. The hierarchical semantic loss improves learning efficiency by using hierarchical knowledge when selecting the negative sample for the triplet loss, and the confidence estimator estimates a confidence score to determine whether an input belongs to a seen class or an unseen class. These methods improve zero-shot learning performance by adjusting the distances from a semantic vector to a visual vector during zero-shot cross-modal retrieval. Experimental results show that the proposed method can improve the performance of zero-shot learning in terms of hit@k accuracy.
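
The abstract names three ingredients: a triplet loss, hierarchy-guided negative sampling, and hit@k evaluation. The sketch below illustrates one possible reading of them in plain Python. It is not the authors' implementation: the sibling-based negative selection rule, the Euclidean distance, the margin value, the toy class hierarchy, and all function and variable names are assumptions made for illustration, and the confidence estimator is not covered.

```python
import math
import random


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge triplet loss: max(0, margin + d(a, p) - d(a, n))."""
    return max(0.0, margin + euclidean(anchor, positive) - euclidean(anchor, negative))


def sample_negative(label, parent_of, rng):
    """Assumed rule: prefer a negative class that shares the anchor's parent
    in the hierarchy (a sibling); fall back to any other class."""
    siblings = [c for c, p in parent_of.items() if p == parent_of[label] and c != label]
    pool = siblings or [c for c in parent_of if c != label]
    return rng.choice(pool)


def hit_at_k(visual_vec, semantic_vecs, true_label, k):
    """True if the correct class is among the k semantic vectors nearest to the visual vector."""
    ranked = sorted(semantic_vecs, key=lambda c: euclidean(visual_vec, semantic_vecs[c]))
    return true_label in ranked[:k]


# Toy data (hypothetical): a two-level hierarchy and 2-D embeddings.
parent_of = {"cat": "mammal", "dog": "mammal", "sparrow": "bird"}
semantic_vecs = {"cat": [1.0, 0.0], "dog": [0.8, 0.2], "sparrow": [0.0, 1.0]}
image_vec = [0.9, 0.05]  # stand-in for the visual embedding of a cat image

rng = random.Random(0)
neg = sample_negative("cat", parent_of, rng)  # "dog" is the only sibling of "cat"
loss = triplet_loss(image_vec, semantic_vecs["cat"], semantic_vecs[neg])
print(neg, round(loss, 4), hit_at_k(image_vec, semantic_vecs, "cat", k=1))
```

Under this assumed rule, a sibling class is a harder negative than a class from a distant branch, so the loss stays positive even when the visual vector is already closest to the correct label. That is one way hierarchical knowledge could make each triplet more informative.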

Similar articles

Fulin Han, Liang Huo, Tao Shen, Xiaoyong Zhang, Tianjia Zhang and Na Ma    
In the study of 3D route scene construction, the expression of key targets needs to be highlighted. This is because compared with the 3D model, the abstract 3D symbols can reflect the number and spatial distribution characteristics of entities more intui...
Journal: Applied Sciences

 
Wirapong Chansanam, Kanyarat Kwiecien, Marut Buranarach and Kulthida Tuamsuk    
This research was aimed at constructing a thesaurus of the ethnic groups in the Mekong River Basin that is a compilation of controlled vocabularies of both Thai and English language, with a digital platform that enables semantic search and linked open da...
Journal: Informatics

 
Shurong Sheng, Katrien Laenen, Luc Van Gool and Marie-Francine Moens    
In this paper, we target the tasks of fine-grained image-text alignment and cross-modal retrieval in the cultural heritage domain as follows: (1) given an image fragment of an artwork, we retrieve the noun phrases that describe it; (2) given a noun phras...
Journal: Computers

 
Adam Wawrzynski and Julian Szymanski    
To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such ...
Journal: Applied Sciences

 
Stanislav Mikoni     pp. 21-26
When constructing a hierarchical structure of indicators of a complex object, the problem of establishing links between generalized and particular indicators is solved. The problem of the ambiguity of the interpretation of the concepts corresponding to t...