JOURNAL
Applied Sciences

ARTICLE

TITLE

Learn and Tell: Learning Priors for Image Caption Generation

Pei Liu, Dezhong Peng and Ming Zhang

Abstract

In this work, we propose a novel priors-based attention neural network (PANN) for image captioning. It incorporates two kinds of priors into the visual information extraction process at each word prediction: the probabilities of being mentioned for local region proposals (PBM priors) and part-of-speech clues for caption words (POS priors). The work is motivated by two intuitions: region proposals differ in their inherent probability of being mentioned in a caption, and POS clues link the word class (part-of-speech tag) to the categories of visual features. We propose new methods to extract both priors. The PBM priors are obtained by computing the similarities between the caption feature vector and the local feature vectors, while the POS priors are predicted at each step of word generation from the hidden state of the decoder. Both priors are then fed into the PANN module of the decoder to help it extract more accurate visual information for the current word. In our experiments, we qualitatively analyzed the proposed approach and quantitatively evaluated several captioning schemes with our PANN on the MS-COCO dataset. The experimental results show that the proposed method achieves better performance and demonstrate the effectiveness of the proposed network for image captioning.
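The abstract does not give the formulas behind the priors, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a PBM prior can be approximated by rescaling the cosine similarity between a caption feature vector and each region feature vector to [0, 1], and that a prior can be fused into attention by adding its logarithm to the attention scores before the softmax. The function names (`pbm_priors`, `prior_weighted_attention`, `attend`) and the toy vectors are hypothetical, and the POS priors are not modeled here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pbm_priors(caption_vec, region_vecs):
    """Assumed form of a PBM prior: cosine similarity rescaled to [0, 1]."""
    return [0.5 * (cosine(caption_vec, r) + 1.0) for r in region_vecs]

def prior_weighted_attention(attn_scores, priors, eps=1e-8):
    """Assumed fusion rule: add log-priors to the attention scores, then softmax."""
    logits = [s + math.log(p + eps) for s, p in zip(attn_scores, priors)]
    return softmax(logits)

def attend(region_vecs, weights):
    """Weighted sum of region features, i.e. the attended visual context."""
    dim = len(region_vecs[0])
    return [sum(w * r[d] for w, r in zip(weights, region_vecs)) for d in range(dim)]

# Toy data (hypothetical): one caption vector and three region proposals.
caption_vec = [1.0, 0.0]
region_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
attn_scores = [0.0, 0.0, 0.0]  # uniform decoder attention before the prior

priors = pbm_priors(caption_vec, region_vecs)
weights = prior_weighted_attention(attn_scores, priors)
context = attend(region_vecs, weights)
print(priors, weights, context)
```

With uniform attention scores, the weights are driven entirely by the priors, so the region most similar to the caption vector receives the largest share of attention. A learned model would presumably replace both the similarity measure and the fusion rule with trained components.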

Keywords

image captioning - image understanding - probability-being-mentioned prior - part-of-speech prior

ISSUE

Volume: 10 Issue: 19 (2020)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/app10196942
