JOURNAL
Future Internet

ARTICLE

TITLE

DA-GAN: Dual Attention Generative Adversarial Network for Cross-Modal Retrieval

Liewu Cai, Lei Zhu, Hongyan Zhang and Xinghui Zhu

Abstract

Cross-modal retrieval aims to search for samples of one modality using queries from another modality, and it is an active research topic in the multimedia community. However, two main challenges, i.e., the heterogeneity gap and semantic interaction across different modalities, have not been solved effectively. Reducing the heterogeneity gap can improve cross-modal similarity measurement. Meanwhile, modeling cross-modal semantic interaction can capture semantic correlations more accurately. To this end, this paper presents a novel end-to-end framework called Dual Attention Generative Adversarial Network (DA-GAN). DA-GAN is an adversarial semantic representation model with a dual attention mechanism, i.e., intra-modal attention and inter-modal attention. Intra-modal attention focuses on the important semantic features within a modality, while inter-modal attention explores the semantic interaction between different modalities and thereby represents high-level semantic correlations more precisely. A dual adversarial learning strategy is designed to generate modality-invariant representations, which reduces cross-modal heterogeneity efficiently. Experiments on three commonly used benchmarks show that DA-GAN outperforms the competing methods.
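
The abstract does not give the exact form of the two attention modules, so the following is only a minimal sketch of the general idea, assuming plain scaled dot-product attention without learned projections. The function names (`attend`, `intra_modal_attention`, `inter_modal_attention`) and the toy feature values are hypothetical and are not taken from the paper. Intra-modal attention is modeled here as self-attention within one modality, and inter-modal attention as cross-attention in which features of one modality query features of the other.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys, values):
    """Scaled dot-product attention: each query becomes a weighted mix of values."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        output.append(mixed)
    return output


def intra_modal_attention(features):
    # Assumption: self-attention, so queries, keys and values all come
    # from the same modality.
    return attend(features, features, features)


def inter_modal_attention(source, target):
    # Assumption: cross-attention, so features of `source` query the
    # features of the other modality `target`.
    return attend(source, target, target)


# Hypothetical toy features: three image regions and two text tokens, 2-D each.
image_regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[0.5, 0.5], [1.0, 0.0]]

image_self = intra_modal_attention(image_regions)
image_cross = inter_modal_attention(image_regions, text_tokens)
```

In this sketch each output row is a convex combination of the attended features, so `image_cross` re-expresses every image region in terms of the text tokens. A trained model would normally add learned query, key and value projections, and the adversarial part of the framework is not covered here.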

Keywords

cross-modal retrieval - deep representation learning - generative adversarial network - intra-modal attention - inter-modal attention

ISSUE

Volume: 14 Issue: 2 (2022)

SUBJECTS

INFRASTRUCTURE

DOI

https://doi.org/10.3390/fi14020043

Similar articles

Extracting Spatio-Temporal Information from Chinese Archaeological Site Text

Wenjing Yuan, Lin Yang, Qing Yang, Yehua Sheng and Ziyang Wang

Archaeological site text is the main carrier of archaeological data at present, which contains rich information. How to efficiently extract useful knowledge from the massive unstructured archaeological site texts is of great significance for the mining a...

Journal: ISPRS International Journal of Geo-Information

IAGC: Interactive Attention Graph Convolution Network for Semantic Segmentation of Point Clouds in Building Indoor Environment

Ruoming Zhai, Jingui Zou, Yifeng He and Liyuan Meng

Point-based networks have been widely used in the semantic segmentation of point clouds owing to the powerful 3D convolution neural network (CNN) baseline. Most of the current methods resort to intermediate regular representations for reorganizing the st...

Journal: ISPRS International Journal of Geo-Information

Combining Deep Fully Convolutional Network and Graph Convolutional Neural Network for the Extraction of Buildings from Aerial Images

Wenzhuo Zhang, Mingyang Yu, Xiaoxian Chen, Fangliang Zhou, Jie Ren, Haiqing Xu and Shuai Xu

Deep learning technologies, such as fully convolutional networks (FCNs), have shown competitive performance in the automatic extraction of buildings from high-resolution aerial images (HRAIs). However, there are problems of over-segmentation and internal c...

Journal: Buildings

Detection of Schools in Remote Sensing Images Based on Attention-Guided Dense Network

Han Fu, Xiangtao Fan, Zhenzhen Yan and Xiaoping Du

The detection of primary and secondary schools (PSSs) is a meaningful task for composite object detection in remote sensing images (RSIs). As a typical composite object in RSIs, PSSs have diverse appearances with complex backgrounds, which makes it diffi...

Journal: ISPRS International Journal of Geo-Information

MFCNet: Mining Features Context Network for RGB-IR Person Re-Identification

Jing Mei, Huahu Xu, Yang Li, Minjie Bian and Yuzhe Huang

RGB-IR cross-modality person re-identification (RGB-IR Re-ID) is an important task for video surveillance in poorly illuminated or dark environments. In addition to the common challenge of Re-ID, the large cross-modality variations between RGB and IR ima...

Journal: Future Internet
