Journal: Algorithms, Vol. 16, No. 2 (2023)
ARTICLE
TITLE

Nemesis: Neural Mean Teacher Learning-Based Emotion-Centric Speaker

Aryan Yousefi and Kalpdrum Passi    

Abstract

Image captioning is the multi-modal task of automatically describing a digital image based on its contents and their semantic relationships. This research area has gained popularity over the past few years; however, most previous studies have focused on purely objective, content-based descriptions of image scenes. In this study, we aim to generate more engaging captions by leveraging human-like emotional responses. To this end, a mean teacher learning-based method is applied to the recently introduced ArtEmis dataset, the first large-scale dataset for emotion-centric image captioning, which contains 455K emotional descriptions of 80K artworks from WikiArt. The method sets up a self-distillation relationship between memory-augmented language models with meshed connectivity. These language models are first trained in a cross-entropy phase and then fine-tuned in a self-critical sequence training phase. According to popular natural language processing metrics, such as BLEU, METEOR, ROUGE-L, and CIDEr, the proposed model obtains a new state of the art on ArtEmis.
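The mean teacher idea named in the abstract can be illustrated with a minimal sketch. In the generic formulation of mean teacher learning, the teacher's weights are an exponential moving average (EMA) of the student's weights, and the student is trained to match the teacher's predictions. The snippet below shows only that weight update on plain Python lists. The function name `ema_update`, the dictionary layout, and the `alpha` value are assumptions made for illustration; this is not the authors' implementation, and the abstract does not state their hyperparameters.

```python
def ema_update(teacher, student, alpha=0.999):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping a parameter name to a list of floats
    (a hypothetical stand-in for real model tensors).
    alpha: smoothing factor; values near 1 make the teacher change slowly.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [
            alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


# Toy example: with alpha = 0.5 the teacher moves halfway toward the student.
teacher = {"w": [1.0, 0.0]}
student = {"w": [0.0, 1.0]}
teacher = ema_update(teacher, student, alpha=0.5)
print(teacher)  # {'w': [0.5, 0.5]}
```

In practice the update would be applied to every parameter tensor after each student optimization step, typically with `alpha` close to 1 so that the teacher is a smoothed version of the student's recent history.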

Similar articles

       
 
Junting Wang, Tianhe Xu, Wei Huang, Liping Zhang, Jianxu Shu, Yangfan Liu and Linyang Li    
Underwater sound speed is one of the most significant factors that affects high-accuracy underwater acoustic positioning and navigation. Due to its complex temporal variation, the forecasting of the underwater sound speed field (SSF) becomes a challengin...

 
Yi Lu, Dongyan Wei and Hong Yuan    
Magnetic positioning is a promising technique for vehicles in Global Navigation Satellite System (GNSS)-denied scenarios. Traditional magnetic positioning methods resolve the position coordinates by calculating the similarity between the measured sequenc...
Journal: Applied Sciences

 
Jingxiong Lei, Xuzhi Liu, Haolang Yang, Zeyu Zeng and Jun Feng    
High-resolution remote sensing images (HRRSI) have important theoretical and practical value in urban planning. However, current segmentation methods often struggle with issues like blurred edges and loss of detailed information due to the intricate back...
Journal: Applied Sciences

 
Syed As-Sadeq Tahfim and Yan Chen    
Severe and fatal crashes involving large trucks result in significant social and economic losses for human society. Unfortunately, the notably low proportion of severe and fatal injury crashes involving large trucks creates an imbalance in crash data. Mo...
Journal: Information

 
Chi Han, Wei Xiong and Ronghuan Yu    
Mega-constellation network traffic forecasting provides key information for routing and resource allocation, which is of great significance to the performance of satellite networks. However, due to the self-similarity and long-range dependence (LRD) of m...
Journal: Aerospace