Future Internet, Vol. 11, No. 2 (2019)
ARTICLE
TITLE

3D-CNN-Based Fused Feature Maps with LSTM Applied to Action Recognition

Sheeraz Arif, Jing Wang, Tehseen Ul Hassan and Zesong Fei

Abstract

Human activity recognition is an active field of research in computer vision with numerous applications. Recently, deep convolutional networks and recurrent neural networks (RNNs) have received increasing attention in multimedia studies and have yielded state-of-the-art results. In this work, we propose a new framework that combines 3D-CNN and LSTM networks. First, we integrate discriminative information from a video into a map called a "motion map" by using a deep 3-dimensional convolutional network (C3D). A motion map and the next video frame can be integrated into a new motion map, and the network that performs this step can be trained by iteratively increasing the training video length; the final network can then be used to generate the motion map of the whole video. Next, a linear weighted fusion scheme is used to fuse the network feature maps into spatio-temporal features. Finally, we use a Long Short-Term Memory (LSTM) encoder-decoder for final predictions. This method is simple to implement and retains discriminative and dynamic information. The improved results on public benchmark datasets demonstrate the effectiveness and practicability of the proposed method.
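
The two mechanical steps named in the abstract, folding frames into a motion map one at a time and fusing feature maps with linear weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the `blend` function is a hypothetical placeholder for the trained C3D network, initialising the motion map with the first frame is an assumption, and normalising the fusion weights to sum to 1 is likewise assumed rather than taken from the abstract. All names are illustrative.

```python
from typing import Callable, List, Sequence

# A 2D grid of values; a hypothetical stand-in for a frame or a feature map.
Grid = List[List[float]]


def blend(motion_map: Grid, frame: Grid, alpha: float = 0.5) -> Grid:
    """Placeholder for the learned integration step (NOT the paper's C3D network)."""
    return [
        [(1.0 - alpha) * m + alpha * f for m, f in zip(m_row, f_row)]
        for m_row, f_row in zip(motion_map, frame)
    ]


def build_motion_map(
    frames: Sequence[Grid],
    integrate: Callable[[Grid, Grid], Grid] = blend,
) -> Grid:
    """Fold a video into one map: M_1 = frame_1 (assumed), M_t = integrate(M_{t-1}, frame_t)."""
    if not frames:
        raise ValueError("need at least one frame")
    motion_map = frames[0]
    for frame in frames[1:]:
        motion_map = integrate(motion_map, frame)
    return motion_map


def weighted_fusion(maps: Sequence[Grid], weights: Sequence[float]) -> Grid:
    """Linear weighted fusion: sum_i w_i * F_i, with weights normalised to sum to 1 (assumed)."""
    if not maps or len(maps) != len(weights):
        raise ValueError("need one weight per feature map")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(norm, maps)) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    video = [[[0.0, 0.0]], [[1.0, 1.0]], [[1.0, 1.0]]]  # three toy 1x2 "frames"
    print(build_motion_map(video))
    print(weighted_fusion([[[1.0, 2.0]], [[3.0, 4.0]]], [1.0, 3.0]))
```

In a real system the `integrate` callable would be the trained network, and the fused features would then be passed to the LSTM encoder-decoder for prediction; both are outside the scope of this sketch.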
