Applied Sciences, Vol. 13, No. 2 (2023)
ARTICLE
TITLE

Chinese Lip-Reading Research Based on ShuffleNet and CBAM

Yixian Fu, Yuanyao Lu and Ran Ni

Abstract

Lip reading has attracted increasing attention recently due to advances in deep learning. However, most research targets English datasets, and the study of Chinese lip-reading technology is still in its initial stage. First, in this paper, we expand "Databox", the naturally distributed word-level Chinese dataset previously built by our laboratory. Second, the current state-of-the-art model consists of a residual network and a temporal convolutional network; the residual network incurs excessive computational cost and is not suitable for on-device applications. In the new model, the residual network is replaced with ShuffleNet, an extremely computation-efficient convolutional neural network (CNN) architecture. Third, to help the network focus on the most useful information, we insert a simple but effective attention module, the Convolutional Block Attention Module (CBAM), into ShuffleNet. In our experiments, we compare several model architectures and find that our model achieves accuracy comparable to that of the residual network (3.5 GFLOPs) under a computational budget of 1.01 GFLOPs.
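
To put the two compute figures quoted in the abstract side by side, the short sketch below works out the ratio and the relative reduction in GFLOPs. The function name `compute_savings` and the constant names are illustrative assumptions, not taken from the paper; only the values 3.5 and 1.01 come from the abstract.

```python
from typing import Tuple

# GFLOPs figures quoted in the abstract (constant names are illustrative).
RESNET_GFLOPS = 3.5
SHUFFLENET_CBAM_GFLOPS = 1.01


def compute_savings(baseline: float, proposed: float) -> Tuple[float, float]:
    """Return (baseline/proposed ratio, fractional reduction) for two FLOP counts."""
    ratio = baseline / proposed
    reduction = 1.0 - proposed / baseline
    return ratio, reduction


ratio, reduction = compute_savings(RESNET_GFLOPS, SHUFFLENET_CBAM_GFLOPS)
# Prints: ratio 3.47, reduction 71.1%
print(f"ratio {ratio:.2f}, reduction {reduction:.1%}")
```

This is arithmetic on the reported budgets only: the proposed model uses roughly 3.5 times fewer floating-point operations than the residual network, which need not translate into the same speed-up in on-device latency.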

Similar articles

Yan Wang, Nan Guan, Jie Li and Xiaoli Wang    
Fourier ptychographic microscopy (FPM) is a computational imaging technology that has endless vitality and application potential in digital pathology. Colored pathological image analysis is the foundation of clinical diagnosis, basic research, and most b...
Journal: Applied Sciences

 
Hexin Lu, Xiaodong Zhu, Jingwei Cui and Haifeng Jiang    
Recognition performance in iris recognition can decline when the resolution of the iris images is insufficient. In this study, a super-resolution model for iris images, namely SwinGIris, which combines the Swin Transformer and ...
Journal: Algorithms

 
Zongshun Wang, Ce Li, Jialin Ma, Zhiqiang Feng and Limei Xiao    
In this study, we introduce a novel framework for the semantic segmentation of point clouds in autonomous driving scenarios, termed PVI-Net. This framework uniquely integrates three different data perspectives (point clouds, voxels, and distance maps) exec...
Journal: Information

 
Sakorn Mekruksavanich and Anuchit Jitpattanakul    
Smartphones have become ubiquitous, allowing people to perform various tasks anytime and anywhere. As technology continues to advance, smartphones can now sense and connect to networks, providing context-awareness for different applications. Many individ...
Journal: Information

 
Nyo Me Htun, Toshiaki Owari, Satoshi Tsuyuki and Takuya Hiroshima    
High-value timber species with economic and ecological importance are usually distributed at very low densities, such that accurate knowledge of the location of these trees within a forest is critical for forest management practices. Recent technological...
Journal: Algorithms