Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning

Mohammad Ali Humayun

Ibrahim A. Hameed

Syed Muslim Shah

Sohaib Hassan Khan

Irfan Zafar

Saad Bin Ahmed and Junaid Shuja

Abstract

Automatic Speech Recognition (ASR) has achieved its best results for English, using end-to-end neural network based supervised models. These supervised models need large amounts of labeled speech data to generalize well, which is difficult to obtain for low-resource languages like Urdu. Most models proposed for Urdu ASR are based on Hidden Markov Models (HMMs). This paper proposes an end-to-end neural network model for Urdu ASR, regularized with dropout, ensemble averaging and Maxout units. Dropout and ensembles are averaging techniques over multiple neural network models, while Maxout units are neural network units that adapt their activation functions. Because labeled data is limited, Semi-Supervised Learning (SSL) techniques are also incorporated to improve model generalization. Speech features are transformed into a lower-dimensional manifold using an unsupervised dimensionality-reduction technique called Locally Linear Embedding (LLE). The transformed data, along with the higher-dimensional features, is used to train the neural networks. The proposed model also uses label propagation-based self-training of the initially trained models and achieves a Word Error Rate (WER) 4% lower than the HMM-based benchmark reported on the same Urdu corpus. The decrease in WER after incorporating SSL is more significant with an increased validation data size.
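
As an illustration of what it means for a Maxout unit to adapt its activation function, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the standard Maxout formulation, in which each unit outputs the maximum over several affine functions of its input; the function name `maxout` and the nested-list parameter layout are hypothetical choices made for this example.

```python
from typing import List, Sequence


def maxout(
    x: Sequence[float],
    weights: List[List[List[float]]],
    biases: List[List[float]],
) -> List[float]:
    """Forward pass of a Maxout layer (illustrative sketch).

    weights[j][i] is the weight vector of the i-th affine piece of unit j,
    and biases[j][i] is the matching bias. Each unit returns the largest
    of its affine pieces, so the shape of its activation is learned.
    """
    outputs = []
    for unit_weights, unit_biases in zip(weights, biases):
        # Evaluate every affine piece w . x + b for this unit.
        pieces = [
            sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
            for w, b in zip(unit_weights, unit_biases)
        ]
        # The unit's activation is the maximum over its pieces.
        outputs.append(max(pieces))
    return outputs


# One unit with pieces (x, -x) behaves like |x|.
abs_like = maxout([-2.0], [[[1.0], [-1.0]]], [[0.0, 0.0]])

# One unit with pieces (x, 0) behaves like ReLU.
relu_like = maxout([-2.0], [[[1.0], [0.0]]], [[0.0, 0.0]])
```

With different weights for its pieces, the same unit can reproduce an absolute-value or a ReLU-shaped response, which is the sense in which the activation function is learned rather than fixed.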

Keywords

speech recognition - locally linear embedding - label propagation - Maxout - low resource languages

Issue

Volume: 9 Issue: 9 (2019)

DOI

https://doi.org/10.3390/app9091956

Art�culos similares

Underwater Terrain Matching Method Based on Pulse-Coupled Neural Network for Unmanned Underwater Vehicles

Acceso

Pengyun Chen, Zhiru Li, Guangqing Liu, Ziyi Wang, Jiayu Chen, Shangyao Shi, Jian Shen and Lizhou Li

The positioning results of terrain matching in flat terrain areas will significantly deteriorate due to the influence of terrain nonlinearity and multibeam measurement noise. To tackle this problem, this study presents the Pulse-Coupled Neural Network (P... ver m�s