Employing Robust Principal Component Analysis for Noise-Robust Speech Feature Extraction in Automatic Speech Recognition with the Structure of a Deep Neural Network

Jeih-weih Hung

Jung-Shan Lin and Po-Jen Wu

Abstract

In recent decades, researchers have focused on developing noise-robust methods that compensate for noise effects in automatic speech recognition (ASR) systems and enhance their performance. In this paper, we propose a feature-based noise-robust method that employs a novel data analysis technique, robust principal component analysis (RPCA). In the proposed scenario, RPCA is employed to process a noise-corrupted speech feature matrix, and the obtained sparse partition is shown to reveal speech-dominant characteristics. One apparent advantage of using RPCA for enhancing noise robustness is that no prior knowledge about the noise is required. The proposed RPCA-based method is evaluated on the Aurora-4 database and task, using a state-of-the-art deep neural network (DNN) architecture for the acoustic models. The evaluation results indicate that the proposed method significantly improves the recognition accuracy obtained with the original speech features, and that it can be cascaded with mean normalization (MN), mean and variance normalization (MVN), and relative spectral (RASTA) processing, three well-known and widely used feature robustness algorithms, to achieve better performance than either component method alone.
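The abstract does not say which RPCA solver the authors use, so the following is only a minimal sketch of one common formulation, principal component pursuit solved with an augmented Lagrangian iteration. It splits a feature matrix into a low-rank part and a sparse part, the latter being the "sparse partition" the abstract refers to. The function names (`shrink`, `svd_shrink`, `rpca`, `mvn`), the default values of `lam` and `mu` (conventional choices from the RPCA literature), and the matrix layout (one row per feature channel, one column per frame) are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def shrink(x, tau):
    """Element-wise soft thresholding (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_shrink(x, tau):
    """Soft-threshold the singular values (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt


def rpca(m, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split m into low_rank + sparse via principal component pursuit.

    Assumed defaults (not from the paper):
      lam = 1 / sqrt(max(rows, cols))
      mu  = rows * cols / (4 * sum(|m|))
    """
    m = np.asarray(m, dtype=float)
    rows, cols = m.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(rows, cols))
    if mu is None:
        mu = rows * cols / (4.0 * max(np.abs(m).sum(), 1e-12))
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    for _ in range(max_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low_rank = svd_shrink(m - sparse + dual / mu, 1.0 / mu)
        sparse = shrink(m - low_rank + dual / mu, lam / mu)
        residual = m - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low_rank, sparse


def mvn(features):
    """Mean and variance normalization of each row over the frame axis.

    Assumes rows are feature channels and columns are frames.
    """
    mean = features.mean(axis=1, keepdims=True)
    std = features.std(axis=1, keepdims=True)
    return (features - mean) / np.maximum(std, 1e-8)
```

Under these assumptions, a cascade such as the RPCA-plus-MVN combination mentioned in the abstract could be approximated by calling `rpca` on the noisy feature matrix and then passing the returned sparse part to `mvn`. Whether the paper applies the steps in exactly this order, or to this matrix layout, is not stated in the abstract.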

Keywords

robust principal component analysis - noise robustness - filter-bank features - mel-frequency cepstral coefficients - deep neural network

ISSUE

Volume: 1, Issue: 3 (2018)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Algorithms
Information
Applied Sciences

DOI

https://doi.org/10.3390/asi1030028

Similar articles

A Novel Bi-Dual Inference Approach for Detecting Six-Element Emotions

Xiaoping Huang, Yujian Zhou and Yajun Du

In recent years, there has been rapid development in machine learning for solving artificial intelligence tasks in various fields, including translation, speech, and image processing. These AI tasks are often interconnected rather than independent. One s...

Journal: Applied Sciences

Es-Tacotron2: Multi-Task Tacotron 2 with Pre-Trained Estimated Network for Reducing the Over-Smoothness Problem

Yifan Liu and Jin Zheng

Text-to-speech synthesis is a computational technique for producing synthetic, human-like speech by a computer. In recent years, speech synthesis techniques have developed, and have been employed in many applications, such as automatic translation applic...

Journal: Information

Employing Robust Principal Component Analysis for Noise-Robust Speech Feature Extraction in Automatic Speech Recognition with the Structure of a Deep Neural Network

Jeih-weih Hung, Jung-Shan Lin and Po-Jen Wu

Journal: Applied System Innovation
