Applied Sciences, Vol. 9, Issue 10 (2019)

Enhanced Automatic Speech Recognition System Based on Enhancing Power-Normalized Cepstral Coefficients

Mohamed Tamazin, Ahmed Gouda and Mohamed Khedr

Abstract

Many new consumer applications rely on automatic speech recognition (ASR) systems, including voice command interfaces, speech-to-text applications, and data entry processes. Although ASR systems have improved remarkably in recent decades, recognition performance still degrades significantly in noisy environments. Developing a robust ASR system that works under real-world noise and other acoustic distortions is therefore an attractive research topic. Many advanced algorithms have been proposed in the literature to deal with this problem; most of them model how the human auditory system perceives noisy speech. In this research, the power-normalized cepstral coefficient (PNCC) system is modified to increase its robustness against different types of environmental noise: a new technique based on gammatone channel filtering combined with channel bias minimization is used to suppress the noise effects. The TIDIGITS database is used to evaluate the proposed system against state-of-the-art techniques in the presence of additive white Gaussian noise (AWGN) and seven different types of environmental noise. In this research, one word is recognized from a set containing only 11 possibilities. The experimental results show that the proposed method provides significant improvements in recognition accuracy at low signal-to-noise ratios (SNRs). For subway noise at SNR = 5 dB, the proposed method outperforms the mel-frequency cepstral coefficient (MFCC) and relative spectral (RASTA)-perceptual linear predictive (PLP) methods by 55% and 47%, respectively. Moreover, the recognition rate of the proposed method is higher than that of the gammatone frequency cepstral coefficient (GFCC) and PNCC methods in the case of car noise: it improves on the GFCC method by 40% at SNR = 0 dB and on the PNCC method by 20% at SNR = -5 dB.
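The SNR conditions quoted in the abstract can be made concrete with a small sketch of how AWGN is commonly scaled to a target SNR. This is not the authors' code: the function names, the 440 Hz test tone standing in for a speech utterance, and the 8 kHz sample rate are all assumptions made for illustration. The only relation it relies on is the standard definition SNR(dB) = 10 log10(P_signal / P_noise).

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def add_awgn(samples, snr_db, rng=None):
    """Return samples plus white Gaussian noise scaled to the requested SNR (dB)."""
    if rng is None:
        rng = random.Random()
    # SNR_dB = 10 * log10(P_signal / P_noise)
    # => P_noise = P_signal / 10 ** (SNR_dB / 10)
    noise_power = signal_power(samples) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)  # std. dev. of zero-mean Gaussian noise
    return [s + rng.gauss(0.0, sigma) for s in samples]


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean reference."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


# Hypothetical stand-in for an utterance: 1 s of a 440 Hz tone at 8 kHz.
sample_rate = 8000
clean = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
         for n in range(sample_rate)]

# Corrupt the clean signal at the 5 dB condition mentioned in the abstract.
noisy = add_awgn(clean, snr_db=5.0, rng=random.Random(0))
print(round(measured_snr_db(clean, noisy), 1))
```

The measured value should land close to the requested 5 dB, with a small deviation because the noise power is estimated from a finite number of samples. Environmental noises such as subway or car noise would be mixed in the same way, with a recorded noise segment rescaled to the target power instead of synthetic Gaussian samples.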

Similar articles

Ruoyan Shi, Jianpeng Hu and Bo Lin    
Automatic program repair techniques based on deep neural networks have attracted widespread attention from researchers due to the high degree of automation and generality. However, there is a scarcity of high-quality labeled datasets available for traini...
Journal: Applied Sciences

 
Yu Ning, Yong-Ping Jin, You-Duo Peng and Jian Yan    
Efficient underwater visual environment perception is the key to realizing the autonomous operation of underwater robots. Because of the complex and diverse underwater environment, the underwater images not only have different degrees of color cast but a...

 
Zhaojin Yan, Guanghao Yang, Rong He, Hui Yang, Hui Ci and Ran Wang    
Automatic identification systems (AIS) provide massive ship trajectory data for maritime traffic management, route planning, and other research. In order to explore the valuable ship traffic characteristics contained implicitly in massive AIS data, a sh...

 
Álvaro Rodríguez-Sanz and Luis Rubio Andrada    
An important and challenging question for airport operators is the management of airport capacity and demand. Airport capacity depends on the available infrastructure, external factors, and operating procedures. Investments in Air Traffic Management (ATM...
Journal: Aerospace

 
Sara Zollini, Donatella Dominici, Maria Alicandro, María Cuevas-González, Eduard Angelats, Francesca Ribas and Gonzalo Simarro    
Coastal environments are dynamic ecosystems, constantly subject to erosion/accretion processes. Erosional trends have unfortunately been intensifying for decades due to anthropic factors and an accelerated sea level rise might exacerbate the problem. It ...