Computers, Vol. 8, No. 4 (2019)

An Investigation of a Feature-Level Fusion for Noisy Speech Emotion Recognition

Sara Sekkate, Mohammed Khalil, Abdellah Adib and Sofia Ben Jebara

Abstract

Because the choice of an effective feature representation is one of the key issues in improving the performance of Speech Emotion Recognition (SER) systems, most research has focused on feature-level fusion using a large set of features. In this study, we propose a relatively low-dimensional feature set that combines three kinds of features: baseline Mel Frequency Cepstral Coefficients (MFCCs), MFCCs derived from Discrete Wavelet Transform (DWT) sub-band coefficients, denoted DMFCC, and pitch-based features. The proposed feature extraction method is evaluated in clean conditions and in the presence of several real-world noises, and conventional Machine Learning (ML) and Deep Learning (DL) classifiers are employed for comparison. The proposal is tested on speech utterances from both the Berlin German Emotional Database (EMO-DB) and the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database through speaker-independent experiments. Experimental results show improved speech emotion detection over the baselines.
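As a rough illustration of what feature-level fusion means here, the sketch below concatenates per-utterance summaries of three frame-level feature streams into a single vector. It is not the authors' implementation: the abstract does not say how frames are aggregated, so summarizing each stream by its per-dimension mean and standard deviation is an assumption, and the function names `summarize` and `fuse` and the toy frame values are hypothetical. In practice the MFCC, DMFCC, and pitch frames would come from a signal-processing front end, which is omitted here.

```python
from statistics import mean, pstdev


def summarize(frames):
    """Collapse frame-level features (frames x dims) into one vector.

    Assumption: per-dimension mean followed by per-dimension population
    standard deviation; the abstract does not specify the statistics used.
    """
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    return [mean(d) for d in dims] + [pstdev(d) for d in dims]


def fuse(mfcc, dmfcc, pitch):
    """Feature-level fusion: concatenate the per-utterance summaries."""
    return summarize(mfcc) + summarize(dmfcc) + summarize(pitch)


# Toy, hand-written frames (hypothetical values, two frames per stream).
mfcc_frames = [[1.0, 2.0], [3.0, 4.0]]    # 2 MFCC dimensions
dmfcc_frames = [[0.5, 0.5], [1.5, 2.5]]   # 2 DMFCC dimensions
pitch_frames = [[100.0], [120.0]]         # 1 pitch dimension

vector = fuse(mfcc_frames, dmfcc_frames, pitch_frames)
print(len(vector))  # 10 = (2 + 2 + 1) dimensions x 2 statistics
```

The fused vector would then be passed to whichever classifier is being compared; because fusion happens before classification, the same vector can feed both conventional ML and DL models.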

Similar articles

Nikita Andriyanov    
The problem solved in the article is connected with the increase in the efficiency of phraseological radio exchange message recognition, which sometimes takes place in conditions of increased tension for the pilot. For high-quality recognition, signal pr...
Journal: Algorithms

 
Zhichao Peng, Wenhua He, Yongwei Li, Yegang Du and Jianwu Dang    
Speech emotion recognition is a critical component for achieving natural human-robot interaction. The modulation-filtered cochleagram is a feature based on auditory modulation perception, which contains multi-dimensional spectral-temporal modulation repr...
Journal: Applied Sciences

 
Musab T. S. Al-Kaltakchi, Ahmad Saeed Mohammad and Wai Lok Woo    
Speech separation is a well-known problem, especially when there is only one sound mixture available. Estimating the Ideal Binary Mask (IBM) is one solution to this problem. Recent research has focused on the supervised classification approach. The chall...
Journal: Information

 
Mi Tian, Shengfa Yang and Peng Zhang    
The acoustic method, which enables continuous monitoring with great temporal resolution, is an alternative technique for detecting bedload movement. In order to record the sound signals produced by the impacts between gravel particles and detect the bedl...
Journal: Water

 
Liangliang Cheng, Yunfeng Dou, Jian Zhou, Huabin Wang and Liang Tao    
Because of the acoustic characteristics of bone-conducted (BC) speech, BC speech can be enhanced to better communicate in a complex environment with high noise. Existing BC speech enhancement models have weak spectral recovery capability for the high-fre...
Journal: Algorithms