
Recognition of Cross-Language Acoustic Emotional Valence Using Stacked Ensemble Learning

Kudakwashe Zvarevashe and Oludayo O. Olugbara    

Abstract

Most studies on speech emotion recognition have used single-language corpora, and little research has been done on cross-language valence speech emotion recognition. Research has shown that models developed for single-language speech recognition systems perform poorly when used in different environments. Cross-language speech recognition is an appealing alternative, but it is highly challenging because the corpora used are recorded in different environments and under varying conditions. Differences in the quality of recording devices, elicitation techniques, languages, and accents of speakers make the recognition task even more arduous. In this paper, we propose a stacked ensemble learning algorithm to recognize valence emotion in a cross-language speech environment. The proposed ensemble algorithm is developed from random decision forest, AdaBoost, logistic regression, and gradient boosting machine and is therefore called RALOG. In addition, we apply feature scaling and random forest recursive feature elimination as a feature selection algorithm to boost the performance of RALOG. The proposed algorithm was evaluated against four widely used ensemble algorithms to appraise its performance. Five benchmark corpora were amalgamated into a cross-language corpus to validate the performance of RALOG trained with the selected acoustic features. The comparative analysis shows that RALOG performed better than the other ensemble learning algorithms investigated in this study.
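As an illustration of the architecture the abstract describes, the following is a minimal Python sketch of a RALOG-style pipeline assembled with scikit-learn: feature scaling, random forest recursive feature elimination, and a stacked ensemble of the four named base learners. The choice of logistic regression as the meta-learner, the number of retained features, the placeholder data, and all hyperparameter values are assumptions made for this sketch, not details taken from the paper.

# Minimal sketch of a RALOG-style stacked ensemble (assumptions noted below).
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    AdaBoostClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Base learners named in the abstract: random decision forest, AdaBoost,
# logistic regression, and gradient boosting machine.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("ada", AdaBoostClassifier(random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("gbm", GradientBoostingClassifier(random_state=0)),
]

# Stacked ensemble: out-of-fold predictions of the base learners are fed to a
# meta-learner. Logistic regression as final estimator is an assumed choice.
ralog = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Feature scaling followed by random forest recursive feature elimination,
# then the stacked ensemble. Retaining 50 features is an assumed setting.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rf_rfe", RFE(estimator=RandomForestClassifier(n_estimators=50, random_state=0),
                   n_features_to_select=50)),
    ("ralog", ralog),
])

# Placeholder data standing in for acoustic features and valence labels;
# in the paper these come from the amalgamated cross-language corpus.
X, y = make_classification(n_samples=400, n_features=120, n_informative=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipeline.fit(X_train, y_train)
print("held-out accuracy:", pipeline.score(X_test, y_test))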

 Similar articles

 
Lin Xu, Shanxiu Ma, Zhiyuan Shen, Shiyu Huang and Ying Nan    
In order to determine the fatigue state of air traffic controllers from air talk, an algorithm is proposed for discriminating the fatigue state of controllers based on applying multi-speech feature fusion to voice data using a Fuzzy Support Vector Machin... see more
Journal: Aerospace

 
Lizhen Jia, Yanyan Xu and Dengfeng Ke    
Recent speech enhancement studies have mostly focused on completely separating noise from human voices. Due to the lack of specific structures for harmonic fitting in previous studies and the limitations of the traditional convolutional receptive field, ... see more
Journal: Applied Sciences

 
Varsha S. Lalapura, Veerender Reddy Bhimavarapu, J. Amudha and Hariram Selvamurugan Satheesh    
Recurrent Neural Networks (RNNs) are an essential class of supervised learning algorithms. Complex tasks like speech recognition, machine translation, sentiment classification, weather prediction, etc., are now performed by well-trained RNNs. Local o... see more
Journal: Algorithms

 
Mohammed Saïd Kasttet, Abdelouahid Lyhyaoui, Douae Zbakh, Adil Aramja and Abderazzek Kachkari    
Recently, artificial intelligence and data science have witnessed dramatic progress and rapid growth, especially Automatic Speech Recognition (ASR) technology based on Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs). Consequently, new end-to-... see more
Journal: Aerospace

 
Dan Ungureanu, Stefan-Adrian Toma, Ion-Dorinel Filip, Bogdan-Costel Mocanu, Iulian Aciobaniței, Bogdan Marghescu, Titus Balan, Mihai Dascalu, Ion Bica and Florin Pop
The evolution of Natural Language Processing technologies has transformed them into viable choices for various accessibility features and for facilitating interactions between humans and computers. A subset of them consists of speech processing systems, such... see more
Journal: Applied Sciences