Information  /  Vol: 13 Par: 10 (2022)  /  Article
ARTICLE
TITLE

Empirical Comparison between Deep and Classical Classifiers for Speaker Verification in Emotional Talking Environments

Ali Bou Nassif    
Ismail Shahin    
Mohammed Lataifeh    
Ashraf Elnagar and Nawel Nemmour    

Abstract

Speech signals carry various kinds of information about the speaker, such as age, gender, accent, language, health, and emotions. Emotions are conveyed through modulations of facial and vocal expressions. This paper presents an empirical comparison of the classical classifiers Gaussian Mixture Model (GMM), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Artificial Neural Network (ANN) with the deep learning classifiers Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU), in addition to the i-vector approach, for a text-independent speaker verification task in neutral and emotional talking environments. The deep models undergo hyperparameter tuning using the Grid Search optimization algorithm. The models are trained and tested on a private Arabic Emirati Speech Database, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and the public Crowd-Sourced Emotional Multimodal Actors (CREMA) database. Performance is evaluated using Equal Error Rate (EER) and Area Under the Curve (AUC) scores. The experimental results show that deep architectures do not necessarily outperform classical classifiers. Among the classical classifiers, the GMM yields the lowest EER values and the best AUC scores across all datasets. The i-vector model surpasses all the fine-tuned deep models (CNN, LSTM, and GRU) on both evaluation metrics in neutral as well as emotional speech. In addition, the GMM outperforms the i-vector model on the Emirati and RAVDESS databases.
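As a rough illustration of the two evaluation metrics named in the abstract, the sketch below computes EER and AUC from verification scores. It is not the authors' code: the function names, the "accept when score >= threshold" convention, and the toy scores are all assumptions made for the example, and the abstract does not state how the paper computes either metric.

```python
import numpy as np

def error_rates(genuine, impostor):
    """Sweep every observed score as a threshold; return (FAR, FRR) arrays.

    Assumed convention: a trial is accepted when score >= threshold.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # False acceptance rate: fraction of impostor trials accepted.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # False rejection rate: fraction of genuine trials rejected.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    return far, frr

def equal_error_rate(genuine, impostor):
    """Approximate the EER as the point where FAR and FRR are closest."""
    far, frr = error_rates(genuine, impostor)
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)

def area_under_curve(genuine, impostor):
    """AUC as the probability that a genuine score beats an impostor score.

    Ties count as one half (the Mann-Whitney formulation of ROC AUC).
    """
    g = np.asarray(genuine, dtype=float)[:, None]
    m = np.asarray(impostor, dtype=float)[None, :]
    return float((g > m).mean() + 0.5 * (g == m).mean())

# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
auc = area_under_curve(genuine_scores, impostor_scores)
```

A lower EER and a higher AUC both indicate better separation between genuine and impostor trials, which is the sense in which the abstract ranks the classifiers.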

 Similar articles

       
 
Zeqin Tian, Dengfeng Chen and Liang Zhao    
Accurate building energy consumption prediction is a crucial condition for the sustainable development of building energy management systems. However, the highly nonlinear nature of data and complex influencing factors in the energy consumption of large ...
Journal: Applied Sciences

 
Parisa Mahya and Johannes Fürnkranz    
Recently, some effort went into explaining intransparent and black-box models, such as deep neural networks or random forests. So-called model-agnostic methods typically approximate the prediction of the intransparent black-box model with an interpretabl...
Journal: AI

 
Vadim Palchikovskiy, Aleksandr Kuznetsov, Igor Khramtsov and Oleg Kustov    
A comparison is considered of the experimentally obtained impedance of locally reacting acoustic liner samples with the impedance calculated using semi-empirical Goodrich, Sobolev and Eversman models. The semi-empirical impedance models are outlined. In ...
Journal: Acoustics

 
Shanshan Chen, Sheng Guan, Hui Wang, Ningqi Ye and Zexun Wei    
Ship type identification is an important basis for ship management and monitoring. The paper proposed a new method of ship type identification by combining characteristic parameters from the energy difference between high and low frequencies and the sens...

 
Fotios Bosmos, Alexandros T. Tzallas, Markos G. Tsipouras, Evripidis Glavas and Nikolaos Giannakeas    
The aim of this work is to highlight the possibilities of using VR applications in the informal learning process. This is attempted through the development of virtual reality cultural applications for historical monuments. For this purpose, the theoretic...
Journal: Information