Home  /  Applied Sciences  /  Vol. 12, Part 12 (2022)  /  Article
ARTICLE

Developing a Speech Recognition System for Recognizing Tonal Speech Signals Using a Convolutional Neural Network

Sakshi Dua    
Sethuraman Sambath Kumar    
Yasser Albagory    
Rajakumar Ramalingam    
Ankur Dumka    
Rajesh Singh    
Mamoon Rashid    
Anita Gehlot    
Sultan S. Alshamrani and Ahmed Saeed AlGhamdi    

Abstract

Deep learning-based machine learning models have shown significant results in speech recognition and numerous vision-related tasks. The performance of the speech-to-text model presented here depends on the hyperparameters chosen in this work. We show that convolutional neural networks (CNNs) can model raw and tonal speech signals, with performance on par with existing recognition systems. This study extends the CNN-based approach to robust and uncommon (tonal) speech signals, using a database designed specifically for the target research. The main objective of this work was to develop a speech-to-text recognition system that recognizes the tonal speech signals of Gurbani hymns using a CNN. The CNN model, with six layers of 2D convolution and 2D max pooling and a 256-unit dense layer (built with Google's TensorFlow), was used in this work, together with Praat for speech segmentation. Feature extraction was performed using the MFCC technique, which captures standard speech features as well as features of the background music. Our study reveals that the CNN-based method for identifying tonal speech sentences, augmented with instrumental knowledge, performs better than existing conventional approaches. The experimental results demonstrate the strong performance of the proposed CNN architecture, achieving an 89.15% accuracy rate and a 10.56% word error rate (WER) on continuous, large-vocabulary sentences of speech signals with different tones.
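The abstract describes the architecture only at a high level: stacked 2D convolution and max-pooling layers over MFCC features, followed by a 256-unit dense layer in TensorFlow. The authors' actual model is not reproduced here; the following is a minimal NumPy sketch of a single convolution-plus-pooling stage applied to a hypothetical MFCC matrix, only to illustrate the kind of feature map such a network computes. All shapes (13 coefficients × 40 frames, a 3×3 filter) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """'Valid' 2D convolution (no padding), like a single Conv2D filter."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2d(x, size=2):
    """Non-overlapping 2D max pooling, like MaxPooling2D(pool_size=2)."""
    oh, ow = x.shape[0] // size, x.shape[1] // size
    return x[:oh * size, :ow * size].reshape(oh, size, ow, size).max(axis=(1, 3))

# Hypothetical MFCC matrix: 13 coefficients x 40 frames (not the paper's data).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((13, 40))
kernel = rng.standard_normal((3, 3))  # one 3x3 filter; random here, learned in practice

# ReLU after convolution, then pooling halves each spatial dimension.
feature_map = maxpool2d(np.maximum(conv2d_valid(mfcc, kernel), 0.0))
print(feature_map.shape)  # (5, 19): ((13-3+1)//2, (40-3+1)//2)
```

In the paper's setup, six such convolution/pooling stages would feed the 256-unit dense layer; a real implementation would use learned filters via layers such as TensorFlow's `tf.keras.layers.Conv2D` and `MaxPooling2D` rather than hand-rolled loops.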

 Similar articles

Li-Chia Chang and Jeih-Weih Hung    
This study proposes a novel robust speech feature extraction technique to improve speech recognition performance in noisy environments. The method exploits the information provided by the original acoustic model in the automatic speech recognition... see more

 
Cui Dewen, Matsufuji Akihiro, Liu Yi, Eri Sato-Shimokawa, Toru Yamaguchi     pp. 338-350
In human-robot interaction, human mental states in dialogue have attracted attention to human-friendly robots that support educational use. Although estimating mental states using speech and visual information has been conducted, it is still challenging ... see more

 
Amira Dhouib, Achraf Othman, Oussama El Ghoul, Mohamed Koutheair Khribi and Aisha Al Sinani    
Automatic Speech Recognition (ASR), also known as Speech-To-Text (STT) or computer speech recognition, has been an active field of research recently. This study aims to chart this field by performing a Systematic Literature Review (SLR) to give insight i... see more
Journal: Applied Sciences

 
William Lawless    
We review the progress in developing a science of interdependence applied to the determinations and perceptions of risk for autonomous human-machine systems based on a case study of the Department of Defense's (DoD) faulty determination of risk in a dron... see more
Journal: Informatics

 
Ali Aroudi, Eghart Fischer, Maja Serman, Henning Puder and Simon Doclo    
Recent advances have shown that it is possible to identify the target speaker which a listener is attending to using single-trial EEG-based auditory attention decoding (AAD). Most AAD methods have been investigated for an open-loop scenario, where AAD is... see more
Journal: Algorithms