Applied Sciences, Vol. 10, Issue 19 (2020)

MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

Kostadin Mishev
Aleksandra Karovska Ristovska
Dimitar Trajanov
Tome Eftimov
Monika Simjanoska

Abstract

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer based on the Deep Learning approach. The paper reviews the numerous earlier attempts to achieve human-like, reproducible speech, which have unfortunately proven unsuccessful because the work was not visible and lacked examples of integration with real software tools. Recent advances in Machine Learning, specifically Deep Learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully-convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model directly synthesizes Macedonian speech from characters. We created a dataset that contains approximately 20 h of speech from a native Macedonian female speaker and used it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model appropriate for application in any kind of software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.
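
For readers unfamiliar with the metric, a mean opinion score is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below shows that calculation together with an approximate 95% confidence interval. It is only an illustration: the function name `mean_opinion_score`, the normal-approximation interval, and the sample ratings are assumptions, not the evaluation protocol or data behind the reported 3.93.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    The interval uses a normal approximation (z = 1.96), which is an
    assumption of this sketch and is rough for small listener panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for illustration only; not data from the paper.
ratings = [4, 4, 3, 5, 4, 3, 4, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

In practice each synthesized utterance is rated by several listeners, and the scores are pooled before averaging.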
