Applied Sciences, Vol. 10, Issue 19 (2020)

MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

Kostadin Mishev, Aleksandra Karovska Ristovska, Dimitar Trajanov, Tome Eftimov and Monika Simjanoska

Abstract

This paper presents MAKEDONKA, the first open-source speech synthesizer for the Macedonian language that is based on deep learning. It reviews the numerous earlier attempts to produce human-like, reproducible speech, which unfortunately proved unsuccessful because the work had little visibility and lacked examples of integration with real software tools. Recent advances in machine learning, specifically deep learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model synthesizes Macedonian speech directly from characters. We created a dataset containing approximately 20 h of speech from a native Macedonian female speaker and used it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model suitable for any software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.
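
For readers unfamiliar with the metric, a mean opinion score is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name `mean_opinion_score`, the normal-approximation confidence interval, and the sample ratings are all assumptions made for illustration; none of them come from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation 95% confidence interval half-width (an assumption;
    # the paper's abstract does not say how uncertainty was estimated).
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for illustration only; these are not data from the paper.
example_ratings = [4, 4, 3, 5, 4, 3, 4, 4]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

In practice each listener would rate many synthesized utterances, and the scores would be pooled before averaging.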
