ARTICLE
TITLE

Review of existing text-to-speech algorithms

Nikita Kireev    
Eugene Ilyushin    

Abstract

Scientists have long worked on algorithms that convert natural-language text into speech. The quality of these algorithms left much to be desired, however, until deep learning methods became practical. Once the necessary computing resources became available and enough training data had accumulated, these methods came into wide use in machine learning in general and in speech synthesis in particular. The resulting improvement in the quality of text-to-speech algorithms has led to their widespread adoption in mobile devices, smart speakers, voice assistants, and similar products. The algorithms of this class developed so far still do not always handle the task correctly: for example, they cannot always place stress correctly or voice the relevant parts of the text with the appropriate intonation. The study of speech synthesis methods and tools has therefore become even more relevant. There are many approaches to synthesizing speech from text, such as parametric synthesis, compilation (concatenative) synthesis, subject-oriented synthesis, and full rule-based speech synthesis. The purpose of this work is to review existing text-to-speech algorithms and to compare them. The main algorithms considered are WaveNet, DeepVoice, Tacotron, DeepVoice 2, DeepVoice 3, and Tacotron 2. The comparison shows that the best at present are DeepVoice 3 and Tacotron 2, since their quality assessments are closest to those of professionally recorded speech.
