Es-Tacotron2: Multi-Task Tacotron 2 with Pre-Trained Estimated Network for Reducing the Over-Smoothness Problem

Yifan Liu and Jin Zheng

Abstract

Text-to-speech synthesis is a computational technique for producing synthetic, human-like speech. In recent years, speech synthesis techniques have advanced and have been employed in many applications, such as automatic translation and car navigation systems. End-to-end text-to-speech synthesis has gained considerable research interest because, compared to traditional models, the end-to-end model is easier to design and more robust. Tacotron 2 is an integrated state-of-the-art end-to-end speech synthesis system that can directly predict close-to-natural human speech from raw text. However, there remains a gap between synthesized speech and natural speech. Suffering from an over-smoothness problem, Tacotron 2 produces "averaged" speech, making the synthesized speech sound unnatural and inflexible. In this work, we first propose an estimated network (Es-Network), which captures general features from a raw mel spectrogram in an unsupervised manner. Then, we design Es-Tacotron2 by employing the Es-Network to calculate the estimated mel spectrogram residual and setting it as an additional prediction task of Tacotron 2, to allow the model to focus more on predicting the individual features of the mel spectrogram. Experiments show that, compared to the original Tacotron 2 model, Es-Tacotron2 can produce more variable decoder output and synthesize more natural and expressive speech.
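The multi-task idea in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulation, so everything here is an assumption made for illustration: the residual is taken as the target mel spectrogram minus the estimated one, the two tasks are combined as a weighted sum of mean squared errors, and `frame_mean_estimator` is a hypothetical stand-in for the pre-trained Es-Network, not the authors' model.

```python
from typing import Callable, List

Mel = List[List[float]]  # mel spectrogram as frames x mel bins


def mse(a: Mel, b: Mel) -> float:
    """Mean squared error over all frames and bins."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def residual(mel: Mel, estimated: Mel) -> Mel:
    """Assumed residual: target mel minus the estimated (general) mel."""
    return [[x - e for x, e in zip(row_m, row_e)]
            for row_m, row_e in zip(mel, estimated)]


def frame_mean_estimator(mel: Mel) -> Mel:
    """Hypothetical stand-in for the Es-Network.

    Replaces every frame with the per-bin mean over time, a crude
    notion of the 'general features' of the spectrogram.
    """
    n_frames = len(mel)
    means = [sum(col) / n_frames for col in zip(*mel)]
    return [list(means) for _ in mel]


def multi_task_loss(target: Mel, pred_mel: Mel, pred_residual: Mel,
                    estimator: Callable[[Mel], Mel],
                    residual_weight: float = 1.0) -> float:
    """Assumed combined loss: mel task plus weighted residual task."""
    target_residual = residual(target, estimator(target))
    return mse(pred_mel, target) + residual_weight * mse(pred_residual, target_residual)


if __name__ == "__main__":
    target = [[1.0, 2.0], [3.0, 4.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # Perfect mel prediction, but the residual head predicts nothing:
    print(multi_task_loss(target, target, zeros, frame_mean_estimator))
```

In this sketch, a model that predicts the mel spectrogram well but ignores the residual still incurs a loss, which is one way to read the abstract's claim that the extra task pushes the model toward the individual features of each utterance.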

Keywords

speech synthesis - over-smoothness problem - estimated network - multi-task learning - end-to-end

Issue

Volume: 10, Part: 4 (2019)

Subjects

Civil Engineering and Construction
Technology

Similar journals

Applied Sciences
Aerospace
Algorithms

DOI

https://doi.org/10.3390/info10040131

Similar articles

Evaluation of Tacotron Based Synthesizers for Spanish and Basque

Víctor García, Inma Hernáez and Eva Navas

In this paper, we describe the implementation and evaluation of text-to-speech synthesizers based on neural networks for Spanish and Basque. Several voices were built, all of them using a limited amount of data. The system applies Tacotron 2 to compute m...

Journal: Applied Sciences

A Reinforcement Learning Approach to Speech Coding

Jerry Gibson and Hoontaek Oh

Speech coding is an essential technology for digital cellular communications, voice over IP, and video conferencing systems. For more than 25 years, the main approach to speech coding for these applications has been block-based analysis-by-synthesis line...

Journal: Information

Analysis and Assessment of Controllability of an Expressive Deep Learning-Based TTS System

Noé Tits, Kevin El Haddad and Thierry Dutoit

In this paper, we study the controllability of an expressive TTS system trained on a dataset for continuous control. The dataset is the Blizzard 2013 dataset, based on audiobooks read by a female speaker and containing great variability in styles and expr...

Journal: Informatics

Application of sinusoidal speech modeling to the sound diarization problem

Bulat Nutfullin, Eugene Ilyushin, pp. 14 - 20

Speech is a specific feature of humans and their advantage over other species in evolution. Sound diarization is the process of separating sound according to the speaker it belongs to. Before the advent of deep learning and the ...

Journal: International Journal of Open Information Technologies

Gated Recurrent Attention for Multi-Style Speech Synthesis

Sung Jun Cheon, Joun Yeop Lee, Byoung Jin Choi, Hyeonseung Lee and Nam Soo Kim

End-to-end neural network-based speech synthesis techniques have been developed to represent and synthesize speech in various prosodic styles. Although the end-to-end techniques enable the transfer of a style with a single vector of style representation, ...

Journal: Applied Sciences
