Towards Data Selection On Tts Data For Children'S Speech Recognition

Wei Wang, Zhikai Zhou, Yizhou Lu, Hongji Wang, Chenpeng Du, Yanmin Qian

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:07:35

10 Jun 2021

Recent researches on both utterance-level and phone-level prosody modelling successfully improve the voice quality and naturalness in text-to-speech synthesis. However, most of them model the prosody with a unimodal distribution such like a single Gaussian, which is not reasonable enough. In this work, we focus on phone-level prosody modelling where we introduce a Gaussian mixture model(GMM) based mixture density network. Our experiments on the LJSpeech dataset demonstrate that GMM can better model the phone-level prosody than a single Gaussian. The subjective evaluations suggest that our method not only significantly improves the prosody diversity in synthetic speech without the need of manual control, but also achieves a better naturalness. We also find that using the additional mixture density network has only very limited influence on inference speed.

Chairs:

Abdelrahman Mohamed

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021

Towards Data Selection On Tts Data For Children'S Speech Recognition

Wei Wang, Zhikai Zhou, Yizhou Lu, Hongji Wang, Chenpeng Du, Yanmin Qian

Value-Added Bundle(s) Including this Product

ICASSP 2021 Virtual Conference - Presentation Videos Product Bundle

More Like This

Keynote: Navigating the Transition to Sustainable Energy Solutions in a Power-Hungry World

Panel: Leveraging Technology to Achieve Carbon Neutrality of Buildings and Factories

Panel: Charting the Course for Future-Ready Data Centers in the Era of Sustainability

Join the IEEE Signal Processing Society

Towards Data Selection On Tts Data For Children&#039;S Speech Recognition

Wei Wang, Zhikai Zhou, Yizhou Lu, Hongji Wang, Chenpeng Du, Yanmin Qian

Value-Added Bundle(s) Including this Product

ICASSP 2021 Virtual Conference - Presentation Videos Product Bundle

More Like This

Keynote: Navigating the Transition to Sustainable Energy Solutions in a Power-Hungry World

Panel: Leveraging Technology to Achieve Carbon Neutrality of Buildings and Factories

Panel: Charting the Course for Future-Ready Data Centers in the Era of Sustainability

Join the IEEE Signal Processing Society

Towards Data Selection On Tts Data For Children'S Speech Recognition