Prediction Of Voicing And The F0 Contour From Electromagnetic Articulography Data For Articulation-To-Speech Synthesis
Simon Stone, Philipp Schmidt, Peter Birkholz
SPS
Length: 14:37
Articulation-to-speech synthesis based solely on supraglottal articulation requires some form of intonation control. This paper examines to what extent the f0 contour of an utterance can be predicted from such supraglottal articulation data. To that end, three groups of machine learning models (support vector machines, kernel ridge regression, and neural networks) were trained and evaluated on the mngu0 speech corpus, which contains synchronous articulatory and audio data. The best voiced/unvoiced/silence classification rates were achieved by a deep neural network with two hidden layers: 85.8 % with no look-ahead (important for on-line applications) and 86 % with a look-ahead of 50 ms. The best f0 prediction model without look-ahead, a neural network with one hidden layer, achieved a root-mean-square error (RMSE) of 10.4 Hz with respect to the original f0 contours, while the best prediction with a look-ahead of 50 ms was attained by kernel ridge regression with an RMSE of 10.3 Hz. The predicted f0 contours were also evaluated subjectively in a listening test, for which the f0 of the original speech files was manipulated using PRAAT. The results are consistent with the objective evaluation.
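As a rough illustration of the kernel-ridge f0-prediction setup described above, the sketch below fits a closed-form RBF kernel ridge regressor to synthetic stand-in data and reports the RMSE in Hz. The actual paper trains on EMA features from the mngu0 corpus; the feature dimensionality, kernel, hyperparameters, and data here are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise RBF kernel: exp(-gamma * ||a - b||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge_fit(X, y, alpha=1.0, gamma=0.1):
    # Closed-form dual solution: (K + alpha*I)^-1 y
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, dual, X_new, gamma=0.1):
    return rbf_kernel(X_new, X_train, gamma) @ dual

# Hypothetical stand-in data: 300 frames of 12 "articulatory" features
# and a synthetic f0 target in Hz (the real model uses EMA trajectories).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))
w = rng.normal(size=12)
f0 = 120 + 20 * np.tanh(X @ w)

# Split along the time axis into train and held-out frames.
X_tr, X_te = X[:250], X[250:]
y_tr, y_te = f0[:250], f0[250:]

dual = kernel_ridge_fit(X_tr, y_tr)
pred = kernel_ridge_predict(X_tr, dual, X_te)

# RMSE in Hz, the objective measure used in the paper.
rmse = np.sqrt(np.mean((pred - y_te) ** 2))
print(f"RMSE: {rmse:.2f} Hz")
```

On real data, the look-ahead variants reported above would simply append future articulatory frames (e.g. +50 ms) to each input feature vector before fitting.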