Accounting For Microprosody In Modeling Intonation

Peter Birkholz, Xinyu Zhang

Length: 15:57

04 May 2020

Intonation models are often used for the generation of fundamental frequency (f0) contours in speech synthesis. Current intonation models only represent the intentional f0 components that are related to the phonological structure of the utterance. However, natural speech also contains non-intentional microvariations of f0, which are usually not accounted for. Here, we derived models for two forms of microvariations: the drop in f0 during voiced obstruents, and the increased f0 at the onset of vowels following voiceless obstruents. These models were applied to remove the microvariations of f0 in a database of natural speech before the f0 contours were reproduced with the Target Approximation Model. The previously removed microvariations were then superimposed on the modeled f0 contours. The resulting model f0 contours were significantly more similar to the original (natural) f0 contours than model contours that did not account for the microvariations. This approach might improve f0 modeling in future parametric speech synthesizers.
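The remove, model, and superimpose steps described above can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration and is not taken from the presentation: the exponentially decaying f0 raise after a vowel onset (`onset_raise`, with hypothetical `amplitude` and `tau` values), the toy contour, and the least-squares straight line (`fit_line`) that stands in for the Target Approximation Model fit.

```python
import math

def onset_raise(times, onset, amplitude=1.5, tau=0.03):
    """Hypothetical microprosody term: an f0 raise (in semitones) that
    decays exponentially after a vowel onset. Illustrative assumption only."""
    return [amplitude * math.exp(-(t - onset) / tau) if t >= onset else 0.0
            for t in times]

def fit_line(times, values):
    """Least-squares straight line. This is a simple stand-in for a
    Target Approximation Model fit, not the model itself."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return [mean_v + slope * (t - mean_t) for t in times]

def rmse(a, b):
    """Root-mean-square error between two equally long contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Synthetic "natural" contour: a slow intentional fall plus a microvariation.
times = [i * 0.005 for i in range(61)]            # 0 to 0.3 s in 5 ms steps
intentional = [90.0 - 10.0 * t for t in times]    # semitones, arbitrary reference
micro = onset_raise(times, onset=0.1)
natural = [i + m for i, m in zip(intentional, micro)]

# Baseline: fit the smooth model directly to the raw contour.
baseline = fit_line(times, natural)

# Proposed order of steps: remove the microvariation, fit, then superimpose it.
cleaned = [f - m for f, m in zip(natural, micro)]
with_micro = [s + m for s, m in zip(fit_line(times, cleaned), micro)]

err_baseline = rmse(baseline, natural)
err_with_micro = rmse(with_micro, natural)
```

In this toy case the contour is constructed so that the stand-in model fits the cleaned contour almost exactly, so `err_with_micro` is smaller than `err_baseline` by design. The sketch only shows the order of the steps and says nothing about how large the effect is on natural speech.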

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020
