Accounting For Microprosody In Modeling Intonation
Peter Birkholz, Xinyu Zhang
SPS
Length: 15:57
Intonation models are often used for the generation of fundamental frequency (f0) contours in speech synthesis. Current intonation models only represent the intentional f0 components that are related to the phonological structure of the utterance. However, natural speech also contains non-intentional microvariations of f0, which are usually not accounted for. Here, we derived models for two forms of microvariations: the drop in f0 during voiced obstruents, and the increased f0 at the onset of vowels following voiceless obstruents. These models were applied to remove the microvariations of f0 in a database of natural speech before the f0 contours were reproduced with the Target Approximation Model. The previously removed microvariations were then superimposed on the modeled f0 contours. The resulting model f0 contours were significantly more similar to the original (natural) f0 contours than model contours that did not account for the microvariations. This approach might improve f0 modeling in future parametric speech synthesizers.
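The remove, model, superimpose sequence described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the exponentially decaying shape in `onset_raise`, its amplitude and time constant, the synthetic contour, and above all `fit_line`, which is a plain least-squares line standing in for the intonation model and is not the Target Approximation Model. The microvariation models derived in the talk are not reproduced here.

```python
import math

def onset_raise(times, onset, amplitude_st=1.5, tau_s=0.03):
    """Hypothetical microprosody shape (an assumption, not the talk's model):
    an f0 raise in semitones that decays exponentially after a vowel onset."""
    return [amplitude_st * math.exp(-(t - onset) / tau_s) if t >= onset else 0.0
            for t in times]

def fit_line(times, values):
    """Stand-in for the intonation model (NOT the Target Approximation Model):
    an ordinary least-squares straight line through the contour."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
             / sum((t - mean_t) ** 2 for t in times))
    return [mean_v + slope * (t - mean_t) for t in times]

def rmse(a, b):
    """Root-mean-square error between two equally long contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Synthetic "natural" contour: a falling line plus an onset raise at 0.1 s.
times = [i * 0.005 for i in range(61)]            # 0 s to 0.3 s in 5 ms steps
intentional = [90.0 - 10.0 * t for t in times]    # semitones, arbitrary reference
micro = onset_raise(times, onset=0.1)
natural = [f + m for f, m in zip(intentional, micro)]

# Step 1: remove the microvariation from the natural contour.
cleaned = [f - m for f, m in zip(natural, micro)]
# Step 2: reproduce the cleaned contour with the (stand-in) intonation model.
modeled = fit_line(times, cleaned)
# Step 3: superimpose the previously removed microvariation.
restored = [f + m for f, m in zip(modeled, micro)]

# Baseline: model the natural contour directly, ignoring microprosody.
baseline = fit_line(times, natural)

rmse_baseline = rmse(baseline, natural)
rmse_restored = rmse(restored, natural)
print(f"RMSE without microprosody handling: {rmse_baseline:.3f} st")
print(f"RMSE with microprosody superimposed: {rmse_restored:.3f} st")
```

In this toy setting the microvariation is known exactly, so the restored contour matches the synthetic original almost perfectly. With real speech the microvariation has to be estimated, so the gain would be smaller; the sketch only shows the order of the steps, not the size of the effect reported in the talk.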