Prosody is Not Identity: A Speaker Anonymization Approach Using Prosody Cloning

Sarina Meyer (University of Stuttgart); Florian Lux (University of Stuttgart); Julia Koch (University of Stuttgart); Pavel Denisov (University of Stuttgart); Pascal Tilli (University of Stuttgart); Ngoc Thang Vu (University of Stuttgart)

07 Jun 2023

Prosody is closely linked to the identity of a speaker, leading to individual pitch and intonation patterns. Therefore, it is challenging in speaker anonymization to generate speech utterances that both keep the original audio's main prosodic structure and preserve the speaker's privacy. In this paper, we present a system that extends a speech-to-text-to-speech anonymization pipeline with prosody cloning and show how to control the cloning by multiplying pitch and energy sequences with random offset values. Using automatic and human evaluation, we find this combination to successfully overcome the privacy-utility trade-off for prosody by achieving high privacy and high pitch correlation scores. At the same time, the anonymized utterances prove to reproduce the original voice distinctiveness and content with high intelligibility and only a small loss in naturalness, making them suitable for downstream applications.
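The control mechanism mentioned in the abstract, multiplying pitch and energy sequences with random values, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform sampling, the per-value factors, the range of 0.6 to 1.4, and the example numbers are all assumptions made for illustration. The `pearson` helper stands in for the kind of pitch correlation score the abstract refers to.

```python
import random


def randomize_prosody(values, low=0.6, high=1.4, seed=None):
    """Scale each value by its own random factor drawn uniformly from [low, high].

    The range and the uniform distribution are assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    return [v * rng.uniform(low, high) for v in values]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical pitch (Hz) and energy values for one utterance.
pitch = [110.0, 125.0, 140.0, 132.0, 118.0, 105.0, 98.0]
energy = [0.42, 0.55, 0.61, 0.58, 0.47, 0.40, 0.33]

# Perturb both sequences before they would be passed to a synthesizer.
anon_pitch = randomize_prosody(pitch, seed=0)
anon_energy = randomize_prosody(energy, seed=1)

# Compare the perturbed contour against the original one.
print("pitch correlation:", pearson(pitch, anon_pitch))
```

A wider range moves the contour further from the original speaker's values, while a narrower range keeps it closer; in this sketch that range is the knob that trades prosody preservation against privacy.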
