  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:09:55
10 May 2022

In this work, we study how variability in speaking style and audio conditions between the spoken and singing voice affects speaker recognition performance. We also explore the utility of domain adaptation for bridging the gap between speaking styles (singing versus spoken) and improving overall speaker recognition performance. To that end, we first extend a publicly available singing voice dataset, JukeBox, with corresponding spoken voice data and refer to the result as JukeBox-V2. Next, we use domain adaptation to develop a speaker recognition method that is robust to varying speaking styles and audio conditions. Finally, we analyze the speech embeddings of the domain-adapted models to explain their generalizability across varying speaking styles and audio conditions.
