Unsupervised Voice Type Discrimination Score Adaptation Using X-vector Clusters

Mark R Lindsey (Carnegie Mellon University); Tyler Vuong (Carnegie Mellon University); Richard M Stern (Carnegie Mellon University)

07 Jun 2023

Voice type discrimination (VTD) is the task of automatically detecting speech produced in the same room as a recording device ("live speech") among other speech and non-speech noises, such as traffic noises or radio broadcasts ("distractor audio"). Existing work has described methods for performing the VTD task. This paper presents a method for adapting the output of these existing methods in an unsupervised manner via x-vector clustering and correlation. This adaptation method can be applied to the output of any VTD algorithm, requires no additional training data, and has been shown to yield a relative decrease in decision cost function (DCF) score of up to 47% on a standardized database collected for the task.
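
To make the "relative decrease in DCF" figure concrete, the arithmetic can be sketched as follows. This is a minimal illustration, not the paper's evaluation code. It assumes the common weighted-sum form of a detection cost function, and the weights (`p_target`, `c_miss`, `c_fa`) and the error rates below are hypothetical values chosen only to show the calculation.

```python
def detection_cost(p_miss, p_fa, p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Weighted detection cost: misses and false alarms combined.

    The default weights are assumptions for illustration; the abstract
    does not state the weights used in the paper's evaluation.
    """
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def relative_decrease(baseline, adapted):
    """Fractional reduction of the adapted cost relative to the baseline cost."""
    if baseline <= 0:
        raise ValueError("baseline cost must be positive")
    return (baseline - adapted) / baseline


# Hypothetical error rates before and after score adaptation.
baseline_dcf = detection_cost(p_miss=0.20, p_fa=0.10)
adapted_dcf = detection_cost(p_miss=0.12, p_fa=0.06)

print(f"baseline DCF: {baseline_dcf:.3f}")
print(f"adapted DCF:  {adapted_dcf:.3f}")
print(f"relative decrease: {relative_decrease(baseline_dcf, adapted_dcf):.0%}")
```

A relative decrease is measured against the baseline system's own cost, so the same percentage can correspond to very different absolute changes depending on how well the underlying VTD algorithm already performs.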
