Skip to main content

Feature Enhancement With Deep Feature Losses For Speaker Verification

Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Nanxin Chen, Paola Garcia, Najim Dehak

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 10:55
04 May 2020

Speaker Verification still suffers from the challenge of generalization to novel adverse environments. We leverage on the recent advancements made by deep learning based speech enhancement and propose a feature-domain supervised denoising based solution. We propose to use Deep Feature Loss which optimizes the enhancement network in the hidden activation space of a pre-trained auxiliary speaker embedding network. We experimentally verify the approach on simulated and real data. A simulated testing setup is created using various noise types at different SNR levels. For evaluation on real data, we choose BabyTrain corpus which consists of children recordings in uncontrolled environments. We observe consistent gains in every condition over the state-of-the-art augmented Factorized-TDNN x-vector system. On BabyTrain corpus, we observe relative gains of 10.38% and 12.40% in minDCF and EER respectively.

Value-Added Bundle(s) Including this Product

More Like This

  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00