Length: 13:51
04 May 2020

In this paper, a task-driven multilevel framework (TDMF) is proposed for end-to-end speaker verification. The TDMF has four layers. Each layer acts differently on the speaker models or representations, so that the layers implement the functions of a universal background model (UBM), a Gaussian mixture model (GMM), a total variability model (TVM), and probabilistic linear discriminant analysis (PLDA). Unlike the typical i-vector method, the proposed TDMF steers the optimization of each phase (layer) toward the solution required by the PLDA classifier. Moreover, most end-to-end neural network approaches extract embeddings first and then separately compute the distance between two embeddings as the verification score, whereas the TDMF provides scores directly via its fourth-layer PLDA. The experimental results show that the TDMF outperforms both the typical i-vector framework and a VGG-M convolutional neural network (CNN) framework.
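To make the contrast between distance-based scoring and PLDA scoring concrete, here is a minimal sketch in Python. It is not the TDMF itself: it assumes a generic two-covariance PLDA model on already-centered embeddings, and the function names and the between-speaker and within-speaker covariances `B` and `W` are hypothetical, introduced only for illustration.

```python
import numpy as np


def cosine_score(a, b):
    """Distance-style score: cosine similarity between two embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def gaussian_logpdf(z, cov):
    """Log density of a zero-mean multivariate Gaussian evaluated at z."""
    d = z.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def plda_llr(x1, x2, B, W):
    """Log-likelihood ratio of 'same speaker' vs 'different speakers'.

    Assumed model (two-covariance PLDA, not taken from the paper):
    x = y + e, with speaker factor y ~ N(0, B) and residual e ~ N(0, W).
    """
    T = B + W                      # total covariance of one embedding
    zero = np.zeros_like(B)
    z = np.concatenate([x1, x2])   # stacked trial pair
    # Same speaker: the two embeddings share y, so they co-vary through B.
    cov_same = np.block([[T, B], [B, T]])
    # Different speakers: the two embeddings are independent.
    cov_diff = np.block([[T, zero], [zero, T]])
    return float(gaussian_logpdf(z, cov_same) - gaussian_logpdf(z, cov_diff))


# Toy covariances: speakers vary more than sessions of the same speaker.
B = 4.0 * np.eye(3)
W = np.eye(3)
x = np.array([1.0, 1.0, 1.0])

same_like = plda_llr(x, x, B, W)    # identical embeddings
diff_like = plda_llr(x, -x, B, W)   # opposite embeddings
```

In this toy setup a positive log-likelihood ratio favors the same-speaker hypothesis and a negative one favors different speakers, so the PLDA output can be thresholded directly as a verification score, while the cosine score carries no such probabilistic interpretation.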
