TDMF: Task-Driven Multilevel Framework for End-to-End Speaker Verification
Chen Chen, Jiqing Han
SPS
Length: 13:51
This paper proposes a task-driven multilevel framework (TDMF) for end-to-end speaker verification. The TDMF has four layers, each acting differently on the speaker models or representations, which together implement the functions of the universal background model (UBM), the Gaussian mixture model (GMM), the total variability model (TVM), and probabilistic linear discriminant analysis (PLDA). Unlike the typical i-vector method, the TDMF supervises the solution of each phase (layer) so that it is optimized in the direction required by the PLDA classifier. Moreover, whereas most end-to-end neural network approaches first extract embeddings and then separately compute the distance between two embeddings as the verification score, the TDMF provides scores directly through its fourth-layer PLDA. Experimental results show that the TDMF outperforms both the typical i-vector framework and a VGG-M convolutional neural network (CNN) framework.
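To make the two scoring styles mentioned in the abstract concrete, the sketch below contrasts a distance-style score on embeddings (cosine similarity) with a PLDA-style log-likelihood ratio. It is not the TDMF itself: it assumes a simplified two-covariance PLDA model with zero mean, and the function names and the covariance matrices `B` (between-speaker) and `W` (within-speaker) are hypothetical values chosen for illustration.

```python
import numpy as np


def cosine_score(e1, e2):
    """Distance-style score: cosine similarity between two embeddings."""
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))


def _log_gauss_zero_mean(x, cov):
    """Log-density of a zero-mean Gaussian with covariance `cov` at x."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def plda_llr(x1, x2, B, W):
    """PLDA-style score under an assumed two-covariance model.

    Each vector is modeled as x = y + e, with speaker factor y ~ N(0, B)
    and residual e ~ N(0, W). The score is
    log p(x1, x2 | same speaker) - log p(x1, x2 | different speakers).
    """
    T = B + W                      # total covariance of a single vector
    Z = np.zeros_like(B)
    same = np.block([[T, B], [B, T]])   # shared y couples x1 and x2
    diff = np.block([[T, Z], [Z, T]])   # independent speakers: no coupling
    x = np.concatenate([x1, x2])
    return float(_log_gauss_zero_mean(x, same) - _log_gauss_zero_mean(x, diff))


# Illustrative (hypothetical) covariances: strong speaker variability,
# weak within-speaker variability.
B = np.eye(2)
W = 0.1 * np.eye(2)
a = np.array([1.0, 0.0])

print(cosine_score(a, a))
print(plda_llr(a, a, B, W), plda_llr(a, -a, B, W))
```

In this toy setting, a pair of identical vectors receives a higher log-likelihood ratio than a pair pointing in opposite directions. The ratio can be thresholded directly as a verification decision, whereas the cosine score is a separate back-end step applied after embedding extraction.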