STATISTICAL PYRAMID DENSE TIME DELAY NEURAL NETWORK FOR SPEAKER VERIFICATION

Zi-Kai Wan, Qing-Hua Ren, You-Cai Qin, Qi-Rong Mao

Length: 00:09:34

11 May 2022

Recently, speaker verification (SV) techniques have come to rely on deep learning frameworks to extract more informative embedding vectors, which greatly improves accuracy compared with traditional machine learning methods. The well-known x-vector architecture, a time delay neural network (TDNN), is widely adopted for SV tasks. However, most existing variants rarely combine global and sub-region context information, and they suffer from the local receptive field imposed by the standard convolution operation. In this paper, we propose the statistical pyramid dense TDNN (SPD-TDNN), which uses a statistical pyramid pooling module to capture context information. Specifically, the module adaptively exchanges information among contextual regions from different perspectives, which correspond to multiple parallel branches. The statistics collected by the global-region branch comprise the mean and standard deviation across the time domain, which provides more global context information. Extensive experiments on the VoxCeleb1&2 datasets demonstrate that the proposed SPD-TDNN outperforms the corresponding D-TDNN, D-TDNN-SS and ECAPA-TDNN models, which achieve state-of-the-art performance on the SV task, with similar model complexity.
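To make the pooling idea concrete, the following is a minimal sketch of collecting mean and standard deviation statistics over the whole utterance (the global-region branch) and over sub-regions of the time axis. It is an illustration only, not the authors' implementation: the function names `mean_std` and `pyramid_stats`, the `bins` parameter, and the equal-width split of the time axis are all assumptions, and the adaptive information exchange among branches described in the abstract is not modeled.

```python
import math
from typing import List, Sequence


def mean_std(frames: List[List[float]]) -> List[float]:
    """Return per-channel means followed by per-channel (population) stds.

    `frames` is a T x C nested list: T time steps, C feature channels.
    """
    t = len(frames)
    c = len(frames[0])
    means = [sum(f[j] for f in frames) / t for j in range(c)]
    stds = [
        math.sqrt(sum((f[j] - means[j]) ** 2 for f in frames) / t)
        for j in range(c)
    ]
    return means + stds


def pyramid_stats(
    frames: List[List[float]], bins: Sequence[int] = (1, 2)
) -> List[List[float]]:
    """Hypothetical pyramid: one mean/std vector per time region.

    For each level n in `bins`, the time axis is split into n contiguous
    regions (assumed layout). n = 1 is the global-region branch.
    Requires len(frames) >= max(bins) so that no region is empty.
    """
    t = len(frames)
    out = []
    for n in bins:
        for k in range(n):
            start, end = k * t // n, (k + 1) * t // n
            out.append(mean_std(frames[start:end]))
    return out


# Toy input: 4 time steps, 2 channels.
frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 8.0], [7.0, 8.0]]
for vec in pyramid_stats(frames, bins=(1, 2)):
    print(vec)
```

The first printed vector holds the global statistics; the remaining two hold the statistics of the first and second halves of the sequence. In a real model these operations would typically run on batched tensors in a deep learning framework rather than on nested lists.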

Tags:

global context information

speaker verification

statistical pyramid pooling block

spd-tdnn

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
