CS-REP: MAKING SPEAKER VERIFICATION NETWORKS EMBRACING RE-PARAMETERIZATION

Ruiteng Zhang, Wenhuan Lu, Junhai Xu, Jianguo Wei, Lin Zhang, Yantao Ji, Xugang Lu

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:08:51

10 May 2022

Automatic speaker verification (ASV) systems, which determine whether two utterances come from the same speaker, have mainly been optimized for verification accuracy, with little attention paid to inference speed. In real applications, however, both inference speed and verification accuracy are essential. This study proposes cross-sequential re-parameterization (CS-Rep), a novel topology re-parameterization strategy for multi-type networks, to improve both the inference speed and the verification accuracy of models. CS-Rep addresses the problem that existing re-parameterization methods are not suitable for typical ASV backbones. When a model applies CS-Rep, the training-time network uses a multi-branch topology to capture speaker information, whereas the inference-time model is converted to a time-delay neural network (TDNN)-like plain backbone of stacked TDNN layers to achieve fast inference. Based on CS-Rep, an improved TDNN that is friendly to testing and deployment, called Rep-TDNN, is proposed. Compared with the state-of-the-art model ECAPA-TDNN, Rep-TDNN increases the actual inference speed by about 50% and reduces the equal error rate (EER) by 10%. The code and trained models are available at https://github.com/zrtlemontree/CS-Rep.
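
To make the idea of re-parameterization concrete, the following is a minimal sketch of the generic parallel-branch case only, not the cross-sequential transformation that CS-Rep itself introduces. It assumes a toy block with three parallel branches (a kernel-size-3 convolution, a kernel-size-1 convolution, and an identity shortcut), no bias, and no batch normalization; the function names `conv1d` and `merge_branches` are hypothetical and do not come from the CS-Rep repository. Because convolution is linear, the three branches can be folded into a single kernel-size-3 layer that produces the same output.

```python
import random


def conv1d(x, w):
    """Zero-padded ('same') 1D convolution as used in TDNN-style layers.

    x: input of shape [C_in][T]; w: weights of shape [C_out][C_in][K], K odd.
    Returns an output of shape [C_out][T].
    """
    c_in, t_len = len(x), len(x[0])
    k = len(w[0][0])
    pad = k // 2
    out = []
    for w_out in w:
        row = []
        for t in range(t_len):
            acc = 0.0
            for ci in range(c_in):
                for j in range(k):
                    idx = t + j - pad
                    if 0 <= idx < t_len:  # positions outside the signal count as zero
                        acc += w_out[ci][j] * x[ci][idx]
            row.append(acc)
        out.append(row)
    return out


def merge_branches(w3, w1, channels):
    """Fold a k=3 branch, a k=1 branch, and an identity branch into one k=3 kernel."""
    merged = [[list(w3[o][i]) for i in range(channels)] for o in range(channels)]
    for o in range(channels):
        for i in range(channels):
            merged[o][i][1] += w1[o][i][0]  # a k=1 kernel acts only on the centre tap
            if o == i:
                merged[o][i][1] += 1.0      # identity = centre tap of 1 on the diagonal
    return merged


random.seed(0)
C, T = 4, 10  # toy sizes chosen for illustration
x = [[random.gauss(0, 1) for _ in range(T)] for _ in range(C)]
w3 = [[[random.gauss(0, 1) for _ in range(3)] for _ in range(C)] for _ in range(C)]
w1 = [[[random.gauss(0, 1)] for _ in range(C)] for _ in range(C)]

# Training-time topology: three parallel branches summed together.
y3, y1 = conv1d(x, w3), conv1d(x, w1)
y_multi = [[y3[c][t] + y1[c][t] + x[c][t] for t in range(T)] for c in range(C)]

# Inference-time topology: one plain convolution with the merged kernel.
w_merged = merge_branches(w3, w1, C)
y_plain = conv1d(x, w_merged)

max_diff = max(abs(a - b) for ra, rb in zip(y_multi, y_plain) for a, b in zip(ra, rb))
print("branches match merged layer:", max_diff < 1e-9)
```

The merged layer does one convolution where the training-time block does two plus an addition, which is the source of the inference speed-up in this family of methods. A real implementation would also need to fold batch normalization statistics and biases into the merged kernel, which this sketch omits.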

Tags:

speaker verification

cross-sequential transformation

inference speed

re-parameterization

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
