MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection

Fei Jia, Somshubra Majumdar, Boris Ginsburg

Length: 00:12:29

10 Jun 2021

We present MarbleNet, an end-to-end neural network for Voice Activity Detection (VAD). MarbleNet is a deep residual network composed of blocks of 1D time-channel separable convolution, batch normalization, ReLU, and dropout layers. Compared to a state-of-the-art VAD model, MarbleNet achieves similar performance at roughly one-tenth the parameter cost. We further conduct extensive ablation studies on different training methods and parameter choices to study the robustness of MarbleNet in real-world VAD tasks.
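As a rough illustration of why separable convolutions are cheap in parameters, the sketch below counts the weights in one regular 1D convolution and in one time-channel separable convolution, taken here in the usual sense of a depthwise convolution along time followed by a pointwise (kernel size 1) convolution across channels. The channel count and kernel size are assumptions chosen for illustration and are not taken from the abstract, and the function names are hypothetical. The per-layer ratio it prints is not the whole-model comparison reported above.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a regular 1D convolution (bias terms ignored)."""
    return c_in * c_out * kernel_size


def separable_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a depthwise-plus-pointwise 1D convolution (bias ignored)."""
    depthwise = c_in * kernel_size  # one temporal filter per input channel
    pointwise = c_in * c_out        # kernel-size-1 mixing across channels
    return depthwise + pointwise


# Illustrative sizes only (assumed, not from the source).
C_IN, C_OUT, KERNEL = 64, 64, 13

standard = standard_conv1d_params(C_IN, C_OUT, KERNEL)
separable = separable_conv1d_params(C_IN, C_OUT, KERNEL)

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {standard / separable:.1f}x fewer weights per layer")
```

The saving grows with kernel size, because the kernel length multiplies only the depthwise term and not the much larger channel-mixing term.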

Chairs:

Douglas O'Shaughnessy

Tags:

Signal Processing Society

IEEE ICASSP 2021

virtual conference

June 6-11, 2021
