Multi-Channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation

Jiangyu Han, Xinyuan Zhou, Yanhua Long, Yijie Li

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:12:46

09 Jun 2021

The end-to-end approaches for single-channel target speech extraction have attracted widespread attention. However, the studies for end-to-end multi-channel target speech extraction are still relatively limited. In this work, we propose two methods for exploiting the multi-channel spatial information to extract the target speech. The first one is using a target speech adaptation layer in a parallel encoder architecture. The second one is designing a channel decorrelation mechanism to extract the inter-channel differential information to enhance the multi-channel encoder representation. We compare the proposed methods with two strong state-of-the-art baselines. Experimental results on the multi-channel reverberant WSJ0 2-mix dataset demonstrate that our proposed methods achieve up to 11.2% and 11.5% relative improvements in SDR and SiSDR, respectively.

Chairs:

Dorothea Kolossa

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021

Multi-Channel Target Speech Extraction With Channel Decorrelation And Target Speaker Adaptation

Jiangyu Han, Xinyuan Zhou, Yanhua Long, Yijie Li

Value-Added Bundle(s) Including this Product

ICASSP 2021 Virtual Conference - Presentation Videos Product Bundle

More Like This

Keynote: Innovating for Product Sustainability – Making Data Centers Greener

Panel: Navigating Green: Regulatory Insights and Compliance Strategies for Building a Sustainable Future

Sustainability Start-up Pitch Competition

Join the IEEE Signal Processing Society