Non-Intrusive Binaural Prediction Of Speech Intelligibility Based On Phoneme Classification

Jana Roßbach, Saskia Röttges, Christopher F. Hauth, Thomas Brand, Bernd T. Meyer

Length: 00:07:38

09 Jun 2021

In this study, we explore an approach for modeling speech intelligibility in spatial acoustic scenes. To this end, we combine a non-intrusive binaural frontend with a deep neural network (DNN) borrowed from a standard automatic speech recognition (ASR) system. The DNN estimates phoneme probabilities, which degrade in the presence of noise and reverberation; this degradation is quantified with an entropy-based measure. The model output is used to predict speech recognition thresholds (SRTs), i.e., the signal-to-noise ratio at which 50% of words are recognized correctly. The predictions are compared to measured data obtained from eight normal-hearing listeners in acoustic scenarios with varying positions of localized maskers, different rooms, and different reverberation times. Although the model is non-intrusive, it produces a root mean squared error in the range of 0.6-2.1 dB, which is similar to the results obtained with a reference model (0.3-1.8 dB) that uses oracle knowledge in both the frontend and the backend stage.
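
As a rough illustration of the quantities named in the abstract, the following Python sketch computes a frame-averaged Shannon entropy over phoneme probabilities, reads a 50% threshold off a measured psychometric curve, and evaluates a root mean squared error. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract only says the measure is "entropy-based", so the plain mean entropy used here is an assumption, and the function names (`frame_entropy`, `mean_entropy`, `threshold_snr`, `rmse`) and all toy numbers are hypothetical.

```python
import math


def frame_entropy(posteriors):
    """Shannon entropy (in bits) of one frame of phoneme probabilities."""
    return -sum(p * math.log2(p) for p in posteriors if p > 0.0)


def mean_entropy(posteriorgram):
    """Average frame entropy over an utterance (a list of frames).

    Assumption: a plain mean over frames stands in for the paper's
    entropy-based measure, whose exact form is not given in the abstract.
    """
    return sum(frame_entropy(frame) for frame in posteriorgram) / len(posteriorgram)


def threshold_snr(snrs_db, word_accuracy, target=0.5):
    """Linearly interpolate the SNR at which word accuracy reaches `target`.

    Expects SNRs in ascending order. This illustrates the definition of a
    speech recognition threshold (SNR at 50% word recognition), not the
    model's own mapping from its output to a threshold.
    """
    points = list(zip(snrs_db, word_accuracy))
    for (s0, a0), (s1, a1) in zip(points, points[1:]):
        if a0 <= target <= a1 and a1 > a0:
            return s0 + (target - a0) * (s1 - s0) / (a1 - a0)
    raise ValueError("target accuracy is not bracketed by the data")


def rmse(predicted, measured):
    """Root mean squared error between two equal-length sequences (e.g. SRTs in dB)."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Toy posteriorgrams over four hypothetical phoneme classes.
clean = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]  # confident frames
noisy = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]  # uncertain frames

clean_entropy = mean_entropy(clean)
noisy_entropy = mean_entropy(noisy)

# Toy psychometric curve: word accuracy measured at three SNRs.
toy_srt = threshold_snr([-10.0, -5.0, 0.0], [0.2, 0.4, 0.8])
```

In this toy data, the uncertain frames yield a higher mean entropy than the confident ones, which is the direction of degradation the abstract describes for noisy and reverberant conditions.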

Chairs:

Ina Kodrasi

Tags:

signal processing society

IEEE icassp 2021

virtual conference

2021

sps

virtual conference icassp 2021

june 6-11 2021

icassp 2021
