Performance Fairness for Speech and Speaker Recognition & Graph Label Propagation for Cross-Utterance Rescoring

Andreas Stolcke

SPS

Length: 1:10:40

21 Dec 2023

This talk gives an overview of recent work on reducing performance disparities between speaker cohorts in both speaker recognition and speech recognition. Such disparities typically result from uneven representation of cohorts in the training data, and I will survey a range of techniques that aim to counter that unevenness in one way or another. In the second part of the talk, I will introduce applications of the graph label propagation (graph-LP) algorithm, starting with a relatively straightforward one: speaker identification using unlabeled training data. Finally, I will show how graph-LP can be applied to ASR N-best rescoring to incorporate cross-utterance similarity into the decision process. Here the use case for graph-LP is less obvious, but it gives very promising results under train/test mismatch conditions while also reducing accuracy disparities between speaker cohorts.
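
For readers unfamiliar with graph label propagation, the following is a minimal sketch of the generic textbook algorithm applied to a toy speaker identification setup. It is not the speaker's implementation. The function name `label_propagation`, the Gaussian affinity, the `sigma` parameter, and the two-dimensional "embeddings" are all assumptions made for illustration; a real system would use learned speaker embeddings and probably a sparse nearest-neighbor graph.

```python
import math


def label_propagation(embeddings, labels, num_classes, sigma=1.0, num_iters=100):
    """Generic graph label propagation (illustrative sketch only).

    embeddings: list of equal-length float vectors, one per utterance.
    labels: list of ints; a speaker index for labeled utterances, -1 otherwise.
    Returns (predicted speaker index per utterance, per-utterance score rows).
    """
    n = len(embeddings)

    # Assumed similarity: Gaussian kernel on squared Euclidean distance.
    def affinity(a, b):
        sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))

    # Dense affinity graph without self-loops.
    weights = [
        [0.0 if i == j else affinity(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]

    # Row-normalize so each node averages over its neighbors' scores.
    transition = []
    for row in weights:
        total = sum(row) or 1.0  # guard against an isolated node
        transition.append([w / total for w in row])

    def one_hot(k):
        return [1.0 if c == k else 0.0 for c in range(num_classes)]

    # Labeled nodes start as one-hot; unlabeled nodes start uniform.
    uniform = [1.0 / num_classes] * num_classes
    scores = [one_hot(lab) if lab >= 0 else list(uniform) for lab in labels]

    for _ in range(num_iters):
        # Propagate: each node takes the weighted average of its neighbors.
        propagated = [
            [sum(transition[i][j] * scores[j][c] for j in range(n))
             for c in range(num_classes)]
            for i in range(n)
        ]
        # Clamp: labeled nodes keep their known speaker identity.
        scores = [
            one_hot(labels[i]) if labels[i] >= 0 else propagated[i]
            for i in range(n)
        ]

    predictions = [max(range(num_classes), key=lambda c: row[c]) for row in scores]
    return predictions, scores


# Toy data: two well-separated clusters, one labeled utterance in each.
toy_embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.2],   # near speaker 0
    [5.0, 5.0], [5.1, 5.0], [5.0, 4.8],   # near speaker 1
]
toy_labels = [0, -1, -1, 1, -1, -1]

predictions, scores = label_propagation(toy_embeddings, toy_labels, num_classes=2)
print(predictions)
```

In this sketch the unlabeled utterances inherit the label of the labeled utterance they sit closest to in the graph. The abstract does not describe how the same idea carries over to N-best rescoring, so that case is not sketched here.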

Tags:

machine learning

speech processing

fairness

signal processing