Exploring Subgroup Performance in End-to-End Speech Models

Alkis Koudounas (Politecnico di Torino); Eliana Pastor (Politecnico di Torino); Giuseppe Attanasio (Bocconi University); Vittorio Mazzia (Amazon Alexa AI); Manuel Giollo (Amazon); Thomas Gueudre (Amazon Alexa AI); Luca Cagliero (Dipartimento di Automatica e Informatica Politecnico di Torino); Luca de Alfaro (University of California, Santa Cruz); Elena Baralis (Politecnico di Torino); Daniele Amberti (Amazon Alexa AI)

07 Jun 2023

End-to-End Spoken Language Understanding models are generally evaluated according to their overall accuracy, or separately on (a priori defined) data subgroups of interest. We propose a technique for analyzing model performance at the subgroup level, which considers all subgroups that can be defined via a given set of metadata and are above a specified minimum size. The metadata can represent user characteristics, recording conditions, and speech targets. Our technique is based on advances in model bias analysis, enabling efficient exploration of resulting subgroups. A fine-grained analysis reveals how model performance varies across subgroups, identifying modeling issues or bias towards specific subgroups. We compare the subgroup-level performance of models based on wav2vec 2.0 and HuBERT on the Fluent Speech Commands dataset. The experimental results illustrate how subgroup-level analysis reveals a finer and more complete picture of performance changes when models are replaced, automatically identifying the subgroups that most benefit or fail to benefit from the change.
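The idea of scoring every metadata-defined subgroup above a minimum size can be illustrated with a small sketch. It is not the authors' implementation: it enumerates attribute combinations by brute force, whereas the abstract refers to an efficient exploration method. The function name `subgroup_divergence`, the parameters `min_support` and `max_len`, and the toy metadata are all hypothetical, chosen only for illustration. Divergence is taken here to be subgroup accuracy minus overall accuracy, which is an assumption about the metric.

```python
from itertools import combinations


def subgroup_divergence(records, correct, min_support=0.1, max_len=2):
    """Score every metadata-defined subgroup that is large enough.

    records: list of dicts mapping metadata attribute -> value
             (assumed: every record has the same attributes).
    correct: list of bools, True where the model's prediction was right.
    Returns (overall_accuracy, results), where results is a list of
    (subgroup, support, accuracy, divergence) sorted by divergence ascending.
    """
    n = len(records)
    overall = sum(correct) / n
    attrs = sorted(records[0])
    results = []
    for k in range(1, max_len + 1):
        for attr_combo in combinations(attrs, k):
            # Group samples by the values they take on this attribute combination.
            groups = {}
            for rec, ok in zip(records, correct):
                key = tuple(rec[a] for a in attr_combo)
                hits, total = groups.get(key, (0, 0))
                groups[key] = (hits + int(ok), total + 1)
            for values, (hits, total) in groups.items():
                support = total / n
                if support >= min_support:  # keep only subgroups above the minimum size
                    acc = hits / total
                    results.append(
                        (dict(zip(attr_combo, values)), support, acc, acc - overall)
                    )
    results.sort(key=lambda item: item[3])
    return overall, results


# Invented toy data, not taken from Fluent Speech Commands.
toy_records = [
    {"gender": "F", "speed": "fast"}, {"gender": "F", "speed": "fast"},
    {"gender": "F", "speed": "slow"}, {"gender": "F", "speed": "slow"},
    {"gender": "M", "speed": "fast"}, {"gender": "M", "speed": "fast"},
    {"gender": "M", "speed": "slow"}, {"gender": "M", "speed": "slow"},
]
toy_correct = [True, True, True, True, False, False, True, True]

overall_acc, ranked = subgroup_divergence(toy_records, toy_correct, min_support=0.25)
for subgroup, support, acc, div in ranked[:3]:
    print(subgroup, f"support={support:.2f} acc={acc:.2f} divergence={div:+.2f}")
```

Running the same function on the predictions of two models and comparing the per-subgroup accuracies would give a rough analogue of the model-replacement comparison described above. The brute-force enumeration grows quickly with the number of attributes, so the sketch is only practical for small metadata sets.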

Tags:

Machine learning methods for language

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
