Speech Data Explorer: Interactive Analysis Tool For Speech Datasets

Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg

Length: 00:14:43

09 Jun 2021

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) models require large labeled speech datasets for training. The reference transcripts must accurately correspond to the audio recordings; otherwise, models may learn errors from the training data and reproduce them during inference. We have developed Speech Data Explorer (SDE) to help examine the quality of speech datasets and perform interactive error analysis of ASR models’ predictions. Its core strengths include the following:
- an interactive table that contains the dataset’s utterances and supports filtering (thresholding) and sorting;
- interactive visualization of metrics and of the signal in the time and frequency domains (with a built-in audio player);
- easy extensibility (it is straightforward to add new metrics as table columns and retain all interactive features).

To the best of our knowledge, SDE is the first open-source tool for interactive exploration of speech datasets and error analysis of ASR models’ predictions. It is implemented as a web application based on the Plotly Dash framework. SDE is an essential tool for the analysis of speech datasets and ASR models in our own research. It has already helped us quickly identify labeling issues in many public and commercial speech datasets, analyze the accuracy of ASR models, and construct new datasets (for example, Russian LibriSpeech [http://www.openslr.org/96/]). We believe that SDE, with its interactivity and extensibility, could benefit the wider speech processing community.

We will demonstrate how SDE can be used for:
- interactive analysis of a speech dataset;
- interactive error analysis of transcripts generated by an ASR model;
- analysis with custom metrics that are useful for different tasks (for example, long utterance segmentation).
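The workflow the abstract describes (attach a metric to every utterance as a table column, then filter by a threshold and sort) can be illustrated with a minimal sketch. This is not SDE's code: the record fields (`audio`, `text`, `pred_text`), the helpers `add_metric` and `filter_and_sort`, and the sample utterances are all hypothetical and exist only for illustration. Word error rate (WER) is used as the example metric, computed from a word-level Levenshtein distance.

```python
from typing import Callable, Dict, List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of a hypothesis against a reference transcript."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(ref, hyp) / len(ref)


def add_metric(rows: List[Dict], name: str,
               fn: Callable[[Dict], float]) -> List[Dict]:
    """Return a copy of the table with one extra metric column (hypothetical helper)."""
    return [{**row, name: fn(row)} for row in rows]


def filter_and_sort(rows: List[Dict], column: str,
                    threshold: float) -> List[Dict]:
    """Keep rows whose metric is at or above the threshold, worst first."""
    kept = [row for row in rows if row[column] >= threshold]
    return sorted(kept, key=lambda row: row[column], reverse=True)


# Hypothetical utterances: reference transcript plus an ASR prediction.
utterances = [
    {"audio": "a.wav", "text": "the cat sat on the mat",
     "pred_text": "the cat sat on the mat"},
    {"audio": "b.wav", "text": "hello world", "pred_text": "hello word"},
    {"audio": "c.wav", "text": "good morning everyone", "pred_text": "morning"},
]

table = add_metric(utterances, "wer",
                   lambda row: wer(row["text"], row["pred_text"]))
suspicious = filter_and_sort(table, "wer", threshold=0.5)

for row in suspicious:
    print(f'{row["audio"]}: WER={row["wer"]:.2f}')
```

Utterances that pass the threshold are candidates for manual review: a high WER can indicate either a model error or a faulty reference transcript, which is the distinction an interactive tool with audio playback helps resolve.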

Tags: Signal Processing Society, IEEE ICASSP 2021, virtual conference, June 6-11 2021
