The pipeline system of ASR and NLU with MLM-based data augmentation toward STOP low-resource challenge
Hayato Futami (Sony Group Corporation); Jessica Huynh (Carnegie Mellon University); Siddhant Arora (Carnegie Mellon University); Shih-Lun Wu (Carnegie Mellon University); Yosuke Kashiwagi (Sony); Yifan Peng (Carnegie Mellon University); Brian Yan (Carnegie Mellon University); Emiru Tsunoo (Sony Group Corporation); Shinji Watanabe (Carnegie Mellon University)
This paper describes our system for the low-resource domain adaptation track (Track 3) of the Spoken Language Understanding Grand Challenge, part of the ICASSP Signal Processing Grand Challenge 2023. For this track, we adopt a pipeline approach of ASR and NLU. For ASR, we fine-tune Whisper for each domain with upsampling. For NLU, we fine-tune BART on all the Track 3 data and then on the low-resource domain data. We apply masked LM (MLM)-based data augmentation, in which some of the input tokens and the corresponding target labels are replaced using an MLM. We also apply a retrieval-based approach, in which the model input is augmented with similar training samples. As a result, we achieved exact match (EM) accuracies of 63.3/75.0 (average: 69.15) on the reminder/weather domains and won first place in the challenge.
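As a rough illustration of the MLM-based augmentation idea described above, the sketch below masks some utterance tokens, lets a masked LM propose replacements, and applies the same word-level substitution to the target semantic parse so input and labels stay consistent. All names here (`mlm_augment`, `toy_fill_mask`) are hypothetical, and the toy stub stands in for a real masked LM; the paper's actual recipe may differ in its masking scheme and label handling.

```python
import random

def mlm_augment(utterance, parse, fill_mask, mask_rate=0.3, seed=0):
    """Hedged sketch of MLM-based data augmentation: mask some input
    tokens, replace each with an MLM prediction, and mirror the
    replacement (whole-word) in the target parse."""
    rng = random.Random(seed)
    tokens = utterance.split()
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked = tokens[:i] + ["<mask>"] + tokens[i + 1:]
            new_tok = fill_mask(" ".join(masked))  # top MLM prediction
            if new_tok != tok:
                # keep the target parse aligned with the new input token
                parse = " ".join(new_tok if w == tok else w
                                 for w in parse.split())
                tokens[i] = new_tok
    return " ".join(tokens), parse

def toy_fill_mask(masked_text):
    # stand-in for a real masked LM; always predicts one word
    return "tomorrow"

aug_utt, aug_parse = mlm_augment(
    "remind me to call mom today",
    "[IN:CREATE_REMINDER [SL:TODO call mom ] [SL:DATE_TIME today ] ]",
    toy_fill_mask,
    mask_rate=0.5,
    seed=1,
)
```

In practice the stub would be replaced by a pretrained masked LM (e.g. a fill-mask model), and the substitution step is what keeps the augmented utterance and its semantic-parse target from drifting apart.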