NASE: A Chinese Benchmark for Evaluating Robustness of Spoken Language Understanding Models in Slot Filling

Meizheng Peng (Wuhan University); Xu Jia (Wuhan University); Min Peng (Wuhan University)

07 Jun 2023

Slot filling is a major problem in spoken language understanding (SLU). However, current SLU models may suffer performance degradation when they encounter unfamiliar data from different datasets. Meanwhile, as recent models become more complex, retraining a model for each new application scenario is increasingly expensive. It is therefore important to study the robustness and generalization capability of SLU models. We propose the Natural Adversarial Slot Evaluator (NASE), a benchmark of adversarial SLU data for evaluating the robustness and generalization capability of SLU models on slot filling. Our experiments and analysis reveal that all six SLU models evaluated show significant performance degradation on NASE. Further analysis indicates that the models rely more on the context of the slots than on the slot values themselves to make predictions. In addition, the widespread use of joint learning strategies means that unfamiliar intents also affect slot filling. Based on these findings, we also propose a simple data augmentation method to improve the robustness of SLU models in slot filling. F1 scores improve by up to about 30% compared to the original model.

Tags:

Discourse and dialog

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
