Improving Sample-Efficiency In Reinforcement Learning For Dialogue Systems By Using Trainable-Action-Mask
Yen-Chen Wu, Bo-Hsiang Tseng, Carl Edward Rasmussen
SPS
Length: 12:23
By interacting with humans and learning from reward signals, reinforcement learning is an ideal approach to building conversational AI. Given the cost of collecting real users' responses, improving sample efficiency is a key issue when applying reinforcement learning to real-world spoken dialogue systems (SDS). Handcrafted action masks are commonly used to rule out impossible actions and accelerate training. However, handcrafted action masks rarely generalize to unseen domains. In this paper, we propose the trainable action mask (TAM), which is learned from data automatically without handcrafting complicated rules. In our experiments in the Cambridge Restaurant domain, TAM requires only 30% of the training data needed by the baseline to reach an 80% success rate, and it also shows robustness to noisy environments.
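The abstract does not spell out TAM's architecture, but the underlying mechanism of action masking can be illustrated generically: a mask score per action (in TAM, produced by a trained network; here passed in directly) rules out low-probability actions before the policy's softmax. The function name, threshold, and example values below are hypothetical, a minimal sketch rather than the paper's implementation.

```python
import numpy as np

def masked_action_probs(logits, mask_logits, threshold=0.5):
    """Apply an action mask to policy logits.

    In TAM, `mask_logits` would come from a trained mask network;
    here they are simply an input. Actions whose mask probability
    (sigmoid of the mask logit) falls below `threshold` are ruled
    out before the softmax over the remaining actions.
    """
    mask = 1.0 / (1.0 + np.exp(-np.asarray(mask_logits, dtype=float)))  # sigmoid -> [0, 1]
    logits = np.asarray(logits, dtype=float)
    logits = np.where(mask >= threshold, logits, -np.inf)  # masked actions get zero probability
    z = logits - logits[np.isfinite(logits)].max()         # numerically stable softmax
    e = np.exp(z)                                          # exp(-inf) == 0
    return e / e.sum()

# Example: 4 candidate dialogue actions; the mask rules out action 2.
probs = masked_action_probs(
    logits=[1.0, 2.0, 3.0, 0.5],
    mask_logits=[4.0, 3.0, -5.0, 2.0],  # action 2's mask prob ~0.007 < 0.5
)
```

Masking with `-inf` before the softmax (rather than zeroing probabilities afterwards) keeps the remaining actions properly renormalized, which is the standard way invalid-action masking is applied in policy-gradient and Q-learning agents.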