Dialogue System with Missing Observation

Djallel Bouneffouf (IBM); Mayank Agarwal (IBM); Irina Rish (University of Montreal)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
04 Jun 2023

In dialogue systems, the ability to orchestrate multiple independently trained dialogue agents into a unified system is of particular importance. We define orchestration as the task of selecting the subset of skills that most appropriately answers a user input, using features extracted from both the user input and the individual skills. In this work, we study online dialogue orchestration in the setting where the user feedback associated with the dialogue agent may not always be observed. To address missing feedback, we propose combining the attentive contextual bandit approach with an unsupervised learning mechanism such as clustering. By using clustering to estimate missing rewards, we are able to learn from every incoming event, including those whose reward was not observed. Promising empirical results are obtained on proprietary conversational datasets.
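
The idea of estimating a missing reward from a cluster of similar contexts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithm's details, so a plain epsilon-greedy linear bandit stands in for the attentive contextual bandit, online k-means stands in for the clustering step, and the class name `ClusterImputingBandit` and all of its parameters are hypothetical. When feedback is missing, the sketch substitutes the mean reward previously observed for the same skill in the nearest context cluster.

```python
import random


class ClusterImputingBandit:
    """Hypothetical sketch: epsilon-greedy linear bandit that imputes
    missing rewards from the nearest context cluster."""

    def __init__(self, n_skills, dim, n_clusters=2, epsilon=0.1, lr=0.1, seed=0):
        self.rng = random.Random(seed)
        self.n_skills = n_skills
        self.n_clusters = n_clusters
        self.epsilon = epsilon
        self.lr = lr
        # One linear reward model (weight vector) per skill.
        self.weights = [[0.0] * dim for _ in range(n_skills)]
        # Online k-means state; centroids are seeded from the first contexts seen.
        self.centroids = []
        self.counts = []
        # Per-cluster, per-skill running sum and count of *observed* rewards.
        self.reward_sum = []
        self.reward_cnt = []

    def _predict(self, skill, x):
        return sum(w * xi for w, xi in zip(self.weights[skill], x))

    def select(self, x):
        """Pick a skill for context x: explore with probability epsilon."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_skills)
        return max(range(self.n_skills), key=lambda a: self._predict(a, x))

    def _assign(self, x):
        """Return the cluster index for x and move that centroid toward x."""
        if len(self.centroids) < self.n_clusters:
            self.centroids.append(list(x))
            self.counts.append(1)
            self.reward_sum.append([0.0] * self.n_skills)
            self.reward_cnt.append([0] * self.n_skills)
            return len(self.centroids) - 1
        dists = [sum((c - xi) ** 2 for c, xi in zip(cen, x))
                 for cen in self.centroids]
        k = dists.index(min(dists))
        self.counts[k] += 1
        step = 1.0 / self.counts[k]
        self.centroids[k] = [c + step * (xi - c)
                             for c, xi in zip(self.centroids[k], x)]
        return k

    def update(self, x, skill, reward=None):
        """Learn from one event; reward=None means feedback was not observed.

        Returns the reward used for the update, or None if nothing could
        be imputed yet and the model update was skipped.
        """
        k = self._assign(x)
        if reward is None:
            if self.reward_cnt[k][skill] == 0:
                return None
            # Impute: mean observed reward for this skill in this cluster.
            reward = self.reward_sum[k][skill] / self.reward_cnt[k][skill]
        else:
            self.reward_sum[k][skill] += reward
            self.reward_cnt[k][skill] += 1
        # One gradient step on the squared prediction error.
        err = reward - self._predict(skill, x)
        self.weights[skill] = [w + self.lr * err * xi
                               for w, xi in zip(self.weights[skill], x)]
        return reward
```

In this sketch only observed rewards feed the cluster statistics, so imputed values do not reinforce themselves; whether the original method makes the same choice is not stated in the abstract.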
