Length: 0:29:10
With recent progress in deep learning, there has been increased interest in visually grounded dialogue, which requires an AI agent to hold a meaningful conversation with humans in natural language about visual content in other modalities, e.g., pictures or videos. In this talk, I will present two case studies: one on generating responses for closed-domain, task-based multimodal dialogue systems, with applications in conversational multimodal search; and one on selecting/retrieving responses for open-domain multimodal systems, with applications in visual dialogue and visual question answering.
Throughout my talk I will highlight open challenges for deep learning and beyond, including context modelling, knowledge grounding, encoding history, multimodal fusion, evaluation techniques, and shortcomings of current datasets.