ICIP 2017 Tutorial - Vision and Language: Bridging Vision and Language with Deep Learning [Part 2 of 2]

Jiebo Luo

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 01:28:02

12 Jan 2018

Author Bio/Abstract
Recognition of visual content has been a fundamental challenge in computer vision for decades, and previous research predominantly focused on understanding visual content using a predefined yet limited vocabulary. Thanks to the recent development of deep learning techniques, researchers in both the computer vision and multimedia communities are now striving to bridge vision with natural language, which can be regarded as the ultimate goal of visual understanding. We will present recent advances in exploring the synergy of visual understanding and language processing techniques, including vision-language alignment, visual captioning and commenting, visual emotion analysis, visual question answering, and visual storytelling, as well as open issues for this emerging research area.

Primary Committee:

IEEE ICIP

Tags:

tutorial

signalprocessing

2017

icip 2017

IEEE icip
