  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 1:20:50
Plenary 10 Oct 2023

Large multimodal models are increasingly demonstrating state-of-the-art performance on the most exacting computer vision benchmarks. Thanks to their ability to provide learned priors across diverse competencies such as language, logical reasoning, geometry, and visual semantics, they are also becoming a foundation for many other capabilities, ranging from language grounding to common-sense understanding, planning, and robot control. I’ll discuss our experience leveraging large multimodal models for embodied applications and how this may impact the direction of computer vision, robotics, and embodied AI at large.

Bio: Vincent Vanhoucke is a Distinguished Scientist and Senior Director of Robotics at Google. His research has spanned many areas of artificial intelligence and machine learning, from speech recognition to deep learning, computer vision, and robotics. His Udacity lecture series has introduced over 100,000 students to Deep Learning. He is President of the Robot Learning Foundation, which organizes the Conference on Robot Learning, now in its seventh year. He holds a doctorate from Stanford University and a diplôme d’ingénieur from the École Centrale Paris.

