  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 0:49:23
30 Jan 2023

The audio-visual analysis of the environment surrounding a robot is important for recognizing activities, objects, interactions and intentions. In this talk I will discuss methods that enable a robot to understand a dynamic scene using only its on-board sensors in order to interact with humans. These methods include a multi-modal training strategy that leverages complementary information across observation modalities to improve the test-time performance of a uni-modal system, and the estimation of the physical properties of unknown containers manipulated by humans, which informs the control of a robot grasping the container during a dynamic handover. I will show examples of multi-modal dynamic scene understanding, present the results of an international challenge for physical human-robot interaction, and discuss open research directions.