  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:06:20
08 May 2022

Echoes reflect the geometric structure of the scene surrounding a sound source. In this paper, we address the problem of estimating depth maps of indoor scenes from echoes. First, we show experimentally that fusing multiple acoustic features, in particular the spectrogram and the angular spectrum, improves estimation accuracy. We then propose a novel bilinear model that incorporates dense co-attention for effective feature fusion. The model yields a compact fused feature while capturing second-order correlations both within and between features. Thorough evaluations on two datasets demonstrate the superiority of the proposed method over state-of-the-art echo-based depth estimation and feature fusion methods.
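
To make the idea of a compact fused feature that captures second-order correlations more concrete, here is a minimal sketch of generic low-rank bilinear fusion of two feature vectors. It is not the model proposed in the paper: the dense co-attention step is omitted, and the function name, the feature dimensions, and the random projection matrices `U`, `V`, `P` are all hypothetical choices made for illustration only.

```python
import numpy as np

def low_rank_bilinear_fusion(x, y, U, V, P):
    """Fuse x (dx,) and y (dy,) into a compact vector of size d_out.

    Generic low-rank bilinear pooling (an illustrative assumption, not the
    paper's architecture): z = P^T ((U^T x) * (V^T y)).
    Each output z[k] equals x^T W_k y with W_k = sum_r P[r, k] * u_r v_r^T,
    so it is a second-order (bilinear) interaction between x and y.
    """
    joint = (U.T @ x) * (V.T @ y)  # element-wise product, shape (rank,)
    return P.T @ joint             # compact fused feature, shape (d_out,)

rng = np.random.default_rng(0)

# Hypothetical feature vectors standing in for the two acoustic features.
spec_feat = rng.standard_normal(128)  # e.g. a spectrogram embedding
ang_feat = rng.standard_normal(64)    # e.g. an angular-spectrum embedding

# Hypothetical projections; in a trained model these would be learned.
U = rng.standard_normal((128, 32))
V = rng.standard_normal((64, 32))
P = rng.standard_normal((32, 16))

fused = low_rank_bilinear_fusion(spec_feat, ang_feat, U, V, P)
print(fused.shape)
```

The low-rank factorization keeps the fused vector small (16 values here) without ever forming the full 128 x 64 outer product of the two inputs, which is one common way to keep bilinear fusion compact.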
