DEPTH ESTIMATION OF MULTI-MODAL SCENE BASED ON MULTI-SCALE MODULATION

Anjie Wang, Zhijun Fang, Xiaoyan Jiang, Yongbin Gao, Gaofeng Cao, Siwei Ma


Lecture 10 Oct 2023

Because different modalities carry complementary information, making effective use of multimodal scene information has become an increasingly important research topic. This paper proposes a novel multi-scale global learning strategy that takes both echo and visual data as inputs to estimate scene depth. The framework builds a multi-scale feature extraction method from pyramid pooling modules, which aggregate contextual information from different regions and improve the capture of global information. A recurrent multi-scale feature modulation module is then introduced to produce more semantic and spatially accurate representations at each iterative update. Finally, a multi-scale fusion method combines the echo and visual modalities. Experiments on the Replica dataset demonstrate the superior performance of the proposed method.
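
The abstract does not specify how the pyramid pooling modules are built, so the following is only a minimal sketch of the general idea, assuming a single-channel feature map stored as a nested list. The function names (adaptive_avg_pool, upsample_nearest, pyramid_pool) and the bin sizes (1, 2, 4) are hypothetical choices for illustration, not the authors' implementation. Each pyramid level averages the map over a coarse grid of regions, is resized back to the input resolution, and is stacked with the original map so that every position sees context at several scales.

```python
from typing import List, Sequence

Grid = List[List[float]]


def adaptive_avg_pool(feature: Grid, bins: int) -> Grid:
    """Average an H x W map into a bins x bins grid of region means."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        # Region rows: floor of the start, ceiling of the end.
        r0, r1 = (i * h) // bins, ((i + 1) * h + bins - 1) // bins
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, ((j + 1) * w + bins - 1) // bins
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled: Grid, h: int, w: int) -> Grid:
    """Resize a bins x bins grid back to H x W by nearest-neighbour lookup."""
    bins = len(pooled)
    return [[pooled[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(feature: Grid, bin_sizes: Sequence[int] = (1, 2, 4)) -> List[Grid]:
    """Return the input map plus one context map per pyramid level.

    Assumption: a real module would also apply learned convolutions and
    operate on multi-channel tensors; both are omitted in this sketch.
    """
    h, w = len(feature), len(feature[0])
    channels = [feature]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(feature, bins), h, w))
    return channels


if __name__ == "__main__":
    # Toy 4 x 4 map with values 0..15.
    toy = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for level in pyramid_pool(toy):
        print(level[0])  # first row of each stacked map
```

The coarsest level (one bin) gives every position the global mean, while finer levels keep progressively more local detail. That mix of scales is the property the abstract relies on for global information acquisition.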

Tags: multi-modal, multi-scale, depth estimation
