Class Activation Map Refinement Via Semantic Affinity Exploration For Weakly Supervised Object Detection

Zhendong Wang, Zhenyuan Chen, Chen Gong



Length: 00:13:54

04 Oct 2022

Multi-modality complementary information brings new impetus and innovation to salient object detection (SOD). However, most existing RGB-D SOD methods either handle RGB and depth features indiscriminately or treat depth features only as auxiliary input to the RGB subnetwork, ignoring the different roles the two modalities play in SOD. To tackle this issue, we propose a novel multi-modality diversity fusion network with Swin Transformer (M2DFNet) for RGB-D SOD, which approaches the task from the perspective of the different status of each modality and adequately explores the roles of the RGB and depth modalities. To this end, a triple-diversity supervision mechanism (TDSM) and a diversity fusion module (DFM) are designed to parse the functions of the different modalities. In addition, we design a dense decoder (DSD) to integrate multi-scale features and transfer gain information from top to bottom, which improves SOD performance. Extensive experiments on five benchmark datasets demonstrate that the proposed M2DFNet outperforms 17 other state-of-the-art (SOTA) RGB-D SOD methods.

Tags:

International Conference on Image Processing

IEEE ICIP 2022
