Class Activation Map Refinement Via Semantic Affinity Exploration For Weakly Supervised Object Detection
Zhendong Wang, Zhenyuan Chen, Chen Gong
SPS
Length: 00:13:54
Multi-modality complementary information brings new impetus and innovation to salient object detection (SOD). However, most existing RGB-D SOD methods either handle RGB and depth features indiscriminately or treat depth features merely as auxiliary input to the RGB subnetwork, ignoring the different roles the two modalities play in SOD. To tackle this issue, we propose a novel multi-modality diversity fusion network with Swin Transformer (M2DFNet) for RGB-D SOD, designed from the perspective of the distinct statuses of the two modalities, which adequately explores the respective roles of RGB and depth. To this end, a triple-diversity supervision mechanism (TDSM) and a diversity fusion module (DFM) are designed to disentangle the functions of the two modalities. In addition, we design a dense decoder (DSD) that integrates multi-scale features and transfers gain information from top to bottom, further improving SOD performance. Extensive experiments on five benchmark datasets demonstrate that the proposed M2DFNet outperforms 17 other state-of-the-art (SOTA) RGB-D SOD methods.
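The abstract argues that RGB and depth should not be fused indiscriminately. As a generic illustration of that idea (not the paper's actual DFM; the gating scheme, function names, and toy shapes below are assumptions for the sketch), a minimal gated fusion can let a depth-derived gate decide, per spatial position, how much depth information to blend into the RGB stream:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_modalities(rgb_feat, depth_feat, w_gate=1.0):
    # Gate in (0, 1) derived from the depth features; positions where the
    # depth response is strong lean more on depth, others keep RGB.
    # This is a toy stand-in for a learned cross-modal fusion module.
    gate = sigmoid(depth_feat * w_gate)
    return rgb_feat * (1.0 - gate) + depth_feat * gate

rng = np.random.default_rng(0)
rgb = rng.standard_normal((1, 8, 8))    # toy RGB feature map (C, H, W)
depth = rng.standard_normal((1, 8, 8))  # toy depth feature map (C, H, W)
fused = fuse_modalities(rgb, depth)
print(fused.shape)  # (1, 8, 8)
```

Because the gate is a convex weight, each fused value stays between the corresponding RGB and depth activations; a learned module such as the DFM would replace this fixed gating with trainable parameters and supervision (e.g., the TDSM described above).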