CFFMixer: Multi-dimensional Feature Fusion For Object Detection
Hao Xie (Southeast University); Weizhe Yuan (Southeast University); Bin Kang (Nanjing University of Posts and Telecommunications); Songlin Du (Southeast University)
SPS
Object detection is a fundamental task in computer vision, and one of its essential requirements is high-quality feature fusion. Previous works have approached this in different ways: CNN-based detectors use convolutional blocks to fuse local features and rely on dense prior knowledge to predict objects, while query-based detectors fuse global features with self-attention and then decode them with object queries. However, these feature fusion methods lack diversity. Because different modules suit different feature dimensions, we propose CFFMixer, an object detector that uses a hybrid architecture to achieve multi-dimensional feature fusion. We first introduce a sampling strategy that extracts abundant local and global features, then propose the Comprehensive Feature Fusion Network (CFFN) to integrate them. CFFN not only lets local and global features interact in the spatial dimension, but also fuses semantics in the channel dimension. In experiments against competitive models, our model reaches 43.0 mAP on the COCO 2017 dataset within 12 epochs. The results indicate that the model's accuracy benefits from the strong feature fusion capability of CFFN. We also perform ablation studies on our modules to evaluate their effectiveness.
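To make the distinction between fusing in the spatial dimension and fusing in the channel dimension concrete, the following is a minimal, generic sketch in plain Python. It is not the authors' CFFN: the function names, the weight matrices, the residual connection, and the toy input are all assumptions chosen for illustration. Features are stored as a matrix with one row per sampled point and one column per channel, so spatial mixing combines rows and channel mixing combines columns.

```python
# Illustrative only: generic spatial/channel mixing, NOT the authors' CFFN.
# All names and weights below are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def spatial_mix(features, spatial_weights):
    """Mix information across sampled points (rows), independently per channel.

    features: P x C matrix (P sampled points, C channels)
    spatial_weights: P x P matrix of mixing weights
    """
    return matmul(spatial_weights, features)

def channel_mix(features, channel_weights):
    """Mix information across channels (columns), independently per point.

    channel_weights: C x C matrix of mixing weights
    """
    return matmul(features, channel_weights)

def fuse(features, spatial_weights, channel_weights):
    """Apply spatial mixing, then channel mixing, and add the input back (residual)."""
    mixed = channel_mix(spatial_mix(features, spatial_weights), channel_weights)
    return [[f + m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(features, mixed)]

# Toy input: 3 sampled points (say, 2 local and 1 global), 2 channels each.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
average_points = [[1 / 3] * 3 for _ in range(3)]  # each point receives the mean over all points
swap_channels = [[0.0, 1.0], [1.0, 0.0]]          # exchange the two channels
fused = fuse(features, average_points, swap_channels)
```

In a learned model the two weight matrices would be trained parameters (or attention maps) rather than the fixed averaging and swapping matrices used here; the fixed values only keep the example easy to follow by hand.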