IMC-NET: Learning Implicit Field With Corner Attention Network For 3D Shape Reconstruction
Jiongchao Jin, Huanqiang Xu, Pengliang Ji, Biao Leng
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:05:25
Vision Transformer (ViT) has been introduced into the computer vision (CV) field, where its self-attention mechanism captures global dependencies. However, simply deploying ViT on a hyperspectral image (HSI) classification task does not give satisfying results, because ViT is a spatial-only self-attention model while HSI contains rich spectral information. Moreover, most HSI classifiers integrate spectral and spatial features in a cascaded pipeline, ignoring the internal correlation between spectral and spatial information. Furthermore, existing positional embedding (PE) methods cannot accommodate the 3D configuration of ViT. Therefore, this paper proposes a unified spectral-spatial 3D ViT with cooperative 3D coordinate positional embedding. In addition, a novel local-global feature fusion strategy is proposed. The model contains no convolutional or recurrent units and achieves more competitive classification performance than other state-of-the-art (SOTA) methods. It also obtains better results than existing ViT-based HSI classifiers.
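
The abstract does not say how the 3D coordinate positional embedding is constructed. As a rough illustration of the general idea only, the sketch below assigns each token of a spectral-spatial token grid a (spectral, height, width) coordinate and concatenates one standard sinusoidal code per axis. The function names, the grid size, and the choice of sinusoidal codes are all assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import product


def sinusoid_1d(position, dim):
    """Sinusoidal encoding of one integer coordinate into `dim` values (dim must be even)."""
    values = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        values.append(math.sin(position * freq))
        values.append(math.cos(position * freq))
    return values


def coordinate_embedding_3d(grid_shape, dim_per_axis):
    """Map every (s, h, w) token coordinate to a positional embedding.

    grid_shape: number of tokens along (spectral, height, width).
    Each embedding concatenates one sinusoidal code per axis, so its
    length is 3 * dim_per_axis. This is an assumed design, not the paper's.
    """
    table = {}
    for s, h, w in product(*(range(n) for n in grid_shape)):
        table[(s, h, w)] = (
            sinusoid_1d(s, dim_per_axis)
            + sinusoid_1d(h, dim_per_axis)
            + sinusoid_1d(w, dim_per_axis)
        )
    return table


# Hypothetical grid: 4 spectral groups and a 3x3 spatial layout of patches.
pe = coordinate_embedding_3d((4, 3, 3), dim_per_axis=8)
print(len(pe), len(pe[(0, 0, 0)]))  # 36 tokens, 24 values each
```

Because every axis gets its own code, two tokens that share a spatial location but come from different spectral groups receive different embeddings, which a purely 2D positional embedding could not distinguish.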