GSANET: SEMANTIC SEGMENTATION WITH GLOBAL AND SELECTIVE ATTENTION
Qingfeng Liu, Mostafa El-Khamy, Dongwoon Bai, Jungwon Lee
SPS
Length: 16:45
This paper proposes a novel deep learning architecture for semantic segmentation. The proposed Global and Selective Attention Network (GSANet) combines Atrous Spatial Pyramid Pooling (ASPP) with a novel sparsemax global attention and a novel selective attention, which uses a condensation and diffusion mechanism to aggregate multi-scale contextual information from the extracted deep features. A selective attention decoder is also proposed to process the GSA-ASPP outputs and optimize the softmax volume. We are the first to benchmark semantic segmentation networks with MobileNetEdge, a low-complexity feature extraction network (FXN) optimized for low latency on edge devices. We show that GSANet improves performance with MobileNetEdge as well as with strong FXNs such as Xception, and improves on the state-of-the-art semantic segmentation accuracy on both the ADE20k and Cityscapes datasets.
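For readers unfamiliar with sparsemax, the following is a minimal sketch of the generic sparsemax operation, which projects a score vector onto the probability simplex and, unlike softmax, can assign exactly zero weight to low-scoring entries. It illustrates the operation only; the abstract does not specify how GSANet applies it within the global attention branch, and the function name `sparsemax` and the example scores are assumptions made for illustration.

```python
def sparsemax(scores):
    """Generic sparsemax: Euclidean projection of `scores` onto the simplex.

    Illustrative sketch only; not the GSANet implementation.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    # The support condition holds for a prefix of the sorted scores,
    # so the last k that satisfies it determines the threshold tau.
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        if 1.0 + k * z_k > cumsum:
            tau = (cumsum - 1.0) / k
    # Shift by the threshold and clip at zero; the result sums to 1.
    return [max(z - tau, 0.0) for z in scores]


# Hypothetical attention scores for three positions.
weights = sparsemax([1.0, 0.8, 0.1])
print(weights)
```

In this example the lowest-scoring entry receives a weight of exactly zero, which is the property that distinguishes sparsemax from softmax when it is used to produce attention weights.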