EGNET: A Novel Edge Guided Network for Instance Segmentation
Kaiwen Du, Xiao Wang, Yan Yan, Yang Lu, Hanzi Wang
SPS
Image captioning aims to generate descriptions of images, which requires capturing complex interactions between local regions and the global context within an image. However, effectively modeling the global context of an image remains a challenging research problem. Existing approaches, mainly based on the transformer architecture, incorporate global-level information into the initialized input. Unlike previous methods, which may fail to capture rich global contextual information, we propose a novel method named Context-Sensitive Transformer (CSTNet), which discovers the inherent global context and further empowers global-to-local interactions. Experimental results on the MSCOCO dataset show that the proposed model significantly improves image captioning performance.
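The global-to-local interaction described above can be illustrated with scaled dot-product attention, where a pooled global context vector attends back over local region features. The function and variable names below are illustrative and do not come from the paper; this is a minimal NumPy sketch of the general mechanism, not the CSTNet implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_to_local_attention(regions):
    """Hypothetical sketch: pool a global context vector from local
    region features, then let it attend over the regions
    (cross-attention), mirroring the kind of global-to-local
    interaction the abstract describes.

    regions: (N, d) array of N local region features.
    Returns the context-refined global vector and attention weights.
    """
    d = regions.shape[-1]
    g = regions.mean(axis=0)            # global context: mean-pooled regions
    scores = regions @ g / np.sqrt(d)   # (N,) scaled dot-product logits
    weights = softmax(scores)           # attention over local regions
    context = weights @ regions         # (d,) refined global representation
    return context, weights
```

In a full transformer-based captioner, a block like this would be stacked with learned projections and feed-forward layers; here it only shows how a global vector can re-weight local region features.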