LE-BEIT: A Local-Enhanced Self-Supervised Transformer For Semantic Segmentation of High Resolution Remote Sensing Images
Yifei Huang, Zideng Feng, Junli Yang, Bin Wang, Jiaying Wang, Zhenglin Xian
This paper presents a multi-modal multi-label attribute classification model for anime illustrations based on Graph Convolutional Networks (GCN) using domain-specific semantic features. In animation production, creators often intentionally highlight subtle characteristics of characters and objects when drawing anime illustrations, so we focus on the task of multi-label attribute classification. To capture the relationships between attributes, we construct a multi-modal GCN model that can adopt semantic features specific to anime illustrations. To generate the domain-specific semantic features that represent the semantic content of anime illustrations, we construct a new captioning framework for anime illustrations by combining real images and their style-transformed counterparts. The contributions of the proposed method are twofold: 1) more comprehensive relationships between attributes are captured by introducing a GCN with semantic features into the multi-label attribute classification task for anime illustrations; 2) more accurate captions of anime illustrations can be generated by a model trained using only real-world images. To the best of our knowledge, this is the first work dealing with multi-label attribute classification in anime illustrations. Experimental results demonstrate the effectiveness of the proposed method in comparison with existing methods, including state-of-the-art approaches.
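The abstract does not specify the exact architecture, so the following is only a minimal sketch of the general ML-GCN-style design it alludes to: a GCN over a label co-occurrence graph maps semantic label embeddings to per-attribute classifier weights, which are then applied to global image features. All names here (`GraphConvolution`, `MultiLabelGCNClassifier`, dimensions, the use of PyTorch) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphConvolution(nn.Module):
    """One GCN layer: H' = A_hat @ H @ W, with A_hat assumed pre-normalized."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(in_dim, out_dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        return adj @ (h @ self.weight)


class MultiLabelGCNClassifier(nn.Module):
    """Hypothetical multi-label attribute classifier in the ML-GCN style.

    A two-layer GCN propagates semantic label embeddings (e.g., word
    vectors of attribute names, or caption-derived features) over a label
    co-occurrence graph, producing one classifier vector per attribute.
    Logits are the dot products of image features with these vectors.
    """

    def __init__(self, label_emb_dim: int, img_feat_dim: int, hidden_dim: int = 512):
        super().__init__()
        self.gc1 = GraphConvolution(label_emb_dim, hidden_dim)
        self.gc2 = GraphConvolution(hidden_dim, img_feat_dim)

    def forward(self, img_feats, label_embs, adj):
        # img_feats:  (batch, img_feat_dim) global features from an image backbone
        # label_embs: (num_labels, label_emb_dim) semantic attribute embeddings
        # adj:        (num_labels, num_labels) normalized co-occurrence matrix
        w = F.leaky_relu(self.gc1(label_embs, adj))
        w = self.gc2(w, adj)          # (num_labels, img_feat_dim)
        return img_feats @ w.t()      # (batch, num_labels) attribute logits
```

Under these assumptions, the model would be trained with a standard multi-label objective such as `nn.BCEWithLogitsLoss` over per-attribute logits; the domain-specific semantic features described in the abstract would enter through the label embeddings fed to the GCN.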