GENERATIVE ADVERSARIAL NETWORK INCLUDING REFERRING IMAGE SEGMENTATION FOR TEXT-GUIDED IMAGE MANIPULATION
Yuto Watanabe, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama
This paper proposes a novel generative adversarial network to improve the performance of image manipulation using natural language descriptions that contain desired attributes. Text-guided image manipulation aims to semantically manipulate an image in accordance with the text description while preserving text-irrelevant regions. To achieve this, we introduce referring image segmentation into the generative adversarial network for image manipulation. Referring image segmentation generates a segmentation mask that extracts the text-relevant region. By utilizing the feature map of this segmentation mask in the network, the proposed method explicitly distinguishes the text-relevant and text-irrelevant regions and offers the following two contributions. First, our model attends only to the text-relevant region and manipulates it in line with the text description. Second, our model achieves an appropriate balance between generating accurate attributes in the text-relevant region and reconstructing the text-irrelevant regions. Experimental results show that the proposed method significantly improves the performance of image manipulation.
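To make the mask-guided idea concrete, below is a minimal PyTorch sketch, not the authors' released code: all names (MaskGuidedGenerator, edit_head, recon_head) are hypothetical, and the backbone is reduced to single convolutions. It illustrates one plausible reading of the abstract, where a referring-segmentation mask gates the generator's output between a text-conditioned edit of the text-relevant region and a reconstruction of the text-irrelevant regions.

```python
import torch
import torch.nn as nn

class MaskGuidedGenerator(nn.Module):
    """Hypothetical sketch: blend a text-conditioned edit with a
    reconstruction, gated by a referring-segmentation mask."""

    def __init__(self, img_channels: int = 3, text_dim: int = 256, feat_dim: int = 64):
        super().__init__()
        # Placeholder encoder/decoders standing in for the real backbone.
        self.encoder = nn.Conv2d(img_channels, feat_dim, 3, padding=1)
        self.edit_head = nn.Conv2d(feat_dim + text_dim, img_channels, 3, padding=1)
        self.recon_head = nn.Conv2d(feat_dim, img_channels, 3, padding=1)

    def forward(self, image, text_emb, mask):
        # image: (B, 3, H, W); text_emb: (B, text_dim);
        # mask: (B, 1, H, W) in [0, 1], 1 = text-relevant region.
        feat = self.encoder(image)
        b, _, h, w = feat.shape
        # Broadcast the sentence embedding to every spatial location.
        text_map = text_emb[:, :, None, None].expand(b, -1, h, w)
        # Text-conditioned edit branch and text-free reconstruction branch.
        edited = torch.tanh(self.edit_head(torch.cat([feat, text_map], dim=1)))
        recon = torch.tanh(self.recon_head(feat))
        # Mask-gated blend: manipulate the text-relevant region,
        # reconstruct (preserve) the text-irrelevant regions.
        return mask * edited + (1.0 - mask) * recon

# Usage example with random tensors.
gen = MaskGuidedGenerator()
img = torch.randn(2, 3, 64, 64)
txt = torch.randn(2, 256)
msk = torch.rand(2, 1, 64, 64)
out = gen(img, txt, msk)  # (2, 3, 64, 64)
```

The explicit blend makes the trade-off in the abstract visible: the generator's capacity inside the mask is spent on producing the described attributes, while outside the mask the output is tied to reconstructing the input image.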