IMAGE RETRIEVAL WITH LINGUAL AND VISUAL PARAPHRASING VIA GENERATIVE MODELS
Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama
A new approach that improves text-based image retrieval (hereinafter referred to as TBIR) performance is proposed in this paper. TBIR methods aim to retrieve a desired image related to a query text. In particular, recent TBIR methods allow us to retrieve images that reflect word relationships by using a whole sentence as a query. In these TBIR methods, it is necessary to uniquely identify the desired image among similar images using a single query sentence. However, the diversity of expressive styles available for a query sentence makes it difficult to uniquely identify the desired image. In this paper, we propose a novel TBIR method with paraphrasing on multiple representation spaces. Specifically, by paraphrasing a query sentence in both lingual and visual representation spaces, the proposed method retrieves the desired image from multiple perspectives and can thus uniquely identify it among similar images. Comprehensive experimental results show the effectiveness of the proposed method.
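As an illustration of how such multi-perspective retrieval might be scored, the sketch below fuses similarities between a gallery of candidate images and (i) the original query sentence, (ii) its lingual paraphrases, and (iii) its visual paraphrases, all embedded in a shared text-image space. This is a minimal sketch under stated assumptions, not the authors' implementation: the shared CLIP-style embedding space, the mean pooling over paraphrases, and the equal-weight fusion are choices made purely for illustration, and the paraphrase generators themselves are outside the snippet.

# Minimal sketch (not the authors' implementation): score gallery images
# against a query paraphrased in lingual and visual representation spaces.
# Assumes all embeddings come from a shared text-image space (e.g., a
# CLIP-style encoder); the generative paraphrasing models are not shown.

import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    """L2-normalize rows so that dot products equal cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def fused_retrieval_scores(
    query_emb: np.ndarray,          # (d,)   embedding of the original query sentence
    lingual_para_embs: np.ndarray,  # (L, d) embeddings of generated paraphrase sentences
    visual_para_embs: np.ndarray,   # (V, d) embeddings of generated paraphrase images
    gallery_embs: np.ndarray,       # (N, d) embeddings of candidate gallery images
) -> np.ndarray:
    """Score each gallery image from several perspectives and fuse the scores."""
    gallery = normalize(gallery_embs)

    # Perspective 1: original query sentence vs. gallery images.
    s_query = gallery @ normalize(query_emb[None, :])[0]                 # (N,)

    # Perspective 2: lingual paraphrases vs. gallery images (mean over paraphrases).
    s_lingual = (gallery @ normalize(lingual_para_embs).T).mean(axis=1)  # (N,)

    # Perspective 3: visual paraphrases vs. gallery images (mean over paraphrases).
    s_visual = (gallery @ normalize(visual_para_embs).T).mean(axis=1)    # (N,)

    # Simple fusion: equal-weight average of the three perspectives (an assumption).
    return (s_query + s_lingual + s_visual) / 3.0


if __name__ == "__main__":
    # Toy usage with random embeddings in place of real encoder outputs.
    rng = np.random.default_rng(0)
    d, n = 512, 100
    scores = fused_retrieval_scores(
        query_emb=rng.normal(size=d),
        lingual_para_embs=rng.normal(size=(5, d)),
        visual_para_embs=rng.normal(size=(3, d)),
        gallery_embs=rng.normal(size=(n, d)),
    )
    print("Top-5 retrieved gallery indices:", np.argsort(-scores)[:5])

In practice, candidate images whose scores agree across all three perspectives are the ones most likely to be the uniquely intended target, which is the intuition behind paraphrasing the query on multiple representation spaces.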