Learning Class Prototypes via Anisotropic Combination of Aligned Modalities for Few-Shot Learning
Jieya Lian, Haojie Wang, Shengwu Xiong
The Prototypical Network is simple and effective for few-shot learning. Recently, some works have leveraged cross-modal information to enhance class prototypes in few-shot learning. However, they seldom make use of structured information from the textual space to optimize class prototype representations, and they either do not align the textual modality with the visual modality or align them too rigidly. We argue that a proper alignment method is important for improving the performance of cross-modal methods, since query data carry only visual information in few-shot learning tasks. In this paper, we propose a cross-modal alignment method that optimizes class prototypes with structured information from the textual space. We further introduce an anisotropic combination method that enhances class prototypes with information from both modalities. Experiments show that the cross-modal alignment and anisotropic combination methods achieve state-of-the-art results on the miniImageNet and tieredImageNet benchmarks in the one-shot regime.
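To make the idea concrete, the following is a minimal PyTorch sketch of how an anisotropic (per-dimension) combination of aligned modalities could look, assuming a standard Prototypical Network backbone. All module names, the choice of a linear alignment layer, and conditioning the gate on the aligned text embedding are illustrative assumptions for this abstract, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class AnisotropicPrototypeCombiner(nn.Module):
    """Hypothetical sketch: mixes visual prototypes with textual
    embeddings aligned into the visual space, using one mixing
    coefficient per feature dimension (anisotropic), rather than a
    single scalar per class (isotropic)."""

    def __init__(self, text_dim: int, visual_dim: int):
        super().__init__()
        # Assumed alignment: project textual embeddings into the
        # visual feature space with a learned linear map.
        self.align = nn.Linear(text_dim, visual_dim)
        # Per-dimension gate in (0, 1); a scalar output here would
        # reduce this to an ordinary isotropic convex mixture.
        self.gate = nn.Sequential(
            nn.Linear(visual_dim, visual_dim),
            nn.Sigmoid(),
        )

    def forward(self, support_feats, support_labels, text_embeds, n_way):
        # Visual prototypes: per-class means of the support features,
        # as in a standard Prototypical Network.
        visual_protos = torch.stack([
            support_feats[support_labels == c].mean(dim=0)
            for c in range(n_way)
        ])                                        # (n_way, visual_dim)

        # Textual prototypes aligned into the visual space.
        text_protos = self.align(text_embeds)     # (n_way, visual_dim)

        # Anisotropic combination: elementwise convex mixture.
        lam = self.gate(text_protos)               # (n_way, visual_dim)
        return lam * visual_protos + (1.0 - lam) * text_protos
```

Queries, which carry only visual information, would then be classified against the combined prototypes exactly as in a Prototypical Network, e.g. `logits = -torch.cdist(query_feats, protos) ** 2`. The per-dimension gate lets the textual modality dominate along feature directions where one support shot is uninformative, which is one plausible reason such combinations help most in the one-shot regime.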