A GENERAL FRAMEWORK FOR INCOMPLETE CROSS-MODAL RETRIEVAL WITH MISSING LABELS AND MISSING MODALITIES
Mingyang Li, Shao-Lun Huang, Lin Zhang
SPS
Among cross-modal retrieval methods, supervised methods achieve the best performance by exploiting semantic labels. In realistic applications, however, the data are often incomplete: labels may be missing, and some samples may lack one or more modalities, which makes these methods hard to apply. In this paper, we propose a general framework for cross-modal retrieval with both missing labels and missing modalities. Specifically, we embed the data of each modality and the labels into a common feature space and jointly maximize their correlation. When labels or the data of some modalities are missing, we can still maximize the correlation among the remaining data and labels. Combined with label prediction and data reconstruction modules, our model effectively extracts useful information from incomplete data for cross-modal retrieval. In extensive experiments, our model outperforms many other methods on different datasets, demonstrating its effectiveness and flexibility in handling incomplete data.
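The idea of maximizing correlation only over the data that are actually present can be illustrated with a small sketch. This is not the paper's objective: the abstract does not specify the correlation measure, so the code below assumes a simple stand-in (the mean Pearson correlation between matching embedding dimensions), and the names `pair_correlation`, `masked_correlation`, `embeddings`, and `present` are hypothetical. Each view (a modality or the labels) is assumed to be already embedded into the common space, with a boolean mask marking which samples are available.

```python
import itertools
import numpy as np


def pair_correlation(a, b, eps=1e-8):
    """Mean Pearson correlation between matching dimensions of a and b.

    Assumed stand-in for the correlation measure; a and b are (n, d) arrays.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0)) + eps
    return float((num / den).mean())


def masked_correlation(embeddings, present):
    """Sum pairwise correlations over samples observed in both views.

    embeddings: dict mapping view name -> (n, d) array in the common space.
    present:    dict mapping view name -> (n,) boolean availability mask.
    Rows marked absent are never read, so they may hold any placeholder.
    """
    total = 0.0
    for u, v in itertools.combinations(sorted(embeddings), 2):
        both = present[u] & present[v]
        if both.sum() < 2:
            # Correlation is undefined with fewer than two shared samples.
            continue
        total += pair_correlation(embeddings[u][both], embeddings[v][both])
    return total


# Toy usage: three views of 6 samples; text and labels are partly missing.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 3))
embeddings = {"image": z.copy(), "text": z.copy(), "label": z.copy()}
present = {
    "image": np.ones(6, dtype=bool),
    "text": np.array([True, True, True, True, False, False]),
    "label": np.array([False, True, True, True, True, True]),
}
score = masked_correlation(embeddings, present)
```

In a training loop this quantity would be maximized with respect to the embedding networks; samples missing a label or a modality simply drop out of the pairs that involve the missing view while still contributing to the others.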