Hyper-Spectral Imaging For Overlapping Plastic Flakes Segmentation
Guillem Martínez, Maya Aghaei, Martin Dijkstra, Bhalaji Nagarajan, Femke Jaarsma, Jaap van de Loosdrecht, Petia Radeva, Klaas Dijkstra
Traditional VQA models tend to rely on language priors as a shortcut to answer questions while neglecting visual information. To address this problem, recent approaches divide language priors into "good" language context and "bad" language bias using global features, reinforcing the context while suppressing the bias. However, global features are too coarse to separate language priors precisely. In this paper, we propose a novel Context Relation Fusion Model (CRFM), which produces comprehensive contextual features that force the VQA model to distinguish language priors more carefully into "good" language context and "bad" language bias. Specifically, we use a Visual Relation Fusion Model (VRFM) and a Question Relation Fusion Model (QRFM) to learn locally critical contextual information, and then enhance this information through an Attended Features Fusion Model (AFFM). Experiments show that CRFM achieves state-of-the-art performance on the VQA-CP v2 dataset.
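The abstract does not specify the internals of the VRFM, QRFM, or AFFM, but the overall fusion pipeline can be pictured with a minimal PyTorch sketch. Everything below is an assumption, not the paper's method: the class names (RelationFusion, AttendedFeaturesFusion, CRFMSketch), the choice of multi-head self- and cross-attention as the fusion mechanism, and all dimensions are hypothetical stand-ins for the unspecified module designs.

import torch
import torch.nn as nn


class RelationFusion(nn.Module):
    """Hypothetical stand-in for VRFM/QRFM: self-attention over local
    features to surface critical contextual information."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(x, x, x)  # relate each region/token to all others
        return self.norm(x + attended)    # residual keeps the original signal


class AttendedFeaturesFusion(nn.Module):
    """Hypothetical stand-in for AFFM: cross-attends question context
    onto visual context, then pools to a joint representation."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, v: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
        fused, _ = self.cross(q, v, v)            # question queries the visual context
        return self.norm(q + fused).mean(dim=1)   # pooled joint feature (B, dim)


class CRFMSketch(nn.Module):
    """End-to-end sketch: relation-fusion analogues produce local
    contextual features; a fusion analogue enhances and combines them
    before answer classification."""
    def __init__(self, dim: int = 512, num_answers: int = 3129):
        super().__init__()
        self.vrfm = RelationFusion(dim)
        self.qrfm = RelationFusion(dim)
        self.affm = AttendedFeaturesFusion(dim)
        self.classifier = nn.Linear(dim, num_answers)

    def forward(self, visual_feats: torch.Tensor,
                question_feats: torch.Tensor) -> torch.Tensor:
        v = self.vrfm(visual_feats)     # (B, regions, dim)
        q = self.qrfm(question_feats)   # (B, tokens, dim)
        return self.classifier(self.affm(v, q))


# Shape-check with dummy inputs (36 region features, 14 question tokens
# are illustrative sizes, not values from the paper):
model = CRFMSketch()
v = torch.randn(2, 36, 512)
q = torch.randn(2, 14, 512)
logits = model(v, q)  # (2, 3129) answer scores

The residual connections in both fusion blocks reflect a common design choice when enhancing rather than replacing features, which matches the abstract's framing of the AFFM as performing "information enhancement"; the actual modules may differ substantially.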