  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 08:25
27 Oct 2020

Visual Question Answering (VQA) is the task of answering questions about the visual contents of an image. Recently, a number of studies have pointed out that VQA models tend to be misled by dataset biases and rely heavily on superficial correlations between question and answer rather than truly understanding the visual contents. To address this issue, we propose a visual calibration mechanism for VQA (VC-VQA), which extends the conventional VQA model with an additional image feature reconstruction module. The proposed model reconstructs image features from the predicted answer together with the question, and measures the similarity between the reconstructed image feature and the original image feature; this similarity guides the VQA model in predicting the final answer. We evaluate our model on both the VQA v1 and VQA v2 datasets, showing that VC-VQA effectively reduces the impact of dataset bias and achieves competitive performance compared to other mainstream methods.
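
The calibration idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify the reconstruction module, the similarity measure, or how the similarity is combined with the answer prediction. The element-wise product used for reconstruction, the cosine similarity, the linear mixing rule, and every name below (`reconstruct_image_feature`, `calibrate`, `weight`) are assumptions chosen only to make the idea concrete.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def reconstruct_image_feature(question_vec, answer_vec):
    """Hypothetical stand-in for a learned reconstruction module.

    Assumption: an element-wise product of question and answer embeddings.
    A real model would use a trained network here.
    """
    return [q * a for q, a in zip(question_vec, answer_vec)]


def calibrate(answer_scores, answer_vecs, question_vec, image_vec, weight=0.5):
    """Rescore each candidate answer using visual consistency.

    Assumption: the final score is a linear mix of the original answer
    score and the similarity between reconstructed and original features.
    """
    calibrated = {}
    for answer, score in answer_scores.items():
        recon = reconstruct_image_feature(question_vec, answer_vecs[answer])
        sim = cosine_similarity(recon, image_vec)
        calibrated[answer] = (1 - weight) * score + weight * sim
    return calibrated


# Toy data: the question-only prior favours "yes", but only the feature
# reconstructed from "no" matches the image feature.
question_vec = [1.0, 1.0, 1.0]
image_vec = [0.0, 1.0, 0.0]
answer_vecs = {"yes": [1.0, 0.0, 0.0], "no": [0.0, 1.0, 0.0]}
answer_scores = {"yes": 0.6, "no": 0.4}

calibrated = calibrate(answer_scores, answer_vecs, question_vec, image_vec)
best_answer = max(calibrated, key=calibrated.get)
print(best_answer, calibrated)
```

In this toy setup the biased prior alone would pick "yes", while the visually calibrated score prefers "no" because its reconstructed feature agrees with the image feature.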
