
Two-Stream Hybrid Attention Network For Multimodal Classification

Qipin Chen, Zhenyu Shi, Zhen Zuo, Jinmiao Fu, Yi Sun

20 Sep 2021

On modern e-commerce platforms like Amazon, the number of products is growing fast, and precise, efficient product classification becomes a key lever for a great customer shopping experience. A major challenge in tackling the large-scale product classification problem is how to leverage multimodal product information (e.g., image, text). One of the most successful directions is attention-based deep multimodal learning, where there are two main types of frameworks: 1) keyless attention, which learns the importance of features within each modality; and 2) key-based attention, which learns the importance of features using other modalities. In this paper, we propose a novel Two-stream Hybrid Attention Network (HANet), which leverages both key-based and keyless attention mechanisms to capture the key information across the product image and title modalities. We experimentally show that our HANet achieves state-of-the-art performance on an Amazon-scale product classification problem.
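As a rough illustration of the two attention styles the abstract contrasts, the sketch below shows a keyless attention module (weights features within one modality) and a key-based attention module (weights one modality's features using a query derived from the other), fused by concatenation for classification. This is a minimal PyTorch-style sketch under assumed shapes and module names; it is not the authors' HANet implementation, and the concatenation-based fusion head is an assumption for illustration only.

```python
# Illustrative sketch only -- not the authors' HANet code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeylessAttention(nn.Module):
    """Learns importance weights for features within a single modality."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, feats):                            # feats: (batch, seq, dim)
        weights = F.softmax(self.score(feats), dim=1)     # (batch, seq, 1)
        return (weights * feats).sum(dim=1)               # (batch, dim)

class KeyBasedAttention(nn.Module):
    """Weights one modality's features using a query from the other modality."""
    def __init__(self, dim):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)
        self.key_proj = nn.Linear(dim, dim)

    def forward(self, query, feats):                      # query: (batch, dim)
        q = self.query_proj(query).unsqueeze(1)           # (batch, 1, dim)
        k = self.key_proj(feats)                          # (batch, seq, dim)
        scores = (q * k).sum(-1, keepdim=True) / feats.size(-1) ** 0.5
        weights = F.softmax(scores, dim=1)
        return (weights * feats).sum(dim=1)               # (batch, dim)

class TwoStreamHybridFusion(nn.Module):
    """Toy fusion head: keyless + key-based streams over image and title features."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.keyless_img = KeylessAttention(dim)
        self.keyless_txt = KeylessAttention(dim)
        self.img_attended_by_txt = KeyBasedAttention(dim)
        self.txt_attended_by_img = KeyBasedAttention(dim)
        self.classifier = nn.Linear(4 * dim, num_classes)

    def forward(self, img_feats, txt_feats):              # both: (batch, seq, dim)
        img_self = self.keyless_img(img_feats)
        txt_self = self.keyless_txt(txt_feats)
        img_cross = self.img_attended_by_txt(txt_self, img_feats)
        txt_cross = self.txt_attended_by_img(img_self, txt_feats)
        fused = torch.cat([img_self, txt_self, img_cross, txt_cross], dim=-1)
        return self.classifier(fused)
```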

