Distilling DETR-Like Detectors With Instance-Aware Feature

Honglie Wang, Jian Xu, Shouqian Sun

SPS

Length: 00:05:28

03 Oct 2022

Human pose estimation (HPE) is a long-standing yet challenging task in computer vision. The problem requires comprehensive global contextual reasoning among joints at different locations. In this work, we explore how to incorporate two popular and effective concepts, self-attention and graph neural networks (GNNs), to model long-range information in HPE. We study three ways to implement self-attention on 3D feature maps; the channel-position version achieves the best result. Accuracy is further improved by refining the queries with an efficient channel-wise parallel GNN that explicitly models the graphical relationships among human joints. The approach improves prediction accuracy over strong baseline models and achieves state-of-the-art results.
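As a rough illustration of what self-attention over a 3D feature map can look like, the sketch below applies plain scaled dot-product attention across the spatial positions of a (C, H, W) array. It is not the authors' implementation: the abstract does not define its three variants or the channel-position version, so the function name `position_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are assumptions chosen only for this example.

```python
import numpy as np

def position_self_attention(feat, w_q, w_k, w_v):
    """Generic self-attention across spatial positions of a (C, H, W) map.

    Hypothetical helper for illustration; not taken from the paper.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T            # (N, C): one token per position
    q = tokens @ w_q                             # (N, D) queries
    k = tokens @ w_k                             # (N, D) keys
    v = tokens @ w_v                             # (N, C) values
    scores = q @ k.T / np.sqrt(q.shape[1])       # (N, N) pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    out = weights @ v                            # every position mixes all others
    return out.T.reshape(c, h, w), weights

# Toy usage with random data (shapes are arbitrary assumptions).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))            # C=8 channels, 4x4 grid
w_q = rng.standard_normal((8, 4))
w_k = rng.standard_normal((8, 4))
w_v = rng.standard_normal((8, 8))
out, weights = position_self_attention(feat, w_q, w_k, w_v)
```

Each row of `weights` is a distribution over all H×W positions, which is how an attention layer lets a feature at one joint location draw on evidence from distant locations.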

Tags:

International Conference on Image Processing

IEEE ICIP 2022
