LEARNING MUTUALLY IN CROWD SCENES FOR PEDESTRIAN DETECTION
Ruonan Wei, Yuehuan Wang, Jinpu Zhang
SPS
Pedestrian detection in crowded scenes is challenging because of diverse occlusion patterns and heavy overlap between individuals. To tackle this problem, we propose a mutual learning detection network. First, a self-attention mechanism achieves mutual learning between individuals by capturing similar semantics among pedestrians; the feature representation of occluded individuals is enhanced by locally fusing these similar semantics. Second, a mutual loss is designed to improve the consistency of regression and classification. Specifically, regression results are leveraged to make the classification score aware of the quality of predicted boxes, and the classification scores help the regression head accelerate the convergence of redundant boxes. Finally, we evaluate the proposed method on the MOT20 and CityPersons datasets and achieve performance comparable to the state of the art while using less data. Compared with the baseline, our detector obtains gains of 14.4% AP and 11.7% AR on the challenging MOT20 dataset.
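The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' network or loss: the abstract gives no formulas, so the code assumes plain scaled dot-product self-attention with a residual connection for the feature fusion, and an IoU soft target in a binary cross-entropy for the quality-aware classification. The names fuse_similar_features, iou, and quality_aware_bce are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_similar_features(features):
    """Assumed stand-in for the attention step: each pedestrian feature
    attends to all others (scaled dot-product similarity), and the
    attention-weighted context is added back as a residual."""
    dim = len(features[0])
    fused = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        context = [
            sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)
        ]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def quality_aware_bce(score, pred_box, gt_box, eps=1e-7):
    """Assumed form of a quality-aware classification term: binary
    cross-entropy whose target is the IoU of the regressed box, so a
    high score is rewarded only when the box is well localized."""
    target = iou(pred_box, gt_box)
    p = min(max(score, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(fuse_similar_features(feats))
    print(quality_aware_bce(0.9, (0, 0, 2, 2), (1, 1, 3, 3)))
```

Under these assumptions, a confident score attached to a poorly localized box is penalized, which is one common way to couple classification with regression quality. The reverse direction described in the abstract, where classification scores guide the regression of redundant boxes, is not modeled here.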