MovieNet-PS: A Large-Scale Person Search Dataset in the Wild
Jie Qin (Nanjing University of Aeronautics and Astronautics); Peng Zheng (NUAA, MBZUAI, Aalto University); Yichao Yan (Shanghai Jiao Tong University); Rong Quan (Nanjing University of Aeronautics and Astronautics); Xiaogang Cheng (Nanjing University of Posts and Telecommunications); Bingbing Ni (Shanghai Jiao Tong University)
Person search (PS) aims to jointly localize and identify a query person in natural, uncropped images. Existing works implicitly adopt pedestrians (with similar poses and unchanging clothing) as queries, restricting application scenarios to surveillance. This is because most PS datasets are collected from surveillance cameras with limited diversity in views, scenes, appearances, etc. In this paper, we study a more general and realistic task in the wild, where we aim to search for target persons with a much higher degree of diversity. To this end, we introduce a new PS dataset, namely MovieNet-PS, built upon an existing large-scale movie dataset. MovieNet-PS is currently the largest and most diverse PS dataset, consisting of 160K images (100K for training), 274K bounding boxes, and 3K identities. It stands out from existing counterparts in terms of two levels of diversity, i.e., scene-level and identity-level, with 92,043 scenes and significant variations in pose, clothing, scale, etc. for the same identity. To validate and make full use of the rich context information in our dataset, we propose a novel global-local context network that exploits scene and group context to boost search performance. Extensive experiments demonstrate that MovieNet-PS is more challenging and comprehensive than existing datasets, and that our approach further pushes the state of the art on this dataset by a large margin (a relative improvement of 34% in mAP). Code, models, and the dataset are available at: https://github.com/ZhengPeng7/GLCNet.
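As a rough illustration of the context-fusion idea mentioned in the abstract, the following PyTorch sketch concatenates each detected person's feature with a scene-level (global) and a group-level (local) context vector before projecting it back to the embedding dimension. Module names, dimensions, and the fusion scheme are assumptions for illustration only, not the authors' GLCNet implementation (see the linked repository for the actual code).

```python
# Hypothetical sketch of global-local context fusion for person search.
# All names and dimensions are assumptions, not the authors' GLCNet code.
import torch
import torch.nn as nn


class ContextFusion(nn.Module):
    """Fuse a person's RoI feature with scene (global) and group (local) context."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Project the concatenated [person | scene | group] feature back to feat_dim.
        self.fuse = nn.Sequential(
            nn.Linear(3 * feat_dim, feat_dim),
            nn.ReLU(inplace=True),
        )

    def forward(self, person_feat, scene_feat, group_feats):
        # person_feat: (N, D) RoI-pooled features of detected persons
        # scene_feat:  (D,)   global feature of the whole image (scene context)
        # group_feats: (N, D) features of co-occurring persons (group context)
        scene = scene_feat.unsqueeze(0).expand_as(person_feat)
        fused = torch.cat([person_feat, scene, group_feats], dim=1)
        return self.fuse(fused)


if __name__ == "__main__":
    fusion = ContextFusion(feat_dim=256)
    persons = torch.randn(5, 256)  # 5 detections in one image
    scene = torch.randn(256)       # global image embedding
    # Toy group context: mean of all detections in the same image.
    group = persons.mean(dim=0, keepdim=True).expand_as(persons)
    embeddings = fusion(persons, scene, group)
    print(embeddings.shape)  # torch.Size([5, 256])
```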