Efficient Feature Extraction for Non-Maximum Suppression in Visual Person Detection
Charalampos Symeonidis (Aristotle University of Thessaloniki); Ioannis Mademlis (Department of Informatics, Aristotle University of Thessaloniki); Ioannis Pitas (Aristotle University of Thessaloniki); Nikolaos Nikolaidis (Aristotle University of Thessaloniki)
SPS
Non-Maximum Suppression (NMS) is a post-processing step in almost every visual object detector, tasked with rapidly pruning overlapping candidate rectangular Regions-of-Interest (RoIs) and replacing them with a single, more spatially accurate detection (in pixel coordinates). The common Greedy NMS algorithm has drawbacks, owing to its need for careful manual tuning. In visual person detection, most NMS methods struggle in crowded scenes with heavy occlusion between people. This paper proposes a modification to a deep neural architecture for NMS, suitable for such cases and capable of cooperating efficiently with recent neural object detectors. The method approaches the NMS problem as a rescoring task, aiming ideally to assign exactly one detection per object. The proposed modification extracts RoI representations, which semantically capture each region's visual appearance, from information-rich feature maps computed by the detector's intermediate layers. Experimental evaluation on two common public person detection datasets shows improved accuracy over competing methods, with acceptable inference speed.
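For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard Greedy NMS procedure, not the neural rescoring method proposed in the paper. The function names (`iou`, `greedy_nms`), the `(x1, y1, x2, y2)` box layout, the example boxes, and the threshold values are all illustrative assumptions rather than details taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of the kept boxes, in descending score order.

    A box is kept only if its IoU with every higher-scoring kept box
    does not exceed the manually chosen threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: box 1 nearly duplicates box 0, box 2 is a
# partially overlapping neighbour, box 3 is far away.
boxes = [(10, 10, 50, 90), (12, 12, 52, 92), (30, 10, 70, 90), (200, 10, 240, 90)]
scores = [0.9, 0.8, 0.7, 0.6]

kept_loose = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.3)
```

In this toy example box 2 overlaps box 0 with an IoU of one third, so it survives at a threshold of 0.5 but is suppressed at 0.3. If box 2 were a second, partially occluded person, the stricter setting would lose a true detection, which is the kind of threshold sensitivity in crowded scenes that the abstract describes.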