Local-Variance-based Attention for Visual Tracking
Changlun Guo, Wen Xianbin, Yuan Liming, Haixia Xu
SPS
Length: 09:42
The RoIAlign module has been incorporated into the deep tracking-by-detection framework, which can thus take the entire image as input to the convolutional layers, alleviating the high computational complexity induced by multiple proposals. Nevertheless, this also produces an ambiguous discriminative boundary between the target and the background in the feature map, which makes the subsequent target identification and localization very difficult. To solve this problem, we apply a novel local-variance-based regularization to optimize the convolutional layer. The local variance is calculated from the attention map, i.e., the average pooling of the convolutional feature map. The binary classification loss function, integrated with the local-variance-based regularization term, can therefore explicitly make the responses of the target and the background clearly distinguishable, specifically by strengthening the response of the target and weakening that of the background. Extensive experiments on large-scale benchmark data sets demonstrate that the proposed algorithm is highly comparable to other state-of-the-art methods.
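To make the two quantities named in the abstract concrete, the following is a minimal NumPy sketch of an attention map and its local variance. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which axis the "average pooling" runs over, what window size is used, or how the variance enters the loss, so the channel-wise mean, the 3 x 3 window, and the function names `attention_map` and `local_variance` are all hypothetical choices.

```python
import numpy as np


def attention_map(features):
    """Collapse a (C, H, W) feature map to (H, W) by averaging over channels.

    Assumption: the abstract's "average pooling" is read here as a mean
    across the channel axis; the paper may define it differently.
    """
    return features.mean(axis=0)


def local_variance(att, k=3):
    """Variance of the attention map inside every k x k window.

    Only fully contained windows are used, so the result has shape
    (H - k + 1, W - k + 1). The window size k is a hypothetical parameter.
    """
    windows = np.lib.stride_tricks.sliding_window_view(att, (k, k))
    return windows.var(axis=(-2, -1))


# Toy feature map: weak response on the left, strong response on the right.
demo_features = np.zeros((2, 5, 5))
demo_features[:, :, 3:] = 1.0

demo_att = attention_map(demo_features)
demo_lv = local_variance(demo_att, k=3)
```

In this toy case the local variance is zero wherever a window covers a uniform region and positive wherever a window straddles the step between the weak and strong responses, which is the kind of signal a regularizer could use to sharpen the target and background boundary. How that value is weighted and combined with the binary classification loss is not specified in the abstract and is left out of the sketch.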