PRIME KNOWLEDGE WITH LOCAL PATTERN CONSISTENCY FOR KNOWLEDGE DISTILLATION
Qiankun Tang, Jun Wang, Xiaogang Xu
Length: 00:05:40
Intermediate feature maps of a teacher model carry rich knowledge that can improve the performance of a student model. Existing works mainly focus on formulating beneficial knowledge for transfer, but ignore that different pieces of knowledge contribute unequally to the performance gain. To tackle this issue, we propose a simple Importance-based Knowledge Reweighting mechanism, which dynamically measures the importance of knowledge both spatially and channel-wise for each teacher-student pair. This reweighting scheme enables the student model to focus more on the prime knowledge. Furthermore, a local pattern consistency loss based on the Structural Similarity Index Measure (SSIM) is presented to narrow the local pattern discrepancy between teacher and student features. Extensive experiments on CIFAR-100 with various combinations of teacher and student network architectures demonstrate the effectiveness of the proposed approach.
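The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say how importance is measured, so the weighting rule below (a softmax over the squared teacher-student gap, normalized to mean 1) is purely an assumption made for illustration, as are the function names `importance_weights`, `reweighted_mse`, and `ssim_loss`, the window size, and the stabilizing constants. Only the SSIM formula itself is the standard one. This is not the authors' implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()


def importance_weights(f_t, f_s):
    """Hypothetical importance maps for features of shape (C, H, W).

    Assumption (not from the paper): positions and channels where the
    student deviates more from the teacher receive larger weights.
    Both maps are scaled so that their mean is 1.
    """
    gap = (f_t - f_s) ** 2
    c, h, w = gap.shape
    spatial = _softmax(gap.mean(axis=0).reshape(-1)).reshape(h, w) * (h * w)
    channel = _softmax(gap.mean(axis=(1, 2))) * c
    return spatial, channel


def reweighted_mse(f_t, f_s):
    """Feature-matching loss reweighted spatially and channel-wise."""
    spatial, channel = importance_weights(f_t, f_s)
    weight = channel[:, None, None] * spatial[None, :, :]
    return float(np.mean(weight * (f_t - f_s) ** 2))


def ssim_loss(f_t, f_s, win=3, c1=1e-4, c2=9e-4):
    """1 - mean SSIM computed over sliding win x win local windows.

    c1 and c2 are the usual SSIM stabilizers for a unit dynamic range;
    the choice of range and window size here is an assumption.
    """
    # Resulting shape: (C, H - win + 1, W - win + 1, win, win)
    wt = sliding_window_view(f_t, (win, win), axis=(1, 2))
    ws = sliding_window_view(f_s, (win, win), axis=(1, 2))
    mu_t, mu_s = wt.mean(axis=(-2, -1)), ws.mean(axis=(-2, -1))
    var_t, var_s = wt.var(axis=(-2, -1)), ws.var(axis=(-2, -1))
    cov = (wt * ws).mean(axis=(-2, -1)) - mu_t * mu_s
    ssim = ((2 * mu_t * mu_s + c1) * (2 * cov + c2)) / (
        (mu_t ** 2 + mu_s ** 2 + c1) * (var_t + var_s + c2)
    )
    return float(1.0 - ssim.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.standard_normal((8, 6, 6))
    student = teacher + 0.5 * rng.standard_normal((8, 6, 6))
    # An illustrative total loss; the relative weighting is arbitrary.
    print(reweighted_mse(teacher, student) + ssim_loss(teacher, student))
```

In a training setup the two terms would presumably be added to the usual task loss with tunable coefficients; the abstract does not state those coefficients, so none are suggested here.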