
AN EFFICIENT METHOD FOR MODEL PRUNING USING KNOWLEDGE DISTILLATION WITH FEW SAMPLES

Zhaojing Zhou, Zhuqing Jiang, Aidong Men, Haiying Wang, Yun Zhou

Length: 00:08:40
11 May 2022

Deep neural network compression methods produce small-scale networks and rely on fine-tuning to recover the accuracy lost during compression. Despite their remarkable performance, the fine-tuning procedure requires a large training dataset and is therefore time-consuming. To address this issue, few-sample knowledge distillation (FSKD) has been proposed for data efficiency. However, FSKD adds extra convolutional layers to the compressed network during training, which increases the complexity of the network structure. In this paper, we present Progressive Feature Distribution Distillation (PFDD), which surpasses FSKD without modifying the network structure. Concretely, PFDD is based on a progressive training strategy that efficiently matches feature distributions between the compressed network and the original network. It can thus exploit both external information from the samples and internal information from the network, so that a small proportion of the training dataset yields considerable results. Experiments on various datasets and architectures demonstrate that our distillation approach is remarkably efficient and effective at improving the performance of compressed networks when only a few samples are used.
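The abstract does not include implementation details, so the following is only a minimal, hypothetical sketch of the general idea it describes: progressively matching feature distributions between a pruned (student) network and the original (teacher) network using a few samples. The staging scheme, the statistics-matching loss, and helper names such as `distribution_loss` and `progressive_distill` are illustrative assumptions, not the authors' PFDD algorithm.

```python
# Hypothetical sketch of few-sample, progressive feature-distribution matching.
# The stage schedule, loss, and hyperparameters are assumptions for illustration,
# not the PFDD method from the paper.
import torch
import torch.nn.functional as F


def feature_stats(feat):
    """Per-channel mean and std of a feature map shaped (N, C, H, W)."""
    return feat.mean(dim=(0, 2, 3)), feat.std(dim=(0, 2, 3))


def distribution_loss(f_student, f_teacher):
    """Match first- and second-order feature statistics between networks."""
    ms, ss = feature_stats(f_student)
    mt, st = feature_stats(f_teacher)
    return F.mse_loss(ms, mt) + F.mse_loss(ss, st)


def progressive_distill(student, teacher, stages, few_sample_loader,
                        lr=1e-3, epochs_per_stage=5):
    """Train the pruned student stage by stage so its intermediate feature
    distributions match the teacher's, using only a small sample set.

    `stages` is a list of (student_module, teacher_module) pairs ordered from
    shallow to deep; forward hooks capture their outputs.
    """
    teacher.eval()
    student.train()
    for depth in range(1, len(stages) + 1):
        active = stages[:depth]  # progressively include deeper layers
        params = [p for m_s, _ in active for p in m_s.parameters()]
        opt = torch.optim.Adam(params, lr=lr)
        for _ in range(epochs_per_stage):
            for x, _ in few_sample_loader:
                feats_s, feats_t, hooks = [], [], []
                for m_s, m_t in active:
                    hooks.append(m_s.register_forward_hook(
                        lambda m, i, o: feats_s.append(o)))
                    hooks.append(m_t.register_forward_hook(
                        lambda m, i, o: feats_t.append(o)))
                student(x)
                with torch.no_grad():
                    teacher(x)
                loss = sum(distribution_loss(fs, ft)
                           for fs, ft in zip(feats_s, feats_t))
                opt.zero_grad()
                loss.backward()
                opt.step()
                for h in hooks:
                    h.remove()
    return student
```

In this sketch, only the layers included in the current stage are optimized, and deeper stages are added progressively; how PFDD actually schedules stages and defines the distribution-matching objective is described in the paper itself.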
