
Compressing Cross-Domain Representation via Lifelong Knowledge Distillation

Fei Ye (University of York); Adrian Bors (University of York)

07 Jun 2023

Most Knowledge Distillation (KD) approaches focus on transferring discriminative information and assume that the data is provided in batches during training. In this paper, we address a more challenging scenario in which different tasks are presented sequentially, at different times, and the learning goal is to transfer the generative factors of visual concepts learned by a Teacher module to a compact latent space represented by a Student module. To achieve this, we develop a new Lifelong Knowledge Distillation (LKD) framework in which we train an infinite mixture model as the Teacher, which automatically increases its capacity to deal with a growing number of tasks. To ensure a compact architecture and to avoid forgetting, we propose to measure the relevance of the knowledge from a new task to the set of experts making up the Teacher module, guiding each expert to capture the probabilistic characteristics of several similar domains. The network architecture is expanded only when an entirely different task is learned. The Student is implemented as a lightweight probabilistic generative model. The experiments show that LKD can train a compressed Student module that achieves state-of-the-art results with fewer parameters.
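The abstract outlines two mechanisms: a relevance check that decides whether a new task can be absorbed by an existing Teacher expert or requires expanding the mixture, and a distillation step that compresses the selected expert's knowledge into a lightweight Student generative model. The sketch below illustrates both in PyTorch under stated assumptions: the experts and the Student are small VAEs, relevance is scored by the mean ELBO of new-task samples under each expert, and SmallVAE, mean_elbo, distill_step, the threshold tau and all hyperparameters are hypothetical illustrations, not the paper's actual criterion or architecture.

# Minimal sketch of expert selection/expansion and Teacher-to-Student
# distillation in a lifelong KD setting. Names, thresholds and objectives
# here are illustrative assumptions; the abstract does not specify the
# paper's exact relevance measure or losses.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallVAE(nn.Module):
    """Tiny VAE used both as a Teacher expert and as the compact Student."""
    def __init__(self, x_dim=784, z_dim=16, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z), mu, logvar


def mean_elbo(model, x):
    """Average evidence lower bound per sample (higher = better domain fit)."""
    with torch.no_grad():
        x_hat, mu, logvar = model(x)
        rec = -F.mse_loss(x_hat, x, reduction="none").sum(dim=1)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)
        return (rec - kl).mean().item()


class TeacherMixture:
    """Mixture of experts that grows only when a task looks entirely new."""
    def __init__(self, tau=-200.0):  # tau: assumed novelty threshold
        self.experts = []
        self.tau = tau

    def assign_or_expand(self, task_batch):
        if self.experts:
            scores = [mean_elbo(e, task_batch) for e in self.experts]
            best = max(range(len(scores)), key=scores.__getitem__)
            if scores[best] >= self.tau:
                # Task is probabilistically similar to a known domain:
                # reuse (and keep training) the best-matching expert.
                return self.experts[best]
        # No sufficiently relevant expert: expand the Teacher's capacity.
        expert = SmallVAE()
        self.experts.append(expert)
        return expert


def distill_step(student, expert, x, opt, beta=0.01):
    """One KD step: the Student imitates the expert's reconstructions."""
    with torch.no_grad():
        target, _, _ = expert(x)
    pred, mu, logvar = student(x)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    loss = F.mse_loss(pred, target) + beta * kl
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

In such a sketch, a training loop would call assign_or_expand once per incoming task, train the returned expert on that task's data, and then run distill_step on samples drawn from (or generated by) the experts so that a single compact Student covers all domains seen so far.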
