CORSD: Class-Oriented Relational Self Distillation
Muzhou Yu (Xi'an Jiaotong University); Sia Huat Tan (Tsinghua University); Kailu Wu (Tsinghua University); Runpei Dong (Xi'an Jiaotong University); Linfeng Zhang (Tsinghua University); Kaisheng Ma (Tsinghua University)
Knowledge distillation is an effective model compression method, but existing approaches have limitations: (1) feature-based distillation does not transfer the relations among data examples; (2) relational distillation either relies on handcrafted functions for relation extraction or models inter- and intra-class relations weakly. Moreover, the feature divergence between teacher and student architectures may lead to inaccurate transfer of relational knowledge. In this work, we propose a novel training framework named Class-Oriented Relational Self Distillation (CORSD) to address these limitations. Trainable relation networks are designed to extract relations, and they enable the model to classify samples better by transferring relations from the deepest layer of the model to the shallow layers. In addition, auxiliary classifiers are introduced to make the relation networks capture class-oriented relations that benefit the classification task. Extensive experiments demonstrate the effectiveness of our method.
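Below is a minimal PyTorch sketch of the training scheme the abstract describes: a backbone split into stages, a trainable relation network and an auxiliary classifier per stage, and a loss that distills the deepest stage's relation matrix into the shallow stages. The module names (RelationNet, SelfDistillNet, corsd_loss), the MLP relation extractor, the MSE relation-matching term, and the weight alpha are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationNet(nn.Module):
    """Trainable relation extractor: pools a stage's feature map, projects it
    with a small MLP, and forms a batch-wise (B x B) relation matrix."""
    def __init__(self, in_dim, embed_dim=128):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(in_dim, embed_dim),
                                  nn.ReLU(inplace=True),
                                  nn.Linear(embed_dim, embed_dim))

    def forward(self, feat):                            # feat: (B, C, H, W)
        z = F.adaptive_avg_pool2d(feat, 1).flatten(1)   # (B, C)
        z = F.normalize(self.proj(z), dim=1)            # (B, D)
        return z @ z.t()                                # (B, B) relations

class SelfDistillNet(nn.Module):
    """Backbone split into stages; every stage gets a relation network and an
    auxiliary classifier (the deepest classifier is the main head)."""
    def __init__(self, stages, stage_dims, num_classes):
        super().__init__()
        self.stages = nn.ModuleList(stages)
        self.relations = nn.ModuleList(RelationNet(d) for d in stage_dims)
        self.classifiers = nn.ModuleList(
            nn.Linear(d, num_classes) for d in stage_dims)

    def forward(self, x):
        logits, rels = [], []
        for stage, rel, clf in zip(self.stages, self.relations,
                                   self.classifiers):
            x = stage(x)
            rels.append(rel(x))
            logits.append(clf(F.adaptive_avg_pool2d(x, 1).flatten(1)))
        return logits, rels

def corsd_loss(logits, rels, targets, alpha=1.0):
    """Cross-entropy on every classifier (making the relations class-oriented)
    plus distillation of the deepest relation matrix into the shallow ones;
    the deepest matrix is detached so knowledge flows deep -> shallow."""
    ce = sum(F.cross_entropy(l, targets) for l in logits)
    teacher_rel = rels[-1].detach()
    kd = sum(F.mse_loss(r, teacher_rel) for r in rels[:-1])
    return ce + alpha * kd

# Usage with a toy three-stage backbone on 32x32 inputs.
stages = [
    nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU()),
]
model = SelfDistillNet(stages, stage_dims=[16, 32, 64], num_classes=10)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
logits, rels = model(x)
loss = corsd_loss(logits, rels, y)
loss.backward()
```

Because the relation extractors are trained jointly with the classification losses rather than fixed in advance, the relations they capture are shaped by the class structure of the task, which is the motivation for the auxiliary classifiers in the abstract.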