Mixed Sample Augmentation for Online Distillation
Yiqing Shen (Johns Hopkins University)
SPS
Mixed Sample Regularization (MSR) is a powerful data augmentation strategy for improving the generalization of convolutional neural networks. Previous empirical analyses have shown that MSR and conventional offline Knowledge Distillation (KD) yield orthogonal performance gains: student networks are further enhanced when MSR is applied during the training stage of sequential distillation. Yet the interplay between MSR and online knowledge distillation, where an ensemble of peer students learns mutually from one another, remains unexplored. To bridge this gap, we make the first attempt to incorporate CutMix into online distillation, where we empirically observe a significant improvement. Encouraged by this result, we propose an even stronger MSR tailored to online distillation, named CutnMix. Furthermore, we design a novel online distillation framework upon CutnMix that enhances distillation with feature-level mutual learning and a self-ensemble teacher.
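For readers unfamiliar with the baseline augmentation, the standard CutMix operation referenced above can be sketched as follows. This is a minimal NumPy illustration of the generic technique (not the paper's CutnMix variant, whose details are not given here); the function name and signature are ours for exposition. A patch from one image is pasted into another, and the labels are mixed in proportion to the pasted area, with the mixing ratio drawn from a Beta distribution.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, alpha=1.0, rng=np.random):
    """Paste a random rectangular patch of img_b into img_a.

    img_a, img_b: (H, W, C) arrays; label_a, label_b: one-hot vectors.
    Returns the mixed image and the area-weighted soft label.
    """
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)              # target mixing ratio
    cut_h = int(h * np.sqrt(1.0 - lam))       # patch sized so its area
    cut_w = int(w * np.sqrt(1.0 - lam))       # is roughly (1 - lam) * H * W
    cy, cx = rng.randint(h), rng.randint(w)   # random patch center
    r0, r1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    c0, c1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[r0:r1, c0:c1] = img_b[r0:r1, c0:c1]
    # Recompute lam from the actual (possibly boundary-clipped) patch area.
    lam = 1.0 - (r1 - r0) * (c1 - c0) / (h * w)
    soft_label = lam * label_a + (1.0 - lam) * label_b
    return mixed, soft_label
```

In an online distillation setting, each peer student would receive such mixed samples while also matching its peers' predictions, which is the combination the abstract reports as beneficial.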