GRADIENT DECONFLICTION-BASED TRAINING FOR MULTI-EXIT ARCHITECTURES
Xinglu Wang, Yingming Li
-
SPS
IEEE Members: $11.00
Non-members: $15.00Length: 04:35
Muiti-exit architectures, in which a sequence of intermediate classifiers are introduced at different depths of the feature layers, perform adaptive computation by early exiting "easy" samples to speed up the inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflicting between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CIFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of the state-of-the-art multi-exit neural networks. Moreover, this method does not require within architecture modifications and can be effectively combined with other previously-proposed training techniques and further boosts the performance.