A SIMPLE HYBRID FILTER PRUNING FOR EFFICIENT EDGE INFERENCE
Shabbeer Basha Shaik Hussain, Sheethal N Gowda, Jayachandra Dakala
SPS
Length: 00:16:38
Convolutional Neural Networks (CNNs) have been used extensively to solve many computer vision problems. However, their high memory and compute requirements limit the deployment of these models on edge devices. Many embedded-friendly models, such as MobileNet, ShuffleNet, and SqueezeNet, have been proposed to serve this purpose, but they are still not compact enough to deploy on edge devices. Popular metric-based pruning methods, which aim to remove insignificant and redundant filters, achieve only limited compression on embedded-friendly models such as MobileNet. In this paper, we propose a novel hybrid filter pruning method that prunes both redundant and insignificant filters at the same time. Additionally, we design custom regularizers that enable us to prune additional filters from convolution layers. Pruning experiments are conducted on a MobileNetv1-based Single-Shot Object Detector (SSD) for the face detection problem. Through our experiments, we prune 40.11% of the parameters and reduce 67.03% of the FLOPs of MobileNetv1 with only a small drop in model performance (1.67 mAP on MS COCO). On an ARM-based edge device, inference time is reduced from 198 ms to 84 ms.
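To make the hybrid criterion concrete, the sketch below combines the two kinds of filter pruning the abstract describes: dropping insignificant filters (lowest L1 norm) and dropping redundant filters (near-duplicates of a stronger kept filter, measured here by cosine similarity). This is a minimal numpy illustration, not the paper's exact method; the function name, the similarity measure, and both thresholds are hypothetical assumptions for illustration.

```python
import numpy as np

def hybrid_prune_mask(filters, insignificance_frac=0.2, redundancy_thresh=0.95):
    """Return a boolean keep-mask over conv filters stacked as [n_filters, ...].

    Hypothetical hybrid criterion: (1) drop the lowest-L1 fraction of filters
    as insignificant, then (2) among the rest, drop any filter whose flattened
    weights have cosine similarity above `redundancy_thresh` with an
    already-kept filter, i.e. it is redundant.
    """
    n = filters.shape[0]
    flat = filters.reshape(n, -1)
    l1 = np.abs(flat).sum(axis=1)

    # Step 1 -- insignificant filters: remove the lowest-L1 fraction.
    k = int(n * insignificance_frac)
    keep = np.ones(n, dtype=bool)
    keep[np.argsort(l1)[:k]] = False

    # Step 2 -- redundant filters: high cosine similarity with a kept filter.
    unit = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    kept_so_far = []
    for i in np.argsort(-l1):  # visit the strongest filters first
        if not keep[i]:
            continue
        if any(abs(unit[i] @ unit[j]) > redundancy_thresh for j in kept_so_far):
            keep[i] = False  # near-duplicate of a stronger filter
        else:
            kept_so_far.append(i)
    return keep
```

In a real pipeline the mask would be applied per convolution layer (removing the corresponding output channels and the matching input channels of the next layer), followed by fine-tuning to recover accuracy.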