Parameter quantization is crucial for model compression. This paper generalizes binary and ternary quantization to M-ary quantization for adaptive learning of quantized neural networks. To compensate for the performance loss caused by quantization, the representation values and the quantization partitions of the model parameters are jointly trained, improving the resolution of the gradients used for parameter updating; the non-differentiable quantization function in the back-propagation algorithm is handled accordingly. An asymmetric quantization is implemented so that the restriction on parameter quantization is sufficiently relaxed. The resulting M-ary quantization scheme is general and adapts to different values of M, so training of the M-ary quantized neural network (MQNN) can be tuned to balance the trade-off between system performance and memory storage. Experimental results show that MQNN achieves image classification performance comparable to a full-precision neural network (FPNN) while requiring far less memory storage.
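To make the idea concrete, below is a minimal sketch of an M-ary quantizer with learnable representation values, using a straight-through estimate to pass gradients through the non-differentiable quantization step. It assumes PyTorch; the function and variable names (MaryQuantize, thresholds, values) are illustrative, not from the paper, and the sketch omits the gradient for the partition thresholds, which the paper's method also trains jointly.

```python
# Minimal sketch, assuming PyTorch and a straight-through estimator for the
# non-differentiable quantizer; the paper's exact training rule may differ.
import torch

class MaryQuantize(torch.autograd.Function):
    """Quantize each weight to one of M learned representation values.

    Forward: map each weight to the representation value of the partition
    (interval) it falls into.  Backward: pass gradients straight through to
    the full-precision weights and accumulate them onto the representation
    values.  Gradients for the thresholds are omitted in this sketch.
    """

    @staticmethod
    def forward(ctx, w, thresholds, values):
        # thresholds: (M-1,) sorted, possibly asymmetric partition boundaries
        # values:     (M,)   learnable representation values, one per partition
        idx = torch.bucketize(w, thresholds)   # partition index in [0, M-1]
        ctx.save_for_backward(idx, values)
        return values[idx]

    @staticmethod
    def backward(ctx, grad_out):
        idx, values = ctx.saved_tensors
        # Straight-through estimate: gradient w.r.t. the weights is passed
        # through the hard quantizer unchanged.
        grad_w = grad_out.clone()
        # Gradient w.r.t. each representation value: sum of the gradients of
        # all weights assigned to that value.
        grad_values = torch.zeros_like(values).scatter_add_(
            0, idx.flatten(), grad_out.flatten())
        return grad_w, None, grad_values


# Usage: M = 4 representation values with 3 asymmetric thresholds.
w = torch.randn(8, requires_grad=True)
thresholds = torch.tensor([-0.4, 0.0, 0.5])
values = torch.tensor([-1.0, -0.3, 0.3, 1.0], requires_grad=True)
wq = MaryQuantize.apply(w, thresholds, values)
wq.sum().backward()
print(wq, w.grad, values.grad)
```

Binary (M = 2) and ternary (M = 3) quantization are recovered as special cases of this formulation, while larger M trades extra storage for higher-fidelity weights.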