
Quantization of weight parameters in neural network training plays a key role in model compression for mobile devices. This paper presents a general M-ary adaptive quantization for constructing Bayesian neural networks, in which the trade-off between model capacity and memory cost is adjustable and the stochastic weight parameters are faithfully reflected. A compact model is trained to be robust to the model uncertainty that arises from heterogeneous data collection. To minimize the performance loss, the representation levels of the quantized neural network are estimated by maximizing the variational lower bound of the log likelihood conditioned on M-ary quantization. Bayesian learning is formulated with a multi-spike-and-slab prior over the quantization levels. The resulting adaptive quantization provides a flexible parameter space for representation learning and is applied to object recognition. Experiments on image recognition show the merit of this Bayesian model compression for M-ary quantized neural networks.
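
The abstract refers to maximizing a variational lower bound on the log likelihood under a multi-spike-and-slab prior. As a hedged sketch of the general form (the symbols q, w, D, pi, mu, and sigma below are generic illustrations, not the paper's notation), the bound and one common multi-spike-and-slab construction, with a spike at zero and Gaussian slabs centered at the nonzero quantization levels, can be written as:

```latex
% Generic variational lower bound (ELBO) on the log likelihood:
\mathcal{L}(q) \;=\; \mathbb{E}_{q(\mathbf{w})}\!\left[\log p(\mathcal{D}\mid\mathbf{w})\right]
\;-\; \mathrm{KL}\!\left(q(\mathbf{w})\,\middle\|\,p(\mathbf{w})\right)

% One common multi-spike-and-slab prior: a spike at zero plus Gaussian
% slabs centered at the M-1 nonzero quantization levels \mu_m:
p(w_i) \;=\; \pi_0\,\delta(w_i)
\;+\; \sum_{m=1}^{M-1} \pi_m\,\mathcal{N}\!\left(w_i \mid \mu_m,\, \sigma_m^2\right),
\qquad \sum_{m=0}^{M-1}\pi_m = 1
```

Maximizing the first term fits the data; the KL term pulls the approximate posterior toward the multi-modal prior, which concentrates the weights around the M representation levels.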
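To make the idea of estimating shared representation levels concrete, here is a minimal NumPy sketch of M-ary weight quantization. It is not the paper's algorithm: a k-means-style (Lloyd) update of the levels stands in for the variational estimation described above, and the function name m_ary_quantize and all parameters are hypothetical.

```python
import numpy as np

def m_ary_quantize(weights, M=4, iters=20, rng=None):
    """Quantize a weight array to M shared representation levels.

    A k-means-style (Lloyd) update is used here as a simple stand-in
    for the variational estimation of quantization levels.
    """
    rng = np.random.default_rng(rng)
    w = weights.ravel()
    # Initialize the M levels from randomly sampled weights.
    levels = rng.choice(w, size=M, replace=False)
    for _ in range(iters):
        # Assign each weight to its nearest representation level.
        idx = np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)
        # Re-estimate each level as the mean of its assigned weights.
        for m in range(M):
            if np.any(idx == m):
                levels[m] = w[idx == m].mean()
    return levels[idx].reshape(weights.shape), levels

# Example: quantize a random weight matrix to M = 4 levels.
w = np.random.randn(64, 64).astype(np.float32)
w_q, levels = m_ary_quantize(w, M=4, rng=0)
print("levels:", np.sort(levels))
print("mean squared quantization error:", float(np.mean((w - w_q) ** 2)))
```

Varying M in such a sketch exposes the capacity/memory trade-off the abstract mentions: larger M lowers the quantization error but requires more bits per weight.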