Factorized CRF With Batch Normalization Based On The Entire Training Data
Eran Goldman, Jacob Goldberger
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:14:57
Batch normalization (BN) is a key component of most neural network architectures. A major weakness of BN is its critical dependence on a reasonably large batch size, because the mean and variance are estimated from a single batch of data. Another weakness is the difficulty of applying BN in autoregressive or structured models. In this study we show that, for any network node obtained as a linear function of the input features, it is feasible to calculate the mean and variance using the entire training dataset rather than a single batch as in standard BN. We dub this method Full Batch Normalization (FBN). Our main focus is on a factorized autoregressive CRF model, where we show that FBN is applicable and allows BN to be integrated into the linear-chain CRF likelihood. The improved performance of FBN is illustrated on the huge SKU dataset, which contains images of retail store product displays.
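
The claim about linear nodes rests on a standard identity: if a node is z = w·x + b, its mean and variance over a dataset follow directly from the mean vector and covariance matrix of the inputs, so the input moments can be computed once and reused as the weights change. The NumPy sketch below illustrates that identity only. It is not the authors' implementation, and the names `full_batch_stats`, `normalize`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def full_batch_stats(X, w, b):
    """Mean and variance of z = X @ w + b over all rows of X.

    Uses only the input moments (mean vector and covariance matrix),
    so no pass over the per-example activations is needed.
    """
    mu = X.mean(axis=0)                 # input mean, shape (d,)
    Xc = X - mu
    sigma = Xc.T @ Xc / X.shape[0]      # population covariance, shape (d, d)
    mean_z = w @ mu + b                 # E[z] = w^T mu + b
    var_z = w @ sigma @ w               # Var[z] = w^T Sigma w
    return mean_z, var_z


def normalize(z, mean_z, var_z, eps=1e-5):
    """Normalize activations with dataset-level statistics."""
    return (z - mean_z) / np.sqrt(var_z + eps)


# Synthetic stand-in for a training set (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
w = rng.normal(size=8)
b = 0.5

mean_z, var_z = full_batch_stats(X, w, b)
z = X @ w + b
z_hat = normalize(z, mean_z, var_z)
```

In this sketch the statistics match what a direct pass over every activation would give, whatever batch size is later used for the gradient step; how the paper applies this inside the CRF likelihood is not shown here.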
Chairs:
Reinhold Häb-Umbach