A Bayesian Interpretation Of The Light Gated Recurrent Unit
Alexandre Bittar, Philip Garner
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:13:04
We summarise previous work showing that the basic sigmoid activation function arises as an instance of Bayes’s theorem, and that recurrence follows from the prior. We then derive a layer-wise recurrence without the assumptions of that previous work, and show that it leads to a standard recurrence with modest modifications that reflect the use of log-probabilities. The resulting architecture closely resembles the light gated recurrent unit (Li-GRU), which is the current state of the art for automatic speech recognition (ASR). Although the contribution is mainly theoretical, we show that the architecture is able to outperform the state of the art on the TIMIT and AMI datasets.
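As a rough illustration of the first claim in the abstract, the sketch below checks numerically that a two-class posterior computed with Bayes’s theorem equals a sigmoid of a linear function of the input, with the log prior odds acting as the bias. It is not taken from the talk: the scalar Gaussian likelihoods with a shared variance, and all function and parameter names, are assumptions made for the example. The last function feeds each posterior back in as the next prior, which is one simple reading of how recurrence can follow from the prior; the layer-wise derivation in the paper may differ.

```python
import math


def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian (assumed likelihood model)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_bayes(x, prior, mean1, mean0, var):
    """P(class 1 | x) computed directly from Bayes's theorem."""
    num = gaussian_pdf(x, mean1, var) * prior
    den = num + gaussian_pdf(x, mean0, var) * (1.0 - prior)
    return num / den


def posterior_sigmoid(x, prior, mean1, mean0, var):
    """The same posterior written as sigmoid(w * x + b).

    w and the first term of b come from the Gaussian log-likelihood ratio;
    the second term of b is the log prior odds.
    """
    w = (mean1 - mean0) / var
    b = (mean0 ** 2 - mean1 ** 2) / (2.0 * var) + math.log(prior / (1.0 - prior))
    return sigmoid(w * x + b)


def recursive_posterior(xs, prior, mean1, mean0, var):
    """Use each posterior as the prior for the next observation.

    Illustrative only: assumes independent observations given the class,
    and assumes the posterior never saturates to exactly 0 or 1.
    """
    p = prior
    for x in xs:
        p = posterior_sigmoid(x, p, mean1, mean0, var)
    return p


if __name__ == "__main__":
    args = dict(prior=0.2, mean1=1.0, mean0=-1.0, var=0.5)
    print(posterior_bayes(0.3, **args))
    print(posterior_sigmoid(0.3, **args))
    print(recursive_posterior([0.3, -0.1, 0.8], **args))
```

The first two printed values should agree up to floating-point error, since the sigmoid form is an algebraic rearrangement of the Bayes form under the shared-variance assumption.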
Chairs:
Yunxin Zhao