A Gaussian Mixture Model for Dialogue Generation with Dynamic Parameter Sharing Strategy
Qingqing Zhu, Pengfei Wu, Zhouxing Tan, Jiaxin Duan, Fengyu Lu, Junfei Liu
Existing dialogue models are typically trained in an encoder-decoder framework with a single shared set of parameters, neglecting the multinomial nature of the data distribution. In practice, model improvement and development commonly require fine-grained modeling of individual data subsets, yet collecting a labeled fine-grained dialogue dataset demands expert-level domain knowledge and is therefore difficult to scale. To better model such multinomial data for dialogue generation, we study a method that combines unsupervised clustering with a generative model in a GMM (Gaussian Mixture Model) based encoder-decoder framework. Specifically, our model samples the latent variables from prior and recognition distributions parameterized by a Gaussian mixture network, whose latent layer can form multiple clusters. We further introduce knowledge distillation to guide and improve the clustering results. Finally, we adopt a dynamic parameter sharing strategy that trains different decoders conditioned on the inferred cluster labels. Through extensive experiments, we show that our approach produces more coherent, informative, and diverse replies than prior methods.
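To illustrate the kind of mechanism the abstract describes, below is a minimal PyTorch sketch of a Gaussian-mixture latent layer: a context encoding parameterizes K Gaussian components, a component is selected, and a latent code is drawn via the reparameterization trick. The class name GMMLatentLayer, all dimensions, and the Gumbel-softmax component selection are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GMMLatentLayer(nn.Module):
    """Hypothetical sketch: a GMM-based latent layer for a dialogue model.
    The context encoding parameterizes K Gaussian components; a component
    is sampled and a latent z is drawn from it."""

    def __init__(self, ctx_dim: int, latent_dim: int, n_components: int):
        super().__init__()
        self.n_components = n_components
        self.latent_dim = latent_dim
        self.mixture_logits = nn.Linear(ctx_dim, n_components)       # mixture weights
        self.means = nn.Linear(ctx_dim, n_components * latent_dim)   # per-component means
        self.logvars = nn.Linear(ctx_dim, n_components * latent_dim) # per-component log-variances

    def forward(self, ctx: torch.Tensor):
        B = ctx.size(0)
        logits = self.mixture_logits(ctx)                            # (B, K)
        mu = self.means(ctx).view(B, self.n_components, -1)          # (B, K, D)
        logvar = self.logvars(ctx).view(B, self.n_components, -1)    # (B, K, D)

        # Differentiable component selection via straight-through Gumbel-softmax;
        # the resulting one-hot vector doubles as an (unsupervised) cluster label.
        onehot = F.gumbel_softmax(logits, tau=1.0, hard=True)        # (B, K)
        mu_k = (onehot.unsqueeze(-1) * mu).sum(dim=1)                # (B, D)
        logvar_k = (onehot.unsqueeze(-1) * logvar).sum(dim=1)        # (B, D)

        # Reparameterization trick: z = mu + sigma * eps
        z = mu_k + torch.exp(0.5 * logvar_k) * torch.randn_like(mu_k)
        cluster = onehot.argmax(dim=-1)                              # (B,)
        return z, cluster, logits, mu, logvar

# Usage: sample latent codes and cluster labels for a batch of contexts.
layer = GMMLatentLayer(ctx_dim=512, latent_dim=64, n_components=4)
ctx = torch.randn(8, 512)                   # encoder outputs for 8 dialogues
z, cluster, logits, mu, logvar = layer(ctx)
print(z.shape, cluster.shape)               # torch.Size([8, 64]) torch.Size([8])
```

In a setup like this, the sampled cluster label is what a dynamic parameter sharing strategy could condition on, routing each example to a cluster-specific decoder while the encoder and latent layer remain shared.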