Cross-VAE: Towards Disentangling Expression From Identity For Human Faces
Haozhe Wu, Jia Jia, Guojun Qi, Lingxi Xie, Qi Tian, Yuanchun Shi
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 16:05
Facial expression and identity are two independent yet intertwined components of a face representation. For facial expression recognition, identity can contaminate the training procedure by providing entangled but irrelevant information. In this paper, we propose to learn clearly disentangled and discriminative features that are invariant to identity for expression recognition. However, such disentanglement normally requires annotations of both expression and identity on one large dataset, which is often unavailable. Our solution is to extend the conditional VAE to a crossed version named Cross-VAE, which is able to use partially labeled data to disentangle expression from identity. We emphasize the following novel characteristics of our Cross-VAE: (1) It is based on an independence assumption that the distributions of the two latent representations are orthogonal. This ensures that both encoded representations are disentangled and expressive. (2) It utilizes a symmetric training procedure in which the output of each encoder is fed as the condition of the other. Thus, two partially labeled sets can be jointly used. Extensive experiments show that our proposed method is capable of encoding expressive and disentangled features for facial expression. Compared with the baseline methods, our model shows an improvement of 3.56% on average in terms of accuracy.
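The crossed conditioning described in the abstract can be illustrated with a minimal NumPy sketch. This is one possible reading of the data flow, not the authors' implementation: the linear "networks", the dimensions `X_DIM` and `Z_DIM`, the squared-cosine form of `orthogonality_penalty`, and the unweighted sum of loss terms are all assumptions made for illustration. The abstract does not specify the architecture or the loss, and the handling of partially labeled data is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
X_DIM, Z_DIM = 64, 8  # hypothetical input and latent sizes


def linear(in_dim, out_dim):
    """Random weight matrix standing in for a trained network."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# Two encoders, each producing the mean and log-variance of its latent code.
W_expr, W_id = linear(X_DIM, 2 * Z_DIM), linear(X_DIM, 2 * Z_DIM)
# Two decoders, each fed its own latent plus the other branch's code as condition.
D_expr, D_id = linear(2 * Z_DIM, X_DIM), linear(2 * Z_DIM, X_DIM)


def encode(x, W):
    """Split the encoder output into mean and log-variance halves."""
    mu, logvar = np.split(x @ W, 2, axis=1)
    return mu, logvar


def sample(mu, logvar):
    """Reparameterization: z = mu + sigma * eps."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)


def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, sigma^2) to N(0, I), one value per sample."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)


def orthogonality_penalty(a, b, eps=1e-8):
    """Assumed penalty: mean squared cosine similarity between paired codes."""
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    )
    return float(np.mean(cos**2))


def cross_vae_forward(x):
    """One forward pass in which each branch is conditioned on the other's code."""
    mu_e, lv_e = encode(x, W_expr)  # expression branch
    mu_i, lv_i = encode(x, W_id)    # identity branch
    z_e, z_i = sample(mu_e, lv_e), sample(mu_i, lv_i)
    # Crossed conditioning: each decoder sees the other encoder's mean code.
    recon_e = np.concatenate([z_e, mu_i], axis=1) @ D_expr
    recon_i = np.concatenate([z_i, mu_e], axis=1) @ D_id
    recon = np.mean((recon_e - x) ** 2) + np.mean((recon_i - x) ** 2)
    kl = np.mean(kl_to_standard_normal(mu_e, lv_e) + kl_to_standard_normal(mu_i, lv_i))
    ortho = orthogonality_penalty(mu_e, mu_i)
    return {"loss": float(recon + kl + ortho), "ortho": ortho,
            "recon_e": recon_e, "recon_i": recon_i}


out = cross_vae_forward(rng.standard_normal((4, X_DIM)))
print(out["recon_e"].shape, out["recon_i"].shape)
```

In a real setup the linear maps would be replaced by trained networks and the three loss terms would be weighted; the sketch only shows where the crossed condition enters each branch and how an orthogonality term between the two codes could be measured.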