TALKINGFLOW: TALKING FACIAL LANDMARK GENERATION WITH MULTI-SCALE NORMALIZING FLOW NETWORK

Sen Liang, Zhize Zhou, Hujun Bao, Rong Li, Juyong Zhang

Length: 00:05:36

08 May 2022

Deterministic models dominate the field of talking facial landmark generation. They map a speech signal directly to a single lip-synced facial landmark sequence, and as a result they often suffer from regression to the mean face. In contrast, probabilistic generative models are better suited to handling complex data spaces and generating diverse samples. In this work, we propose a flow-based probabilistic network named TalkingFlow that generates natural talking facial landmarks with head movements from speech data. It is implemented with a weighted multi-scale architecture to improve the model's representation capability and a conditional Temporal Convolutional Network module to fuse speech data. Extensive experimental results show that it can effectively generate diverse and natural facial landmarks from speech data. All code will be made publicly available online.
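To make the idea of a speech-conditioned normalizing flow concrete, the following is a minimal sketch of a single conditional affine coupling step, the generic building block of many flow models. It is not the TalkingFlow architecture: the abstract does not specify the layer design, so the function names (`make_conditioner`, `coupling_forward`, `coupling_inverse`), the toy dimensions, and the fixed random linear map standing in for a learned conditioning network are all assumptions made for illustration.

```python
import math
import random


def make_conditioner(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for a learned network (e.g. a TCN over speech).

    Uses fixed random linear maps; a real model would learn these weights.
    """
    rng = random.Random(seed)
    w_s = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    w_t = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

    def conditioner(h):
        # tanh keeps each log-scale in (-1, 1) so the scale stays well behaved.
        log_scale = [math.tanh(sum(w * v for w, v in zip(row, h))) for row in w_s]
        shift = [sum(w * v for w, v in zip(row, h)) for row in w_t]
        return log_scale, shift

    return conditioner


def coupling_forward(x, cond, conditioner):
    """Map landmarks x to latent z, given speech features cond."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    # Scale and shift depend only on the untouched half and the condition.
    log_s, t = conditioner(x_a + cond)
    z_b = [xb * math.exp(ls) + tt for xb, ls, tt in zip(x_b, log_s, t)]
    log_det = sum(log_s)  # log |det Jacobian| of the affine transform
    return x_a + z_b, log_det


def coupling_inverse(z, cond, conditioner):
    """Exact inverse: map latent z back to landmarks x for the same cond."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = conditioner(z_a + cond)
    x_b = [(zb - tt) * math.exp(-ls) for zb, ls, tt in zip(z_b, log_s, t)]
    return z_a + x_b


# Toy usage: 4 landmark coordinates, 3 speech features (illustrative sizes).
cond_net = make_conditioner(in_dim=2 + 3, out_dim=2, seed=1)
speech = [0.1, 0.5, -0.4]
landmarks = [0.3, -1.2, 0.7, 2.0]
z, log_det = coupling_forward(landmarks, speech, cond_net)
recovered = coupling_inverse(z, speech, cond_net)
```

Because the step is exactly invertible, drawing different latent vectors `z` for the same `speech` input and passing them through `coupling_inverse` yields different landmark outputs. This is the general mechanism by which a flow-based model can produce diverse samples instead of a single averaged prediction.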

Tags: talking head, facial landmark, generative model, motion synthesis, normalizing flow

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
