Towards Pose-Invariant Lip-Reading

Shiyang Cheng, Pingchuan Ma, Stavros Petridis, Jie Shen, Adrian Bulat, Maja Pantic, Georgios Tzimiropoulos

DOI

SPS

Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 14:28

04 May 2020

Lip-reading models have been significantly improved recently thanks to powerful deep learning architectures. However, most works focused on frontal or near frontal views of the mouth. As a consequence, lip-reading performance seriously deteriorates in non-frontal mouth views. In this work, we present a framework for training pose-invariant lip-reading models on synthetic data instead of collecting and annotating non-frontal data which is costly and tedious. The proposed model significantly outperforms previous approaches on non-frontal views while retaining the superior performance on frontal and near frontal mouth views. Specifically, we propose to use a 3D Morphable Model (3DMM) to augment LRW, an existing large-scale but mostly frontal dataset, by generating synthetic facial data in arbitrary poses. The newly derived dataset, is used to train a state-of-the-art neural network for lip-reading. We conducted a cross-database experiment for isolated word recognition on the LRS2 dataset, and reported an absolute improvement of 2.55%. The benefit of the proposed approach becomes clearer in extreme poses where an absolute improvement of up to 20.64% over the baseline is achieved.

Tags:

sps conference

icassp 2020 virtual conference

May 2020

icassp 2020

Towards Pose-Invariant Lip-Reading

Shiyang Cheng, Pingchuan Ma, Stavros Petridis, Jie Shen, Adrian Bulat, Maja Pantic, Georgios Tzimiropoulos

Value-Added Bundle(s) Including this Product

ICASSP 2020 Virtual Conference - Presentation Videos Product Bundle

More Like This

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle

IEEE ICASSP 2024, 1 4-19 April 2024, Seoul, Korea. Conference Presentation Videos Bundle

ICIP 2022, October 16-19, 2022, Bordeaux, France - Presentation Videos Product Bundle

Join the IEEE Signal Processing Society