Speaker-Independent Lipreading By Disentangled Representation Learning

Qun Zhang, Shilin Wang, Gongliang Chen

SPS Members: Free
IEEE Members: $11.00
Non-members: $15.00

Length: 00:08:37

21 Sep 2021

With the development of deep learning, automatic lipreading based on deep neural networks can achieve reliable results for speakers who appear in the training dataset. However, speaker-independent lipreading, i.e., lipreading for unseen speakers, remains a challenging task, especially when training samples are limited. To improve recognition performance in the speaker-independent scenario, this paper proposes a new deep neural network structure named the Disentangled Visual Speech Recognition Network (DVSR-Net). DVSR-Net is designed to disentangle identity-related features and content-related features from the lip image sequence. To further eliminate identity information that remains in the content features, a content feature refinement stage is added to network optimization. In this way, the extracted features are closely related to the content information and independent of varying talking styles, so speech recognition performance for unseen speakers can be improved. Experiments on two widely used datasets demonstrate the effectiveness of the proposed network in the speaker-independent scenario.

Tags:

signal processing society

IEEE icip 2021

september 19-22

virtual conference

2021

sps

virtual conference icip 2021

icip 2021
