Speaker-Independent Acoustic-to-Articulatory Speech Inversion

Peter Wu (UC Berkeley); Li-Wei Chen (Carnegie Mellon University); Cheol Jun Cho (UC Berkeley); Shinji Watanabe (Carnegie Mellon University); Louis Goldstein (University of Southern California); Alan Black (Carnegie Mellon University); Gopala Krishna Anumanchipalli (UC Berkeley)

07 Jun 2023

To build speech processing methods that can handle speech as naturally as humans do, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space. The articulatory space is a promising inversion target, since it captures the mechanics of speech production. To this end, we build an acoustic-to-articulatory inversion (AAI) model that leverages autoregression, adversarial training, and self-supervision to generalize to unseen speakers. Our approach obtains a correlation of 0.780 on an electromagnetic articulography (EMA) dataset, improving on the state of the art by 12.9%. Additionally, we show the interpretability of these representations by directly comparing the behavior of the estimated representations with speech production behavior. Finally, we propose a resynthesis-based AAI evaluation metric that does not rely on articulatory labels, and demonstrate its efficacy on an 18-speaker dataset.
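
The abstract reports its headline result as a correlation between estimated and measured articulator trajectories. As a rough illustration of how such a score is commonly computed, the sketch below takes the Pearson correlation for each articulatory channel and averages across channels. This is an assumption about the general form of the metric, not the authors' evaluation code: the function name `mean_pearson_correlation`, the 12-channel layout, and the synthetic data are all hypothetical.

```python
import numpy as np


def mean_pearson_correlation(predicted, reference):
    """Average per-channel Pearson correlation.

    Both inputs have shape (num_frames, num_channels), one column per
    articulatory channel. Hypothetical helper for illustration only.
    A channel that is constant over time has zero variance and yields NaN.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")

    # Center each channel over time.
    pred_centered = predicted - predicted.mean(axis=0)
    ref_centered = reference - reference.mean(axis=0)

    # Pearson r per channel: covariance over the product of standard deviations.
    numerator = (pred_centered * ref_centered).sum(axis=0)
    denominator = np.sqrt(
        (pred_centered ** 2).sum(axis=0) * (ref_centered ** 2).sum(axis=0)
    )
    per_channel = numerator / denominator

    return float(per_channel.mean())


# Synthetic stand-in for EMA data: 200 frames, 12 channels (assumed layout).
rng = np.random.default_rng(0)
reference_ema = rng.standard_normal((200, 12))
predicted_ema = reference_ema + 0.5 * rng.standard_normal((200, 12))
print(mean_pearson_correlation(predicted_ema, reference_ema))
```

Averaging per channel, rather than flattening all channels into one vector, keeps a channel with a large range of motion from dominating the score. Whether the reported 0.780 is aggregated this way is not stated in the abstract.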

Tags:

Speech production, perception and psychoacoustics

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
