Realistic Real-Time Voice Swapping From Single Unpaired Sentences
Carlo Provinciali, Yihong Liu, Junghoo Kim, Iddo Drori
SPS
We demonstrate a system that allows two speakers to swap their voices using any two unpaired sentences, such that the result is indistinguishable from real voices and runs in real time on a laptop. Each of the two speakers takes turns pronouncing any single short, unpaired sentence into a microphone. Our demo plays the original voice recordings, then swaps the speakers' voices, playing the words pronounced by the first speaker in the second speaker's voice and vice versa. The two input voices are processed in two distinct ways: one path extracts the text of each utterance, and the other learns each speaker's unique voice profile. We extract the text from speaker A's speech using state-of-the-art pre-trained voice-to-text models. We then pass the audio from speaker B through an encoder, which derives an embedding that describes speaker B's distinctive features. Next, we use the text extracted from speaker A and the embedding of speaker B to synthesize a Mel spectrogram, which is fed into a vocoder to generate the final audio of speaker A's sentence in speaker B's voice. The same process is mirrored with speaker A's and B's roles swapped. Our implementation leverages pre-trained neural networks: encoder, synthesizer, and vocoder models, for realistic real-time performance.
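The swap described above can be sketched as a three-stage pipeline. The sketch below is illustrative only: the functions `transcribe`, `embed_speaker`, `synthesize_mel`, and `vocode` are hypothetical stand-ins for the pre-trained voice-to-text, encoder, synthesizer, and vocoder models (here stubbed out with dummy implementations so the control flow is runnable); only the wiring in `swap_voices` reflects the pipeline in the text.

```python
import numpy as np

def transcribe(audio: np.ndarray) -> str:
    # Stand-in for the pre-trained voice-to-text model.
    return "hello world"

def embed_speaker(audio: np.ndarray) -> np.ndarray:
    # Stand-in for the speaker encoder: maps audio to a
    # fixed-size embedding of the speaker's voice profile.
    return np.zeros(256)

def synthesize_mel(text: str, embedding: np.ndarray) -> np.ndarray:
    # Stand-in for the synthesizer: produces a Mel spectrogram
    # conditioned on the text and the speaker embedding.
    return np.zeros((80, len(text)))  # (mel bins, frames)

def vocode(mel: np.ndarray) -> np.ndarray:
    # Stand-in for the vocoder: Mel spectrogram -> waveform
    # (assume a fixed hop of 256 samples per frame).
    return np.zeros(mel.shape[1] * 256)

def swap_voices(audio_a: np.ndarray, audio_b: np.ndarray):
    """Return (A's words in B's voice, B's words in A's voice)."""
    text_a, text_b = transcribe(audio_a), transcribe(audio_b)
    emb_a, emb_b = embed_speaker(audio_a), embed_speaker(audio_b)
    out_ab = vocode(synthesize_mel(text_a, emb_b))  # A's sentence, B's voice
    out_ba = vocode(synthesize_mel(text_b, emb_a))  # B's sentence, A's voice
    return out_ab, out_ba
```

In a real implementation, each stub would be replaced by a pre-trained network, and the mirrored call structure in `swap_voices` is what lets a single forward pass per direction keep the demo interactive.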