04 May 2020

This paper addresses the use of word embeddings for segments found in audio and real-time magnetic resonance imaging (rtMRI) videos. Word embeddings are created to store and retrieve data efficiently, and their ability to represent the original data is evaluated with the same-different word-discrimination task, defined here for both unimodal and cross-view settings. For the unimodal setting, a Siamese neural network is designed to create word embeddings for the two data modalities independently; for the rtMRI videos, inputs to the network are generated by a correspondence autoencoder. In the cross-view setting, a recurrent neural network (RNN) that takes data of both modalities as input is trained to generate embeddings jointly for the two data sources. The choice of objective function for the RNN is also investigated. Results on the USC-TIMIT rtMRI dataset outperform the conventional dynamic time warping (DTW) baseline by a clear margin, demonstrating that the proposed word embeddings can be a step towards faster unimodal and cross-view query-by-example search.
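The same-different word-discrimination task mentioned above is typically scored by ranking all pairwise distances between embeddings and measuring how well same-word pairs separate from different-word pairs. The sketch below is a minimal, hypothetical illustration of that evaluation (it is not the paper's code): it assumes each segment is already mapped to a fixed-dimensional embedding vector, computes cosine distances over all pairs, and reports average precision over the ranked pair list.

```python
import numpy as np

def same_different_ap(embeddings, labels):
    """Average precision for the same-different word-discrimination task.

    embeddings: (n, d) array, one fixed-dimensional embedding per segment.
    labels:     length-n sequence of word labels for those segments.

    All segment pairs are ranked by cosine distance; a good embedding
    places same-word pairs (positives) ahead of different-word pairs.
    """
    # Normalize rows so dot products are cosine similarities.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    n = len(X)
    dists, same = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(1.0 - float(X[i] @ X[j]))  # cosine distance
            same.append(labels[i] == labels[j])
    order = np.argsort(dists)            # closest pairs ranked first
    same = np.asarray(same)[order]
    hits = np.cumsum(same)               # positives retrieved so far
    ranks = np.arange(1, len(same) + 1)
    # Mean of precision values at the rank of each positive pair.
    return float(np.sum(hits[same] / ranks[same]) / max(1, same.sum()))
```

In a cross-view evaluation, the same scoring applies, except that one element of each pair comes from the audio embeddings and the other from the rtMRI embeddings.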
