04 May 2020

Human cognition is supported by the combination of multi-modal information from different sources of perception, of which the two most important modalities are visual and audio. Cross-modal visual-audio generation enables the synthesis of data in one modality from data acquired in the other, providing the full experience that only the combination of the two can offer. In this paper, the Self-Attention mechanism is applied to cross-modal visual-audio generation for the first time; it is used to help capture the structural characteristics of the spectrogram. A series of experiments is conducted to find the best-performing configuration. The post-experimental comparison shows that the Self-Attention module greatly improves the generation and classification of audio data, and the presented method outperforms existing cross-modal visual-audio generative models.
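The abstract does not specify the exact architecture, but a Self-Attention module operating on spectrogram feature maps is typically a SAGAN-style block inserted into the generator or classifier. The sketch below is a minimal, hypothetical illustration of such a block; the class name, channel-reduction factor, and the learned residual weight `gamma` are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over 2D feature maps (e.g. spectrogram features)."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 convolutions produce query/key/value projections of the feature map.
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        # Learned weight for the attention branch; starts at 0 so training
        # begins from the plain convolutional behaviour.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # (B, HW, C//8)
        k = self.key(x).view(b, -1, h * w)                       # (B, C//8, HW)
        attn = F.softmax(torch.bmm(q, k), dim=-1)                # (B, HW, HW)
        v = self.value(x).view(b, -1, h * w)                     # (B, C, HW)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x                              # residual connection
```

Because every time-frequency position attends to every other, such a block can model long-range structure in the spectrogram (e.g. harmonics spread across frequency bands) that local convolutions alone capture poorly, which is the motivation the abstract gives for adding Self-Attention.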
