SEMANTIC ASSOCIATION NETWORK FOR VIDEO CORPUS MOMENT RETRIEVAL

Dahyun Kim, Sunjae Yoon, Ji Woo Hong, Chang D Yoo

Length: 00:08:40

08 May 2022

This paper presents the Semantic Association Network (SAN) for Video Corpus Moment Retrieval (VCMR), the task of localizing the temporal moment that best corresponds to a given text query within a corpus of videos. Understanding a video together with its subtitles and the text query depends on collaboration among the semantics that these multi-modal inputs share. To enable this collaboration, SAN associates common semantics within the same modality (Intra Semantic Association) and across different modalities (Inter Semantic Association) using a dedicated module referred to as Modality Semantic Association (MSA). SAN surpasses existing state-of-the-art performance on the TVR and DiDeMo benchmark datasets. Extensive ablation studies and qualitative analyses show the effectiveness of the proposed model.
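
To make the distinction between intra-modality and inter-modality association concrete, here is a minimal sketch. It is not the paper's MSA module: the abstract gives no implementation details, so the sketch substitutes generic scaled dot-product attention. The function name `associate`, the feature shapes, the mean-pooled query, and the random features are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def associate(queries, context):
    """Generic scaled dot-product attention (an assumed stand-in, not MSA).

    Each row of `queries` is replaced by a weighted mix of `context` rows.
    Returns the mixed features and the attention weights.
    """
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d), axis=-1)
    return weights @ context, weights


rng = np.random.default_rng(0)
video = rng.normal(size=(6, 16))     # hypothetical: 6 clip features
subtitle = rng.normal(size=(4, 16))  # hypothetical: 4 subtitle features
query = rng.normal(size=(5, 16))     # hypothetical: 5 query-word features

# Intra association: video clips attend to other clips of the same modality.
video_intra, w_intra = associate(video, video)

# Inter association: the video features then attend to the subtitle modality.
video_inter, w_inter = associate(video_intra, subtitle)

# Score every clip against a mean-pooled query vector and pick the best one.
query_vec = query.mean(axis=0)
clip_scores = video_inter @ query_vec
best_clip = int(np.argmax(clip_scores))
```

In a corpus setting, the same scoring would run over the clips of every video, and the highest-scoring span across the whole corpus would be returned as the retrieved moment.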

Tags: vision language task, video moment retrieval, temporal moment localization, localizing moment, video corpus moment retrieval

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
