VIDEO SUMMARIZATION THROUGH FINE-GRAINED HIERARCHICAL MODELING WITH MULTI-DIMENSIONAL FEATURES

Mengnan Liang, Ju Liu, Xiaoxi Liu, Lingchen Gu

Lecture 10 Oct 2023

Video summarization aims to shorten a video while preserving its original content, which facilitates large-scale video search and browsing. Most existing methods take only static image features as input, which discards the temporal action information carried by successive frames. In addition, the use of two-stage temporal modeling aggravates the loss of temporal relationships. In this paper, we propose a framework based on Fine-Grained Hierarchical Modeling (FGHM) that employs multi-dimensional features. First, a multi-dimensional feature extractor extracts static image features and dynamic video features. Then, dynamic temporal modeling is carried out to model the temporal dependency across the entire video. We also investigate the effects of spatio-temporal features extracted by various 3D feature extractors. Extensive experiments demonstrate the effectiveness of FGHM against state-of-the-art methods.
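The abstract does not say how the static and dynamic features are combined, so the following is only a minimal sketch of one common option, not the authors' method. It assumes per-frame static vectors (for example from a 2D CNN) and per-clip dynamic vectors (for example from a 3D CNN), and simply repeats each clip vector over its frames before concatenating. The function name `fuse_features`, the `clip_len` parameter, and all array shapes are hypothetical and chosen for illustration.

```python
import numpy as np


def fuse_features(static_feats, dynamic_feats, clip_len):
    """Concatenate per-frame static features with clip-level dynamic features.

    static_feats:  array of shape (T, Ds), one vector per sampled frame.
    dynamic_feats: array of shape (C, Dd), one vector per clip of `clip_len` frames.
    Returns an array of shape (T, Ds + Dd).

    Assumption: simple repeat-and-concatenate fusion; the paper may differ.
    """
    num_frames = static_feats.shape[0]
    # Repeat each clip vector so every frame in a clip shares its dynamic feature,
    # then trim to the number of frames (the last clip may be partial).
    per_frame_dynamic = np.repeat(dynamic_feats, clip_len, axis=0)[:num_frames]
    if per_frame_dynamic.shape[0] < num_frames:
        raise ValueError("not enough clips to cover all frames")
    return np.concatenate([static_feats, per_frame_dynamic], axis=1)


# Toy data: 10 frames with 4-dim static features, 3 clips with 2-dim dynamic features.
rng = np.random.default_rng(0)
static = rng.normal(size=(10, 4))
dynamic = rng.normal(size=(3, 2))
fused = fuse_features(static, dynamic, clip_len=4)
print(fused.shape)
```

The fused sequence has one row per frame, so any sequence model used for temporal dependency modeling could consume it directly; the choice of that model is outside what this sketch covers.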

Tags: video summarization, temporal modeling, fine-grained, multiple features, hierarchical structure
