VIDEO SUMMARIZATION THROUGH FINE-GRAINED HIERARCHICAL MODELING WITH MULTI-DIMENSIONAL FEATURES
Mengnan Liang, Ju Liu, Xiaoxi Liu, Lingchen Gu
SPS
Video summarization aims to shorten a video while preserving its original content, which facilitates large-scale video search and browsing. Most existing methods take only static image features as input, which loses the temporal action information carried by successive frames. In addition, two-stage temporal modeling aggravates the loss of temporal relationships. In this paper, we propose a framework based on Fine-Grained Hierarchical Modeling (FGHM) that employs multi-dimensional features. First, a multi-dimensional feature extractor extracts static image features and dynamic video features. Then, dynamic temporal modeling is carried out to model the temporal dependencies across the entire video. We also investigate the effects of spatial-temporal features extracted by various 3D feature extractors. Extensive experiments demonstrate the effectiveness of FGHM against state-of-the-art methods.
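
The idea of combining static image features with dynamic video features can be illustrated with a minimal sketch. This is not the FGHM implementation: the abstract does not say how the two feature types are fused, so plain concatenation is an assumption, as are the function name `fuse_features`, the `clip_len` parameter, and the rule that clip k covers frames k * clip_len through (k + 1) * clip_len - 1.

```python
from typing import List

Vector = List[float]


def fuse_features(
    static_feats: List[Vector],
    dynamic_feats: List[Vector],
    clip_len: int,
) -> List[Vector]:
    """Concatenate each frame's static feature with its clip's dynamic feature.

    Hypothetical helper for illustration only. Assumes clip k covers frames
    [k * clip_len, (k + 1) * clip_len).
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    if not dynamic_feats:
        raise ValueError("dynamic_feats must not be empty")

    fused = []
    for i, frame_vec in enumerate(static_feats):
        # Clamp the clip index so trailing frames past the last clip
        # reuse the last available dynamic feature.
        k = min(i // clip_len, len(dynamic_feats) - 1)
        fused.append(frame_vec + dynamic_feats[k])
    return fused


# Five frames with 2-D static features, two clips with 1-D dynamic features.
static = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
dynamic = [[1.0], [2.0]]
fused = fuse_features(static, dynamic, clip_len=2)
```

Each fused vector here has three dimensions (two static plus one dynamic), and a temporal model would then consume the resulting per-frame sequence. In practice the static and dynamic features would come from pretrained 2D and 3D networks, respectively.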