Multimodal Video Summarization Based on Fuzzy Similarity Features

Theodoros Psallidas

Length: 0:08:45

27 Jun 2022

The continuously growing number of user-generated videos has increased the need for efficient browsing through content collections and repositories, which in turn requires descriptive yet compact representations. To this end, a popular approach is to create a visual summary, which is far more expressive than alternatives such as textual descriptions. In this work, we present a video summarization approach based on the extraction and fusion of audio and visual features. It produces dynamic video summaries, i.e., summaries comprising the most important segments of the original video in their original temporal order. Based on the extracted features, each segment is classified as "interesting" or "uninteresting" and is accordingly included in or excluded from the final summary. The novelty of our approach is that, prior to classification, the fused features are fuzzified, which makes them more intuitive and more robust to uncertainty. We evaluate our approach on a large dataset of user-generated videos and demonstrate that fuzzy features boost classification performance, yielding more concrete video summaries.
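The pipeline described above (fuse, fuzzify, classify, keep temporal order) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the features are assumed to be pre-normalized to [0, 1], fusion is plain concatenation, fuzzification uses three triangular membership functions (low, medium, high), and the classifier is replaced by a simple threshold on the mean "high" membership. The names `triangular`, `fuzzify`, and `summarize` are hypothetical.

```python
import numpy as np


def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at b (assumes a < b < c)."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a)
    right = (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)


def fuzzify(features):
    """Map each feature in [0, 1] to (low, medium, high) membership degrees.

    The three fuzzy sets and their breakpoints are illustrative assumptions.
    Output columns are ordered: all 'low', then all 'medium', then all 'high'.
    """
    low = triangular(features, -0.5, 0.0, 0.5)
    medium = triangular(features, 0.0, 0.5, 1.0)
    high = triangular(features, 0.5, 1.0, 1.5)
    return np.concatenate([low, medium, high], axis=-1)


def summarize(audio, visual, threshold=0.5):
    """Return indices of segments kept in the summary, in temporal order.

    audio, visual: arrays of shape (n_segments, n_features), values in [0, 1].
    The threshold rule below is a stand-in for a trained classifier.
    """
    fused = np.concatenate([audio, visual], axis=1)  # feature-level fusion
    fuzzy = fuzzify(fused)
    n = fused.shape[1]
    high = fuzzy[:, 2 * n:]                          # 'high' memberships only
    scores = high.mean(axis=1)
    # flatnonzero returns ascending indices, so temporal order is preserved.
    return np.flatnonzero(scores >= threshold).tolist()


audio = np.array([[0.9], [0.1], [0.8]])
visual = np.array([[1.0], [0.2], [0.9]])
selected = summarize(audio, visual)
print(selected)
```

In a real system the threshold rule would be replaced by a classifier trained on annotated segments, with the fuzzified vectors as its input.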

Tags:

IVMSP 2022

June 2022

2022

IVMSP

IEEE IVMSP 2022

June 26

Nafplio
