Spatial-Temporal Graph Convolutional Network Boosted Flow-Frame Prediction for Video Anomaly Detection
Kai Cheng (Fudan University); Xinhua Zeng (Fudan University); Yang Liu (Fudan University); Mengyang Zhao (Fudan University); Chengxin Pang (Shanghai University of Electric Power); Xing Hu (University of Shanghai for Science and Technology)
Video Anomaly Detection (VAD) is a critical technology for intelligent surveillance systems and remains a challenging task in the signal processing community. An intuitive approach to VAD is to use a two-stream network to learn appearance and motion normality separately. However, existing approaches usually design a network architecture for the appearance stream with care and then apply a similar architecture to the motion stream, ignoring the distinct characteristics of appearance and motion. In this paper, we propose STGCN-FFP, an unsupervised Spatial-Temporal Graph Convolutional Network (STGCN) boosted Flow-Frame Prediction model. Specifically, we first design an STGCN-based memory module that extracts and memorizes normal patterns in optical flow, making it better suited to learning motion normality. Then, we use a memory-augmented auto-encoder to model normal appearance patterns. Finally, the latent representations of the two streams are fused to predict future frames, driving the model to learn spatial-temporal normality. To our knowledge, STGCN-FFP is the first work to apply STGCN specifically to modeling motion normality. Our method performs comparably to state-of-the-art methods on three benchmarks.
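The abstract outlines a two-stream design: an STGCN with a memory module learns motion normality from optical flow, a memory-augmented auto-encoder learns appearance normality from frames, and the fused latents drive future-frame prediction. Below is a minimal PyTorch sketch of that pipeline; the module names (`MemoryModule`, `STGCNBlock`, `STGCNFFP`), all layer sizes, the graph construction over flow patches, and the concatenation-based fusion are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a two-stream flow-frame prediction model in the spirit of
# STGCN-FFP. Shapes, layer sizes, and fusion strategy are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryModule(nn.Module):
    """Memory bank that re-expresses queries as combinations of learned
    normal prototypes (slots)."""

    def __init__(self, num_slots: int = 10, slot_dim: int = 128):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, slot_dim))

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        # query: (B, N, C). Attend over slots, read back normal patterns.
        attn = F.softmax(query @ self.slots.t(), dim=-1)   # (B, N, num_slots)
        return attn @ self.slots                           # (B, N, C)


class STGCNBlock(nn.Module):
    """One spatial graph convolution followed by a temporal convolution."""

    def __init__(self, in_ch: int, out_ch: int, num_nodes: int, t_kernel: int = 3):
        super().__init__()
        # Learnable adjacency over the flow-node graph (assumed construction).
        self.adj = nn.Parameter(torch.eye(num_nodes))
        self.spatial = nn.Linear(in_ch, out_ch)
        self.temporal = nn.Conv2d(out_ch, out_ch, (t_kernel, 1),
                                  padding=(t_kernel // 2, 0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, N, C) -- T timesteps, N graph nodes, C channels.
        adj = F.softmax(self.adj, dim=-1)
        x = self.spatial(torch.einsum("nm,btmc->btnc", adj, x))
        x = x.permute(0, 3, 1, 2)                          # (B, C, T, N)
        x = F.relu(self.temporal(x))
        return x.permute(0, 2, 3, 1)                       # (B, T, N, C)


class STGCNFFP(nn.Module):
    """Two-stream model: STGCN + memory on optical flow, memory-augmented
    auto-encoder on frames, fused latents decode the next frame."""

    def __init__(self, num_nodes: int = 256, dim: int = 128):
        super().__init__()
        # Motion stream: graph nodes assumed to be flow patches (2 channels).
        self.flow_encoder = STGCNBlock(2, dim, num_nodes)
        self.flow_memory = MemoryModule(slot_dim=dim)
        # Appearance stream: convolutional encoder with its own memory.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.frame_memory = MemoryModule(slot_dim=dim)
        # Decoder predicts the future RGB frame from the fused latents.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * dim, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frames: torch.Tensor, flows: torch.Tensor) -> torch.Tensor:
        # frames: (B, 3, 64, 64); flows: (B, T, 256, 2), 256 = 16x16 nodes.
        b = frames.size(0)
        # Motion stream: graph-convolve, pool over time, query motion memory.
        m = self.flow_encoder(flows).mean(dim=1)           # (B, 256, dim)
        m = self.flow_memory(m)
        m = m.transpose(1, 2).reshape(b, -1, 16, 16)       # (B, dim, 16, 16)
        # Appearance stream: encode, query appearance memory.
        a = self.frame_encoder(frames)                     # (B, dim, 16, 16)
        a_flat = a.flatten(2).transpose(1, 2)              # (B, 256, dim)
        a = self.frame_memory(a_flat).transpose(1, 2).reshape_as(a)
        # Fuse the two latents and predict the future frame.
        return self.decoder(torch.cat([a, m], dim=1))      # (B, 3, 64, 64)


if __name__ == "__main__":
    model = STGCNFFP()
    pred = model(torch.rand(2, 3, 64, 64), torch.rand(2, 4, 256, 2))
    print(pred.shape)  # torch.Size([2, 3, 64, 64])
```

At anomaly-scoring time, such a model would typically compare the predicted frame against the actual future frame (e.g., via PSNR or prediction error), with large errors flagging abnormal events; the memory modules limit how well unseen, abnormal patterns can be reconstructed.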