  • SPS Members: Free
  • IEEE Members: $11.00
  • Non-members: $15.00
  • Length: 10:24
28 Oct 2020

Temporal information plays an important role in action recognition. Recently, 3D CNNs have been widely used to extract temporal features from videos. Compared to 2D CNNs, however, 3D CNNs have more parameters and impose a heavy computational burden, so improving the efficiency of action recognition is necessary. In this paper, inspired by group convolution and convolution kernel decomposition, we propose a novel module called the grouped decomposed module (GDM), which separates the channels into three groups and applies 3D, 2D, and 1D convolutions to them in parallel, one per group. This module extracts spatial and temporal features efficiently. Based on GDM, we design a new network named the grouped decomposed network (GDN). GDN achieves state-of-the-art performance on two temporal-related datasets (Something-Something V1 & V2) while requiring few parameters and FLOPs.
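
To give a rough sense of why splitting channels across 3D, 2D, and 1D convolutions saves parameters, the sketch below counts weights for a plain 3D convolution and for a three-branch layout of the kind the abstract describes. It is only an illustration, not the authors' implementation: the equal three-way channel split, the kernel size of 3, the 1×3×3 (spatial) and 3×1×1 (temporal) kernel shapes, the absence of bias terms, and the function names are all assumptions made here.

```python
def conv_params(c_in, c_out, kt, kh, kw):
    """Weight count of a bias-free convolution with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw


def full_3d_params(channels, k=3):
    """Plain 3D convolution mapping `channels` to `channels` (k x k x k kernel)."""
    return conv_params(channels, channels, k, k, k)


def grouped_decomposed_params(channels, k=3):
    """Three parallel branches, each on one third of the channels.

    Assumptions (not taken from the paper): equal split, each branch keeps
    its channel count, and the 2D and 1D branches use 1 x k x k and
    k x 1 x 1 kernels respectively.
    """
    if channels % 3 != 0:
        raise ValueError("channels must be divisible by 3 for an equal split")
    g = channels // 3
    p3d = conv_params(g, g, k, k, k)  # spatiotemporal branch
    p2d = conv_params(g, g, 1, k, k)  # spatial-only branch
    p1d = conv_params(g, g, k, 1, 1)  # temporal-only branch
    return p3d + p2d + p1d


full = full_3d_params(96)
grouped = grouped_decomposed_params(96)
print(full, grouped)  # 248832 39936
```

Under these assumptions, a 96-channel layer drops from 248,832 weights to 39,936, roughly a 6.2x reduction. The actual savings in GDN depend on the split ratios and kernel shapes the paper uses, which the abstract does not specify.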
