  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 10:02
04 May 2020

Human action recognition aims to assign an action label to a well-segmented video. Recent work based on two-stream or 3D convolutional neural networks achieves high recognition rates at the cost of heavy computational complexity, memory footprint, and parameter counts. In this paper, we propose a lightweight neural network called Group Frame Network (GFNet) for human action recognition, which imposes intra-frame spatial sparsity along the spatial dimension in a simple yet effective way. Benefiting from two core components, the Group Temporal Module (GTM) and the Group Spatial Module (GSM), GFNet suppresses irrelevant motion within frames and duplicate texture features across frames, extracting the spatial-temporal information of frames at minimal cost. Experimental results on the NTU RGB+D dataset and the Varying-view RGB-D Action dataset show that, without any pre-training strategy, our method reaches a reasonable trade-off among computational complexity, parameters, and performance, and is more cost-efficient than state-of-the-art methods.
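The abstract does not spell out how the grouped modules cut cost, but the generic mechanism behind "group" convolution modules is well known: splitting channels into g groups divides a convolution's weight count by g. The sketch below is illustrative only (it is not the paper's GTM/GSM design); `conv_params` is a hypothetical helper counting the weights of a square-kernel convolution.

```python
# Illustrative sketch, NOT the paper's exact modules: how channel grouping
# (the generic idea behind group-convolution blocks) shrinks parameter count.
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2-D convolution with a k x k kernel and `groups` groups."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each group maps c_in/groups channels to c_out/groups channels.
    return (c_in // groups) * (c_out // groups) * k * k * groups

standard = conv_params(256, 256, 3)            # 589,824 weights
grouped = conv_params(256, 256, 3, groups=8)   # 73,728 weights
print(standard, grouped, standard // grouped)  # grouping by 8 gives 8x fewer weights
```

The same accounting applies per frame (spatial grouping) or across frames (temporal grouping), which is consistent with the abstract's claim of extracting spatial-temporal information at minimal cost.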
