Structural Reparameterization Lightweight Network for Video Action Recognition

AnLei Zhu (Jiangnan University); Wang Yinghui (Jiangnan University); Wei Li (Jiangnan University); Pengjiang Qian (Jiangnan University)

06 Jun 2023

3D convolutional networks play an important role in extracting spatiotemporal features for video action recognition. However, they usually carry a large number of parameters, which makes them difficult to deploy on edge devices with limited memory. Although lightweight 3D CNNs can reduce the model size significantly, they suffer a serious loss of accuracy. This paper proposes a novel approach that reduces model size while preserving accuracy by combining lightweight networks with structural reparameterization. To reduce the model size, we propose the 3D-DBB module, based on the 2D Diverse Branch Block (DBB). Furthermore, we propose three structures based on 3D-DBB: (1) a 3D depthwise convolution (called 3D-DBB DepthWise), (2) a 3D pointwise convolution (called 3D DBB-PointWise), and (3) a reparameterizable depthwise separable structure (called DP3DBB), which is the concatenation of the two previous structures. We design and compare two different strategies for replacing the depthwise separable structures in lightweight networks. Our method achieves 93.33% accuracy, a loss of only 0.42%, while the model size is only 1/50 of that of 3D-ResNeXt101.
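The general idea behind structural reparameterization is that parallel linear branches used during training can be folded into a single convolution for inference, so the deployed model pays for only one kernel. The sketch below illustrates this for one simple case: a 3x3x3 branch plus a 1x1x1 branch on a single channel (as in a depthwise convolution), merged by adding the 1x1x1 weight to the center tap of the 3x3x3 kernel. It is not the authors' 3D-DBB implementation; the function names (`conv3d_same`, `merge_branches`) are hypothetical, and batch normalization, bias, and multi-channel handling are left out as simplifying assumptions.

```python
import random

def conv3d_same(x, k):
    """Naive single-channel 3D cross-correlation with zero padding.
    x is a nested list [T][H][W]; k is [kt][kh][kw] with odd sizes."""
    T, H, W = len(x), len(x[0]), len(x[0][0])
    kt, kh, kw = len(k), len(k[0]), len(k[0][0])
    pt, ph, pw = kt // 2, kh // 2, kw // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for h in range(H):
            for w in range(W):
                acc = 0.0
                for a in range(kt):
                    for b in range(kh):
                        for c in range(kw):
                            tt, hh, ww = t + a - pt, h + b - ph, w + c - pw
                            if 0 <= tt < T and 0 <= hh < H and 0 <= ww < W:
                                acc += x[tt][hh][ww] * k[a][b][c]
                out[t][h][w] = acc
    return out

def merge_branches(k3, k1):
    """Fold a 1x1x1 branch into a 3x3x3 kernel by adding it at the center tap."""
    merged = [[row[:] for row in plane] for plane in k3]
    merged[1][1][1] += k1[0][0][0]
    return merged

def rand_tensor(shape, rng):
    t, h, w = shape
    return [[[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)]
            for _ in range(t)]

def add(a, b):
    return [[[a[t][h][w] + b[t][h][w] for w in range(len(a[0][0]))]
             for h in range(len(a[0]))] for t in range(len(a))]

def max_abs_diff(a, b):
    return max(abs(a[t][h][w] - b[t][h][w])
               for t in range(len(a))
               for h in range(len(a[0]))
               for w in range(len(a[0][0])))

rng = random.Random(0)
x = rand_tensor((4, 5, 5), rng)    # toy clip: 4 frames of 5x5
k3 = rand_tensor((3, 3, 3), rng)   # 3x3x3 branch
k1 = rand_tensor((1, 1, 1), rng)   # 1x1x1 branch

# Training-time form: two parallel branches whose outputs are summed.
two_branch = add(conv3d_same(x, k3), conv3d_same(x, k1))

# Inference-time form: one convolution with the merged kernel.
merged = merge_branches(k3, k1)
single = conv3d_same(x, merged)

diff = max_abs_diff(two_branch, single)
print("outputs match:", diff < 1e-9)
```

Because convolution is linear in its weights, the two forms agree up to floating-point rounding, while the merged form stores 27 weights instead of 28 and runs a single pass over the input.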

Tags:

Applications of machine learning

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
