A GRAPH LEARNING BASED MULTI-MODAL VIDEO ACTION RECOGNITION
Lei Gao, Kai Liu, Ling Guan
Due to its robustness to variations in viewpoint and environment, 3D skeleton-based action recognition has drawn considerable interest in both academic and industrial sectors. Recently, deep neural network (DNN)-based algorithms have been applied extensively to skeleton-based action recognition, and among contemporary approaches, graph learning plays a significant role. In addition, with the advancement of multi-sensory technology, multi-modal action recognition has grown at an extremely rapid pace. In this work, a graph learning based multi-modal framework is proposed with application to action recognition. Specifically, a two-stream heterogeneous network is designed to jointly extract complementary features from the 3D skeleton and RGB modalities. A discriminative adaptation model (DAM) is then presented and applied to the designed heterogeneous network for multi-modal action recognition. To validate the effectiveness of the proposed model, experiments are conducted on two multi-modal action recognition databases of different scales: NTU RGB+D 120 and SYSU. Experimental results demonstrate the power of the generated features and the DAM on multi-modal action recognition.
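The two-stream design described above can be sketched in miniature: each modality passes through its own feature extractor, and the resulting complementary features are fused before classification. The sketch below is a minimal illustration under assumed dimensions (75-D skeleton input, 512-D RGB input, 120 classes) and uses single linear projections as stand-ins for the paper's heterogeneous backbones and the DAM; none of these names or sizes come from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- illustrative only, not from the paper.
SKEL_DIM, RGB_DIM, FUSED_DIM, NUM_CLASSES = 75, 512, 128, 120

def extract(x, w):
    """Stand-in for one stream's backbone: a linear projection + ReLU."""
    return np.maximum(x @ w, 0.0)

# Random stand-in weights, one extractor per modality plus a fused classifier.
w_skel = rng.standard_normal((SKEL_DIM, FUSED_DIM)) * 0.01
w_rgb = rng.standard_normal((RGB_DIM, FUSED_DIM)) * 0.01
w_cls = rng.standard_normal((2 * FUSED_DIM, NUM_CLASSES)) * 0.01

def classify(skel, rgb):
    """Fuse the two streams by concatenation, then score each class."""
    fused = np.concatenate([extract(skel, w_skel), extract(rgb, w_rgb)], axis=-1)
    logits = fused @ w_cls
    # Numerically stable softmax over the class axis.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# A batch of 4 dummy samples, one feature vector per modality.
probs = classify(rng.standard_normal((4, SKEL_DIM)),
                 rng.standard_normal((4, RGB_DIM)))
print(probs.shape)  # one probability distribution over 120 classes per sample
```

In the actual framework, each stream would be a graph-based or convolutional network and the fusion would pass through the DAM; this sketch only shows the data flow of joint two-modality classification.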