ETPS: Efficient Two-Pass Encoding Scheme For Adaptive Live Streaming
Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:10:45
Hand gesture recognition (HGR) is a challenging task because it is highly sensitive to occlusion and background. Various modalities, such as RGB, depth, and point clouds, as well as their combinations, have been proposed to improve HGR performance, but fusing RGB and point clouds, which have complementary characteristics, has not been attempted. This paper therefore analyzes the synergistic effect of these two complementary modalities and proposes a new multi-modal fusion network that quantifies and fuses the influence between them. In addition, to address an inherent limitation of multi-modality, namely that the actual influence between the two modalities does not match the prediction, we propose self-labeling-based adaptive guidance. On the NVGesture dataset, the proposed HGR method achieves 2.46% higher performance than the state-of-the-art (SOTA) method.