LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION
Ziyi Chen, Shang-Hong Lai, Akihiro Sugimoto
Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is widely used in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, uses the self-attention mechanism to perform 2D-to-3D lifting and produces high-quality predictions in the qualitative experiments. The quantitative results of the cross-dataset experiment demonstrate that the resulting model achieves superior generalization accuracy on a publicly available dataset.
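To make the interpolation-based augmentation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each 3D pose is stored as a (num_joints, 3) array of joint coordinates and simply blends two keyframe poses linearly to obtain intermediate frames; the helper name interpolate_poses and the 17-joint layout are illustrative assumptions, and the actual method may instead interpolate joint rotations along the kinematic chain, as is standard in skeletal animation.

```python
# Minimal sketch (illustrative, not the paper's code): synthesize intermediate
# 3D poses by linearly interpolating between two "keyframe" skeletons.
# Assumption: each pose is a (num_joints, 3) array in a shared coordinate frame.
import numpy as np

def interpolate_poses(pose_a: np.ndarray, pose_b: np.ndarray, num_steps: int) -> np.ndarray:
    """Return a (num_steps, num_joints, 3) sequence blending pose_a into pose_b.

    Hypothetical helper for illustration only; the paper's augmentation may
    operate on joint rotations rather than raw joint positions.
    """
    # Interpolation weights strictly between 0 and 1 (endpoints excluded),
    # so only newly synthesized in-between frames are returned.
    ts = np.linspace(0.0, 1.0, num_steps + 2)[1:-1]
    return np.stack([(1.0 - t) * pose_a + t * pose_b for t in ts])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy 17-joint poses standing in for real motion-capture keyframes.
    pose_a = rng.normal(size=(17, 3))
    pose_b = rng.normal(size=(17, 3))
    synthetic_sequence = interpolate_poses(pose_a, pose_b, num_steps=8)
    print(synthetic_sequence.shape)  # (8, 17, 3)
```

The synthesized sequences can then be projected to 2D and paired with their 3D counterparts to enlarge the training set for the 2D-to-3D lifting network.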