Meta-Learned Initialization for 3D Human Recovery
Mira Kim, Youngjo Min, Jiwon Kim, Seungryong Kim
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:07:56
Single-view 3D human pose estimation (HPE) based on Graph Convolutional Networks (GCNs) currently suffers from insufficient spatial feature representation, difficulty fusing different kinds of information, and depth ambiguity in the 2D-to-3D pose mapping. This paper proposes a framework for monocular 3D human pose learning based on a spatio-temporal attention graph. First, we build a spatial graph feature acquisition scheme that obtains strongly representative spatial semantic features of the 3D human pose by constructing a global-to-local attention graph in a coarse-to-fine manner. Then, we capture contextual information from temporally related images in the sequence as attention factors, evaluate their influence on the target image, and integrate the spatial and temporal features to mitigate the depth ambiguity problem. Extensive experiments on two challenging benchmark datasets (Human3.6M and HumanEva-I) show that our method improves the accuracy of 3D HPE and outperforms state-of-the-art methods.
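The abstract does not give the form of the temporal attention, so the following is only a minimal sketch of the general idea of weighting neighboring frames by their relevance to a target frame. It assumes scaled dot-product attention over per-frame feature vectors, which is a common choice and not necessarily the authors' formulation. The function names `softmax` and `fuse_temporal_context` and the toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_temporal_context(frame_features, target_index):
    """Fuse per-frame features into one vector for the target frame.

    Assumption: each frame's influence on the target is a scaled
    dot-product similarity passed through a softmax. This illustrates
    temporal attention in general, not the method from the paper.
    """
    target = frame_features[target_index]
    dim = len(target)
    scale = math.sqrt(dim)

    # Attention factor of every frame in the sequence w.r.t. the target frame.
    scores = [
        sum(t * f for t, f in zip(target, feature)) / scale
        for feature in frame_features
    ]
    weights = softmax(scores)

    # Weighted sum of frame features: the temporally fused representation.
    fused = [
        sum(w * feature[d] for w, feature in zip(weights, frame_features))
        for d in range(dim)
    ]
    return fused, weights


# Toy sequence of three frames with 2-dimensional features (made-up values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fused, weights = fuse_temporal_context(features, target_index=0)
```

In this toy sequence the frames most similar to the target receive the largest weights, so dissimilar frames contribute less to the fused feature.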