Time-Frequency Awareness Network for Human Mesh Recovery from Videos
Boyang Zhang (Ningxia University); Suping Wu (Ningxia University); Meining Jia (Ningxia University)
SPS
This paper addresses 3D human mesh recovery from videos. Most recent works model human spatio-temporal information in the time domain, but these methods often rely on short-range spatio-temporal receptive fields and information transfer. As a result, they cannot adaptively capture effective long-range spatio-temporal dependencies, and they lack the ability to perceive small-scale local motion. In this work, we propose a Time-Frequency Awareness Network for human mesh recovery, a novel paradigm that learns human feature representations by introducing the frequency domain. First, we design a time-frequency-aware attention module that uses frequency-domain information as a guide to model temporal and spatial long-range dependencies in a unified manner. Second, we develop a time-frequency-aware recurrent module that treats moving humans as discrete signals over time in the frequency domain to capture the spatio-temporal information accumulated by human movement in videos. In addition, we design a local awareness loss that constrains small-scale human motion, which helps mitigate the interference of global motion on the prediction results. Extensive experiments on large publicly available datasets show advantages over most state-of-the-art methods. The code will be made publicly available.
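To illustrate what treating per-frame features as discrete signals over time can look like, the following is a minimal NumPy sketch. It is not the network described above. The function name `split_frequency_bands`, the `(T, D)` feature layout, and the `cutoff` parameter are assumptions made only for illustration. The sketch applies a real-valued discrete Fourier transform along the time axis and separates slow, global trends (low frequencies) from fast, small-scale variation (high frequencies).

```python
import numpy as np


def split_frequency_bands(features, cutoff):
    """Split a (T, D) feature sequence into low- and high-frequency parts.

    Hypothetical helper for illustration only: `features` holds one
    D-dimensional vector per frame, and `cutoff` is the number of lowest
    frequency bins kept in the low band.
    """
    num_frames = features.shape[0]
    # Real FFT along the time axis; result has shape (T // 2 + 1, D).
    spectrum = np.fft.rfft(features, axis=0)

    # Keep only the lowest `cutoff` bins for the low band.
    low_spec = spectrum.copy()
    low_spec[cutoff:] = 0
    # The high band is whatever remains.
    high_spec = spectrum - low_spec

    # Transform both bands back to the time domain.
    low = np.fft.irfft(low_spec, n=num_frames, axis=0)
    high = np.fft.irfft(high_spec, n=num_frames, axis=0)
    return low, high


# Synthetic stand-in for per-frame features: 16 frames, 8 channels.
rng = np.random.default_rng(0)
features = rng.standard_normal((16, 8))
low, high = split_frequency_bands(features, cutoff=3)
```

Because the transform is linear, `low + high` reconstructs the input sequence up to floating-point error. With `cutoff=1`, the low band reduces to the per-channel temporal mean.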