A LIGHTWEIGHT NETWORK MODEL FOR VIDEO FRAME INTERPOLATION USING SPATIAL PYRAMIDS
Jiankai Zhuang, Zengchang Qin, Jialu Chen, Tao Wan
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 08:55
In recent years, deep learning-based video frame interpolation methods have shown impressive results in handling occlusion, blur, and large motion. However, these models are usually very large, so they can hardly be deployed on devices with limited computing power, e.g., mobile phones or other portable devices. To address this problem, we propose the lightweight Spatial Pyramid Frame Interpolation Network (SPFIN), a hierarchical network that reconstructs frames in a coarse-to-fine manner. At each pyramid level, we apply two light sub-networks to model the optical flow and the visibility mask, instead of the commonly used U-Net architecture. The flow and mask are up-sampled and refined progressively from level to level. Finally, the intermediate frame is formed by linearly blending the warped frames with the masks. Experimental results on two benchmarks show that our model has the smallest size while achieving better or comparable performance compared to existing state-of-the-art models.
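The two non-learned steps named in the abstract, up-sampling a flow field between pyramid levels and linearly blending two warped frames with a visibility mask, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `upsample_flow_2x` and `blend`, the nearest-neighbour up-sampling, the factor of 2 between levels, and the convention that a mask value of 1 selects the first warped frame are all assumptions made for illustration. The learned sub-networks and the warping itself are not reproduced here.

```python
# Hypothetical sketch; names and conventions are assumptions, not from the paper.

def upsample_flow_2x(flow):
    """Nearest-neighbour 2x up-sampling of a flow field.

    `flow` is a list of rows, each row a list of (dx, dy) tuples.
    Displacements are doubled because they are measured in pixels and the
    finer level has twice as many pixels along each axis.
    """
    out = []
    for row in flow:
        wide = []
        for dx, dy in row:
            # Repeat each scaled vector twice horizontally.
            wide.extend([(2.0 * dx, 2.0 * dy)] * 2)
        # Repeat each widened row twice vertically.
        out.append(wide)
        out.append(list(wide))
    return out


def blend(warped0, warped1, mask):
    """Per-pixel linear blend: mask * warped0 + (1 - mask) * warped1.

    All three arguments are equally sized lists of rows of floats.
    A mask value of 1.0 is assumed to mean "visible in the first frame".
    """
    return [
        [m * a + (1.0 - m) * b for a, b, m in zip(r0, r1, rm)]
        for r0, r1, rm in zip(warped0, warped1, mask)
    ]


if __name__ == "__main__":
    coarse_flow = [[(1.0, -0.5)]]
    print(upsample_flow_2x(coarse_flow))

    warped0 = [[0.0, 1.0], [1.0, 0.0]]
    warped1 = [[1.0, 0.0], [0.0, 1.0]]
    mask = [[1.0, 0.5], [0.0, 0.25]]
    print(blend(warped0, warped1, mask))
```

In a coarse-to-fine scheme of this kind, the up-sampled flow and mask from one level would serve as the starting estimate that the next, finer level refines. Only the final level's warped frames and mask would be passed to the blend.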