LEARNED VIDEO CODING WITH MOTION COMPENSATION MIXTURE MODEL
Khanh Quoc Dinh (Samsung Research); Kwang Pyo Choi (Samsung Electronics)
Learned video coding employs explicit motion compensation (MC) with neural networks to predict the original frame from its reference frame and to compress the residual between the original and predicted frames, where the neural networks are optimized under a rate-distortion trade-off. However, accurate predictions are difficult to obtain, or may not exist at all, due to fast motion, disocclusions, and coding errors in the reference frame. To prevent false edges and details, introduced into the predicted frame by inaccurate optical flow, from propagating into the residual, we propose a dynamic mixture of explicit and implicit motion compensation, where implicitness means that the encoding and decoding of the original frame are conditioned on the predicted frame in the pixel and latent domains, respectively. The proposed mixture model saves up to 30% bitrate over the baseline and achieves state-of-the-art performance.
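Below is a minimal sketch, not the authors' implementation, of the idea the abstract describes: explicit MC warps the reference frame with an estimated optical flow, implicit MC conditions the frame codec on that predicted frame (in the pixel domain at the encoder and in the latent domain at the decoder), and a learned per-pixel weight mixes the two reconstructions. All module names, layer choices, and the mixing mechanism are assumptions for illustration only; quantization and entropy coding are reduced to a rounding placeholder.

```python
# Hypothetical sketch of a mixture of explicit and implicit motion compensation.
# Architecture details are assumptions; they are not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp(frame, flow):
    """Backward-warp `frame` (N,3,H,W) with optical `flow` (N,2,H,W) via grid_sample."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device, dtype=frame.dtype),
        torch.arange(w, device=frame.device, dtype=frame.dtype),
        indexing="ij",
    )
    grid_x = (xs + flow[:, 0]) / (w - 1) * 2 - 1  # normalize to [-1, 1]
    grid_y = (ys + flow[:, 1]) / (h - 1) * 2 - 1
    grid = torch.stack((grid_x, grid_y), dim=-1)   # (N,H,W,2), (x,y) order
    return F.grid_sample(frame, grid, align_corners=True)


class MCMixture(nn.Module):
    """Assumed dynamic mixture of explicit warping and conditional (implicit) coding."""

    def __init__(self, ch=64):
        super().__init__()
        # Encoder conditioned on the predicted frame in the pixel domain (concatenation).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2),
        )
        # Decoder conditioned on the predicted frame in the latent domain.
        self.pred_to_latent = nn.Sequential(
            nn.Conv2d(3, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * ch, ch, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 5, stride=2, padding=2, output_padding=1),
        )
        # Learned per-pixel weight that mixes the explicit and implicit reconstructions.
        self.mix = nn.Sequential(
            nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, frame, reference, flow):
        predicted = warp(reference, flow)                         # explicit MC
        latent = self.encoder(torch.cat([frame, predicted], 1))   # pixel-domain conditioning
        latent_q = torch.round(latent)                            # stand-in for quantization/entropy coding
        cond = self.pred_to_latent(predicted)                     # latent-domain conditioning
        implicit = self.decoder(torch.cat([latent_q, cond], 1))   # implicit reconstruction
        alpha = self.mix(torch.cat([predicted, implicit], 1))     # dynamic per-pixel mixture weight
        return alpha * predicted + (1 - alpha) * implicit
```

In this sketch the mixture weight lets the model fall back on the implicit, conditionally coded reconstruction wherever the warped prediction is unreliable (fast motion, disocclusions), which is the failure mode the abstract targets; how the actual model forms and applies the mixture is not specified here.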