Multi-rate adaptive transform coding for video compression

Lyndon Duong (New York University); Bohan Li (Google LLC); Cheng Chen (Google Inc.); Jingning Han (Google Inc.)

06 Jun 2023

Contemporary lossy image and video coding standards rely on transform coding, the process through which pixels are mapped to an alternative representation to facilitate efficient data compression. Despite the impressive performance of end-to-end optimized compression with deep neural networks, the high computational and space demands of these models have prevented them from superseding the relatively simple transform coding found in conventional video codecs. In this study, we propose learned transforms and entropy coding that may serve either as (non)linear drop-in replacements for, or as enhancements to, the linear transforms in existing codecs. These transforms can be multi-rate, allowing a single model to operate along the entire rate-distortion curve. To demonstrate the utility of our framework, we augmented the DCT with learned quantization matrices and adaptive entropy coding to compress intra-frame AV1 block prediction residuals. We report substantial BD-rate and perceptual quality improvements over more complex nonlinear transforms at a fraction of the computational cost.
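As a rough illustration of the pipeline the abstract describes (a DCT followed by a quantization matrix, with one setup reused at several rates), here is a minimal sketch in plain Python. It is not the authors' implementation. The quantization matrix below is a hand-picked placeholder rather than a learned one, `rate_scale` is a hypothetical knob standing in for whatever rate conditioning the paper uses, the residual block is random data, and entropy coding is omitted entirely.

```python
import math
import random


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def encode_block(block, qmatrix, rate_scale):
    """2-D DCT, then uniform quantization with per-coefficient step sizes."""
    d = dct_matrix(len(block))
    coeffs = matmul(matmul(d, block), transpose(d))
    return [[round(c / (q * rate_scale)) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qmatrix)]


def decode_block(symbols, qmatrix, rate_scale):
    """Dequantize the integer symbols, then apply the inverse 2-D DCT."""
    d = dct_matrix(len(symbols))
    coeffs = [[s * q * rate_scale for s, q in zip(srow, qrow)]
              for srow, qrow in zip(symbols, qmatrix)]
    return matmul(matmul(transpose(d), coeffs), d)


def mse(a, b):
    """Mean squared error between two equally sized blocks."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


# Assumption: an 8x8 random block stands in for an intra prediction residual.
random.seed(0)
residual = [[random.gauss(0.0, 10.0) for _ in range(8)] for _ in range(8)]

# Assumption: placeholder step sizes that grow with frequency (not learned).
qmatrix = [[1.0 + 0.5 * (u + v) for v in range(8)] for u in range(8)]

# One transform and one matrix, reused at several operating points.
for rate_scale in (0.5, 2.0, 8.0):
    symbols = encode_block(residual, qmatrix, rate_scale)
    recon = decode_block(symbols, qmatrix, rate_scale)
    nonzero = sum(1 for row in symbols for s in row if s != 0)
    print(f"rate_scale={rate_scale}: nonzero={nonzero}, "
          f"mse={mse(residual, recon):.3f}")
```

Larger values of `rate_scale` coarsen every quantization step at once, so fewer coefficients survive as nonzero symbols. In a learned variant the fixed `qmatrix` would presumably be replaced by trained parameters, and the integer symbols would be passed to an entropy coder.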

Tags:

Machine learning for image processing

Presented at IEEE ICASSP 2023, 4-10 June 2023, Greece (virtual and in-person conference).
