Variable Rate Allocation for Vector-Quantized Autoencoders

Federico Baldassarre (KTH - Royal Institute of Technology); Alaaeldin M El-Nouby (Facebook AI Research); Herve Jegou (Facebook AI Research)

07 Jun 2023

Vector-quantized autoencoders have recently gained interest in image compression, generation and self-supervised learning. However, as a neural compression method, they cannot allocate a variable number of bits to each image location, for example according to the semantic content or local saliency. In this paper, we address this limitation in a simple yet effective way. We adopt a product quantizer (PQ) that produces a set of discrete codes for each image patch rather than a single index. This PQ-autoencoder is trained end-to-end with a structured dropout that selectively masks a variable number of codes at each location. These mechanisms force the decoder to reconstruct the original image from partial information and allow us to control the local rate. The resulting model can compress images across a wide range of operating points on the rate-distortion curve and can be paired with any external saliency estimation method to control the compression rate at a local level. We demonstrate the effectiveness of our approach on the popular Kodak and ImageNet datasets by measuring both distortion and perceptual quality metrics.
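The two mechanisms named in the abstract, product quantization and structured dropout of codes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the codebook layout (M sub-codebooks of K entries each), the choice to keep the leading codes at each location, and the zero fill value for masked codes are all assumptions made for illustration.

```python
import numpy as np

def product_quantize(z, codebooks):
    """Quantize each latent vector with M independent sub-codebooks.

    z:         (N, M*d) latent vectors, one per image location (assumed layout)
    codebooks: (M, K, d) M sub-codebooks with K entries of dimension d
    Returns integer codes (N, M) and quantized sub-vectors (N, M, d).
    """
    M, K, d = codebooks.shape
    sub = z.reshape(z.shape[0], M, d)
    # Squared distance from every sub-vector to every entry of its codebook.
    dists = ((sub[:, :, None, :] - codebooks[None]) ** 2).sum(axis=-1)  # (N, M, K)
    codes = dists.argmin(axis=-1)                                       # (N, M)
    quantized = codebooks[np.arange(M)[None, :], codes]                 # (N, M, d)
    return codes, quantized

def structured_dropout(quantized, keep, fill=0.0):
    """Keep only the first keep[i] codes at location i (assumed masking scheme).

    Masked sub-vectors are replaced by `fill`, a placeholder assumption.
    """
    N, M, d = quantized.shape
    mask = np.arange(M)[None, :] < keep[:, None]                        # (N, M)
    return np.where(mask[:, :, None], quantized, fill), mask

def bits_per_location(keep, K):
    """Rate in bits at each location: one log2(K)-bit index per kept code."""
    return keep * np.log2(K)

# Toy usage with hypothetical sizes: 6 locations, M=4 codes each, K=8 entries.
rng = np.random.default_rng(0)
codebooks = rng.normal(size=(4, 8, 2))
latents = rng.normal(size=(6, 8))
codes, quantized = product_quantize(latents, codebooks)

# A hypothetical saliency map in [0, 1] decides how many codes each location keeps.
saliency = np.array([0.0, 0.2, 0.5, 0.7, 1.0, 1.0])
keep = np.rint(saliency * codebooks.shape[0]).astype(int)
masked, mask = structured_dropout(quantized, keep)
rate = bits_per_location(keep, codebooks.shape[1])
```

Under these assumptions, a location that keeps k of its M codes costs k * log2(K) bits, so an external saliency map that sets `keep` directly controls the local rate, while the decoder (not shown) must reconstruct from whatever codes remain.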

Tags:

Deep learning techniques

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
