Saliency-Driven Hierarchical Learned Image Coding for Machines

Kristian Fischer (Friedrich-Alexander-University Erlangen-Nürnberg); Fabian Brand (Friedrich-Alexander University Erlangen-Nürnberg (FAU)); Christian Blum (Friedrich-Alexander University Erlangen-Nürnberg (FAU)); Andre Kaup (Friedrich-Alexander-Universität Erlangen-Nürnberg)

09 Jun 2023

We propose a saliency-driven hierarchical neural image compression network for machine-to-machine communication under the compress-then-analyze paradigm. The network codes different areas of the image at different qualities, depending on whether salient objects are located in them. Areas without salient objects are transmitted in latent spaces of lower spatial resolution to reduce the bitrate. The saliency information is derived explicitly from the detections of an object detection network. Furthermore, we propose adding saliency information to the training process to further specialize the different latent spaces. With all proposed optimizations, our hierarchical model achieves 77.1% bitrate savings over the latest video coding standard, VVC, on the Cityscapes dataset with Mask R-CNN as the analysis network at the decoder side. It thereby also outperforms traditional, non-hierarchical compression networks.
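
To illustrate how object detections could be turned into a per-area saliency signal, here is a minimal sketch. It is not the authors' implementation: the function name `saliency_block_mask`, the box format `(x0, y0, x1, y1)` with exclusive lower-right corner, and the 64-pixel block size are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np


def saliency_block_mask(boxes, height, width, block=64):
    """Mark every block of the image that overlaps at least one detection box.

    boxes  : iterable of integer (x0, y0, x1, y1) pixel boxes, x1/y1 exclusive
             (hypothetical format, chosen for this sketch).
    block  : side length of one square area in pixels (assumed value).
    Returns a boolean array of shape (ceil(height/block), ceil(width/block)).
    """
    rows = -(-height // block)  # ceiling division
    cols = -(-width // block)
    mask = np.zeros((rows, cols), dtype=bool)

    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image so indices stay in range.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x1 <= x0 or y1 <= y0:
            continue  # box lies entirely outside the image

        # First and last block index touched by the box in each dimension.
        r0, r1 = y0 // block, (y1 - 1) // block
        c0, c1 = x0 // block, (x1 - 1) // block
        mask[r0:r1 + 1, c0:c1 + 1] = True

    return mask
```

In a scheme like the one described, blocks marked `True` would be candidates for a higher-resolution latent space, while the remaining blocks could be sent in a lower-resolution one to save bitrate. How the mask is actually used by the network is not stated in the abstract.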

Tags:

Image and video coding

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
