Authors: Tom Bordin and Thomas Maugey
Abstract:
We propose a framework for image compression in which the fidelity criterion is replaced by semantic preservation objectives. Encoding the image thus becomes a simple extraction of semantic enabling to reach drastic compression ratio. The decoding side is handled by a generative model relying on the diffusion process for the reconstruction of images. We propose to describe the semantic using low resolution segmentation maps as guide. We further improve the generation introducing colors map guidance without retraining the generative decoder. We show that it is possible to produce images of high visual quality with preserved semantic at a bitrate competitive with classical codecs.
Contributions:
- framework for image compression with a semantic representation
- image generation guided with colors on a trained Latent Diffusion Model
- visually competitive results with VVC at extremely low bitrates
Supplementary results: