Isolating the objects and semantics in an image can be useful for several processing tasks, such as compression.
However, this usually requires complex retraining and disentanglement of the learned image representation.
In this paper, we instead study the effect of simple operations, namely additions and subtractions, in the latent space of the powerful foundation model CLIP.
We show that these simple operations in the CLIP latent space make it possible to remove or add objects or concepts in complex images.
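As a rough illustration of what such latent-space arithmetic might look like, the sketch below adds or subtracts a scaled secondary embedding from an input embedding and renormalizes the result. The edit rule, the function names (`edit_embedding`, `gradual_edit`), the strength schedule, and the random vectors standing in for CLIP image embeddings are all assumptions made for this example; the text above does not specify the exact formula or how edited embeddings are turned back into images.

```python
import numpy as np


def l2_normalize(v):
    """Scale a vector to unit length, as is common before comparing embeddings."""
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return v / norm


def edit_embedding(z_input, z_secondary, alpha):
    """Hypothetical edit rule: normalize(z_input + alpha * z_secondary).

    alpha > 0 adds the secondary concept, alpha < 0 removes it.
    """
    return l2_normalize(l2_normalize(z_input) + alpha * l2_normalize(z_secondary))


def gradual_edit(z_input, z_secondary, max_strength, steps):
    """Return edited embeddings for strengths swept linearly from 0 to max_strength."""
    alphas = np.linspace(0.0, max_strength, steps)
    return [edit_embedding(z_input, z_secondary, a) for a in alphas]


# Random stand-ins for real embeddings (assumption: in practice these would
# come from a CLIP image encoder applied to the input and secondary images).
rng = np.random.default_rng(0)
dim = 512  # arbitrary dimension for the toy example
z_scene = rng.normal(size=dim)
z_object = rng.normal(size=dim)
z_input = l2_normalize(z_scene + 0.5 * z_object)  # "scene containing the object"
z_secondary = l2_normalize(z_object)              # "the object alone"

# Gradually remove the secondary concept (negative strengths).
removed = gradual_edit(z_input, z_secondary, max_strength=-0.5, steps=5)
similarities = [float(z @ z_secondary) for z in removed]
for step, sim in enumerate(similarities):
    print(f"step {step}: cosine similarity to secondary = {sim:.3f}")
```

Under this rule, the cosine similarity between the edited embedding and the secondary embedding falls as the subtraction strength grows. Passing a positive `max_strength` gives the corresponding gradual addition.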
(Right) Input image and secondary image. (Left) Gradually removing the secondary image from the input (left to right, top to bottom).
(Right) Input image and secondary image. (Left) Gradually adding the secondary image to the input (left to right, top to bottom).