Powers of layers for image-to-image translation

13 Aug 2020 · Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc. We start from an image autoencoder architecture with fixed weights. For each task, we learn a residual block operating in the latent space, which is applied iteratively until the target domain is reached. A specific training schedule is required to alleviate the exponentiation effect of the iterations. At test time, this design offers several advantages: the number of weight parameters is limited, and the compositional design allows one to modulate the strength of the transformation with the number of iterations. This is useful, for instance, when the type or amount of noise to suppress is not known in advance. Experimentally, we provide proofs of concept showing the value of our method for many transformations. The performance of our model is comparable to or better than that of CycleGAN, with significantly fewer parameters.
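To make the mechanism concrete, here is a minimal PyTorch sketch of the idea described above: a pretrained autoencoder is frozen, and a single learned residual block is iterated in latent space, with the iteration count modulating the strength of the transformation. The `encoder`, `decoder`, `latent_channels`, and the block's internal layout are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PowersOfLayers(nn.Module):
    """Sketch: frozen autoencoder + one residual block iterated in latent space."""

    def __init__(self, encoder: nn.Module, decoder: nn.Module, latent_channels: int = 256):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        # Freeze the pretrained autoencoder; only the residual block is trained.
        for p in self.encoder.parameters():
            p.requires_grad = False
        for p in self.decoder.parameters():
            p.requires_grad = False
        # The single residual block operating in the latent space
        # (placeholder layout; the paper's block may differ).
        self.block = nn.Sequential(
            nn.Conv2d(latent_channels, latent_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(latent_channels, latent_channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor, n_iters: int) -> torch.Tensor:
        z = self.encoder(x)
        # Apply the same residual block n_iters times; more iterations
        # push the latent code further toward the target domain.
        for _ in range(n_iters):
            z = z + self.block(z)
        return self.decoder(z)
```

At inference, calling the model with a larger `n_iters` applies the transformation more strongly, which is how one could, for example, tune denoising strength when the amount of noise is unknown.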

Results from the Paper


Ranked #1 on Image-to-Image Translation on horse2zebra (Fréchet Inception Distance), as detailed in the table below.

Task                        Dataset        Model           Metric                      Value   Global Rank
Image-to-Image Translation  horse2zebra    PoL (CycleGAN)  Fréchet Inception Distance  53.0    #1
Image-to-Image Translation  horse2zebra    PoL (CycleGAN)  Number of params            15.9M   #2
Image-to-Image Translation  photo2vangogh  PoL (CycleGAN)  Fréchet Inception Distance  152.7   #2
Image-to-Image Translation  photo2vangogh  PoL (CycleGAN)  Number of params            15.9M   #2
Image-to-Image Translation  vangogh2photo  PoL (CycleGAN)  Fréchet Inception Distance  134.4   #1
Image-to-Image Translation  vangogh2photo  PoL (CycleGAN)  Number of params            15.9M   #2
Image-to-Image Translation  zebra2horse    PoL (CycleGAN)  Fréchet Inception Distance  112.3   #2
Image-to-Image Translation  zebra2horse    PoL (CycleGAN)  Number of params            15.9M   #2
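All quality numbers above are Fréchet Inception Distance (FID), where lower is better. As a rough illustration of how such a score can be computed, the sketch below uses the `FrechetInceptionDistance` metric from the torchmetrics library; the random image tensors are hypothetical stand-ins for real batches of target-domain and translated images.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# Placeholder batches; by default FID expects uint8 tensors of shape (N, 3, H, W).
real_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)
fake_images = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)

fid = FrechetInceptionDistance(feature=2048)
fid.update(real_images, real=True)   # accumulate statistics of the target domain
fid.update(fake_images, real=False)  # accumulate statistics of the translated images
print(float(fid.compute()))          # lower is better
```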
