Improving Augmentation and Evaluation Schemes for Semantic Image Synthesis

25 Nov 2020  ·  Prateek Katiyar, Anna Khoreva

Despite data augmentation being a de facto technique for boosting the performance of deep neural networks, little attention has been paid to developing augmentation strategies for generative adversarial networks (GANs). To this end, we introduce a novel augmentation scheme designed specifically for GAN-based semantic image synthesis models. We propose to randomly warp object shapes in the semantic label maps used as input to the generator. The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to better learn the structural and geometric details of the scene and thus to improve the quality of the generated images. While benchmarking the augmented GAN models against their vanilla counterparts, we discover that the quantitative metrics reported in previous semantic image synthesis studies are strongly biased towards specific semantic classes, as they are derived via an external pre-trained segmentation network. We therefore propose to improve the established semantic image synthesis evaluation scheme by separately analyzing the performance of generated images on the classes that are biased and unbiased for the given segmentation network. Finally, we show strong quantitative and qualitative improvements obtained with our augmentation scheme, on both class splits, using state-of-the-art semantic image synthesis models across three different datasets. On average across the COCO-Stuff, ADE20K and Cityscapes datasets, the augmented models outperform their vanilla counterparts by ~3 mIoU and ~10 FID points.
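The core augmentation idea, randomly warping object shapes in the input label maps, can be sketched as an elastic deformation applied with nearest-neighbour sampling so that class IDs stay discrete. This is a minimal illustrative sketch, not the authors' exact implementation; the hyper-parameters `alpha` and `sigma` are assumptions, not values from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def warp_label_map(label_map, alpha=8.0, sigma=4.0, rng=None):
    """Randomly warp object shapes in a 2-D semantic label map.

    A smoothed random displacement field (an elastic deformation) is
    applied with nearest-neighbour resampling (order=0), so the output
    remains a valid label map with the same discrete class IDs.
    `alpha` scales the displacement magnitude and `sigma` controls the
    smoothness of the warp; both are illustrative defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = label_map.shape
    # Smooth per-pixel random displacements for rows (dy) and columns (dx).
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + dy, xs + dx])
    # order=0 -> nearest neighbour, keeping label IDs discrete.
    return map_coordinates(label_map, coords, order=0, mode="nearest")
```

During training, the warped map would be fed to the generator while the discriminator still sees the original image, producing the local shape discrepancies the paper exploits.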


Results from the Paper


All results are on the Image-to-Image Translation task; metric values and global ranks are as reported on the respective benchmarks.

Dataset                    Model          Metric    Value  Global Rank
ADE20K Labels-to-Photos    Pix2PixHD-AUG  Accuracy  77.9%  # 5
                                          FID       41.5   # 12
ADE20K Labels-to-Photos    CC-FPSE-AUG    mIoU      44     # 5
                                          Accuracy  83%    # 2
                                          FID       32.6   # 7
Cityscapes Labels-to-Photo Pix2PixHD-AUG  mIoU      58     # 10
                                          FID       72.7   # 13
                                          Accuracy  92.7   # 2
Cityscapes Labels-to-Photo CC-FPSE-AUG    mIoU      63.1   # 7
                                          FID       52.1   # 6
                                          Accuracy  93.5   # 1
COCO-Stuff Labels-to-Photos Pix2PixHD-AUG mIoU      21.9   # 6
                                          Accuracy  54.1   # 4
                                          FID       54.2   # 12
COCO-Stuff Labels-to-Photos CC-FPSE-AUG   mIoU      42.1   # 2
                                          Accuracy  71.5   # 1
                                          FID       19.1   # 6
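The evaluation scheme proposed in the abstract, reporting segmentation-based metrics separately on the biased and unbiased classes of the external segmentation network, can be sketched as a simple split of the per-class IoU scores. How the biased class set is identified is dataset- and network-specific; the function below is an illustrative sketch that assumes that set is already known, not the paper's exact procedure.

```python
import numpy as np

def split_miou(per_class_iou, biased_classes):
    """Report mIoU separately on biased vs. unbiased classes.

    `per_class_iou` is a sequence mapping class index -> IoU of the
    generated images, as measured by an external pre-trained
    segmentation network. `biased_classes` is the set of class indices
    on which that network is considered biased (assumed given here).
    """
    iou = np.asarray(per_class_iou, dtype=float)
    biased = np.zeros(len(iou), dtype=bool)
    biased[list(biased_classes)] = True
    return {
        "mIoU_biased": float(iou[biased].mean()) if biased.any() else float("nan"),
        "mIoU_unbiased": float(iou[~biased].mean()) if (~biased).any() else float("nan"),
    }
```

Reporting the two means side by side exposes cases where a model's headline mIoU is inflated by classes the segmentation network itself favours.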

Methods


No methods listed for this paper.