MAT: Mask-Aware Transformer for Large Hole Image Inpainting

CVPR 2022  ·  Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia

Recent studies have shown the importance of modeling long-range interactions in the inpainting problem. To achieve this goal, existing approaches exploit either standalone attention techniques or transformers, but usually at low resolution owing to the computational cost. In this paper, we present a novel transformer-based model for large hole inpainting, which unifies the merits of transformers and convolutions to efficiently process high-resolution images. We carefully design each component of our framework to guarantee the high fidelity and diversity of recovered images. Specifically, we customize an inpainting-oriented transformer block, where the attention module aggregates non-local information only from the valid tokens indicated by a dynamic mask. Extensive experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets. Code is released at https://github.com/fenglinglwb/MAT.
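
For intuition, below is a minimal PyTorch-style sketch of the mask-aware attention idea described in the abstract: attention weights are computed only over tokens marked valid by a binary mask, so hole tokens contribute nothing to the aggregation. The class name `MaskAwareAttention`, the shapes, and the fixed (non-updating) mask are illustrative assumptions; the released MAT additionally updates the mask dynamically across blocks, which this sketch omits.

```python
import torch
import torch.nn as nn

class MaskAwareAttention(nn.Module):
    """Toy multi-head self-attention that aggregates information only from
    valid tokens (hypothetical sketch, not the authors' implementation)."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, mask):
        # x: (B, N, C) token features; mask: (B, N), 1 = valid token, 0 = hole
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each: (B, heads, N, C // heads)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, heads, N, N)
        # Only valid tokens may serve as keys: hole positions get -inf before softmax
        key_mask = mask[:, None, None, :]              # (B, 1, 1, N)
        attn = attn.masked_fill(key_mask == 0, float('-inf'))
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Usage: 64 tokens of width 128; the last half are hole tokens
x = torch.randn(2, 64, 128)
mask = torch.ones(2, 64)
mask[:, 32:] = 0
print(MaskAwareAttention(128)(x, mask).shape)  # torch.Size([2, 64, 128])
```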

Datasets

CelebA-HQ, Places2

Results from the Paper


Task | Dataset | Model | Metric | Value | Global Rank
--- | --- | --- | --- | --- | ---
Image Inpainting | CelebA-HQ | MAT | FID | 4.86 | #1
Image Inpainting | CelebA-HQ | MAT | P-IDS | 13.83 | #1
Image Inpainting | CelebA-HQ | MAT | U-IDS | 25.33 | #1
Image Inpainting | Places2 | MAT | FID | 1.96 | #2
Image Inpainting | Places2 | MAT | P-IDS | 23.42 | #1
Image Inpainting | Places2 | MAT | U-IDS | 38.34 | #1
