Video Inpainting
43 papers with code • 4 benchmarks • 12 datasets
The goal of Video Inpainting is to fill in missing regions of a given video sequence with contents that are both spatially and temporally coherent. Video Inpainting, also known as video completion, has many real-world applications such as undesired object removal and video restoration.
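As a concrete reference, the sketch below shows the usual I/O contract in plain NumPy: a model synthesizes content for the masked pixels, and a compositing step keeps the known pixels untouched. The shapes, helper name, and mask convention are illustrative assumptions, not tied to any particular method listed below.

```python
# Minimal sketch of the video-inpainting I/O contract (NumPy).
# Shapes and the mask convention are illustrative assumptions.
import numpy as np

def composite(frames: np.ndarray, masks: np.ndarray, predicted: np.ndarray) -> np.ndarray:
    """Fill only the missing regions, keeping known pixels untouched.

    frames:    (T, H, W, 3) observed video, floats in [0, 1]
    masks:     (T, H, W, 1) floats, 1 where content is missing
    predicted: (T, H, W, 3) a model's output for the full frames
    """
    return masks * predicted + (1.0 - masks) * frames
```

Temporal coherence means the filled content must agree across the T frames, not just look plausible frame by frame; that is what distinguishes the task from per-frame image inpainting.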
Latest papers
Deficiency-Aware Masked Transformer for Video Inpainting
First, we pretrain an image inpainting model, DMT_img, to serve as a prior for distilling the video model DMT_vid, thereby benefiting hallucination in deficiency cases.
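A hedged sketch of the kind of image-to-video prior distillation this snippet describes, in PyTorch. The loss terms, the weighting, and the way the teacher is queried are illustrative simplifications; the paper's actual objective for DMT_img and DMT_vid may differ.

```python
# Illustrative distillation step: a frozen image-inpainting teacher
# guides a video-inpainting student. `dmt_img` / `dmt_vid` stand in
# for the paper's models; this is not the paper's exact objective.
import torch
import torch.nn.functional as F

def distill_step(dmt_img, dmt_vid, frames, masks, lam=0.1):
    with torch.no_grad():                       # teacher provides a per-frame prior
        teacher_out = dmt_img(frames, masks)
    student_out = dmt_vid(frames, masks)        # student sees the whole clip
    recon = F.l1_loss(student_out, frames)      # supervised on synthetic masks
    distill = F.l1_loss(student_out, teacher_out)
    return recon + lam * distill
```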
HNeRV: A Hybrid Neural Representation for Videos
Such an embedding largely limits the regression capacity and internal generalization for video interpolation.
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
Transformers have been widely used for video processing owing to the multi-head self-attention (MHSA) mechanism.
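For reference, a minimal PyTorch implementation of the MHSA mechanism this line refers to, written from the standard formulation; the layer sizes are illustrative, and tokens here would be spatio-temporal patches of the video.

```python
# Standard multi-head self-attention over (B, N, C) token sequences.
import torch
import torch.nn as nn

class MHSA(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)   # joint query/key/value projection
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale   # pairwise token affinities
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)
```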
CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting
Video inpainting aims at filling in missing regions of a video.
Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer
In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety.
INR-V: A Continuous Representation Space for Video-based Generative Tasks
In this work, we evaluate the space learned by INR-V on diverse generative tasks such as video interpolation, novel video generation, video inversion, and video inpainting against the existing baselines.
Scalable Neural Video Representations with Learnable Positional Features
Succinct representation of complex signals using coordinate-based neural representations (CNRs) has seen great progress, and several recent efforts focus on extending them for handling videos.
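A minimal sketch of what a coordinate-based neural representation of a video looks like, assuming PyTorch: an MLP regresses RGB from a normalized (x, y, t) coordinate. The plain ReLU MLP and its sizes are illustrative; practical CNRs add positional encodings and other machinery on top of this basic form.

```python
# Illustrative coordinate-based neural representation (CNR) of a video:
# fit the network so that mlp(x, y, t) -> RGB reproduces the signal.
import torch
import torch.nn as nn

class VideoCNR(nn.Module):
    def __init__(self, hidden: int = 256, layers: int = 4):
        super().__init__()
        dims = [3] + [hidden] * layers + [3]   # input: (x, y, t); output: RGB
        blocks = []
        for i in range(len(dims) - 1):
            blocks.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                blocks.append(nn.ReLU())
        self.mlp = nn.Sequential(*blocks)

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        # coords: (N, 3) in [-1, 1]; returns (N, 3) RGB predictions
        return self.mlp(coords)
```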
Flow-Guided Transformer for Video Inpainting
In particular, in the spatial transformer, we design a dual-perspective spatial MHSA that integrates global tokens into the window-based attention.
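A hedged sketch of the general idea of window attention augmented with global tokens: queries inside each window attend to their local window plus a shared set of global tokens. This single-head version without learned projections is an illustrative simplification, not the paper's exact dual-perspective design.

```python
# Window attention where keys/values include a shared set of global
# tokens; single-head and projection-free for clarity (illustrative).
import torch

def windowed_attention_with_global(x, global_tokens, window: int):
    """x: (B, N, C) with N divisible by `window`; global_tokens: (B, G, C)."""
    B, N, C = x.shape
    scale = C ** -0.5
    xw = x.view(B, N // window, window, C)                 # split tokens into windows
    g = global_tokens.unsqueeze(1).expand(-1, N // window, -1, -1)
    kv = torch.cat([xw, g], dim=2)                         # local + global keys/values
    attn = (xw @ kv.transpose(-2, -1)) * scale             # window queries see both views
    attn = attn.softmax(dim=-1)
    return (attn @ kv).view(B, N, C)
```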
Fine-Grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications
Egocentric videos offer fine-grained information for high-fidelity modeling of human behaviors.
Towards Unified Keyframe Propagation Models
We evaluate our two-stream approach on inpainting tasks, where experiments show that it improves both the propagation of features within a single frame, as required for image inpainting, and their propagation from keyframes to target frames.
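A hedged sketch of one common building block behind keyframe-to-target propagation: backward-warping keyframe features into a target frame with optical flow. The paper's two-stream design goes beyond this, so treat the snippet as background, not the method itself.

```python
# Flow-based feature propagation: warp keyframe features to a target
# frame via grid_sample (a standard building block, illustrative here).
import torch
import torch.nn.functional as F

def warp_features(key_feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """key_feat: (B, C, H, W) keyframe features;
    flow: (B, 2, H, W) target->keyframe optical flow in pixels."""
    B, _, H, W = key_feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, device=key_feat.device, dtype=key_feat.dtype),
        torch.arange(W, device=key_feat.device, dtype=key_feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow   # sampling positions
    grid_x = 2.0 * grid[:, 0] / (W - 1) - 1.0                 # normalize to [-1, 1]
    grid_y = 2.0 * grid[:, 1] / (H - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)              # (B, H, W, 2)
    return F.grid_sample(key_feat, grid, align_corners=True)
```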