Search Results for author: Richard Zhang

Found 51 papers, 37 papers with code

Customizing Text-to-Image Diffusion with Camera Viewpoint Control

no code implementations 18 Apr 2024 Nupur Kumari, Grace Su, Richard Zhang, Taesung Park, Eli Shechtman, Jun-Yan Zhu

Model customization introduces new concepts to existing text-to-image models, enabling the generation of the new concept in novel contexts.

Object, Prompt Engineering

Lazy Diffusion Transformer for Interactive Image Editing

no code implementations 18 Apr 2024 Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi

We demonstrate that our approach is competitive with state-of-the-art inpainting methods in terms of quality and fidelity while providing a 10x speedup for typical user interactions, where the editing mask represents 10% of the image.

VideoGigaGAN: Towards Detail-rich Video Super-Resolution

no code implementations 18 Apr 2024 Yiran Xu, Taesung Park, Richard Zhang, Yang Zhou, Eli Shechtman, Feng Liu, Jia-Bin Huang, Difan Liu

We introduce VideoGigaGAN, a new generative VSR model that can produce videos with high-frequency details and temporal consistency.

Video Super-Resolution

Jump Cut Smoothing for Talking Heads

no code implementations 9 Jan 2024 Xiaojuan Wang, Taesung Park, Yang Zhou, Eli Shechtman, Richard Zhang

We leverage the appearance of the subject from the other source frames in the video, fusing it with a mid-level representation driven by DensePose keypoints and face landmarks.

One-step Diffusion with Distribution Matching Distillation

no code implementations 30 Nov 2023 Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Fredo Durand, William T. Freeman, Taesung Park

We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on image quality.

Online Detection of AI-Generated Images

no code implementations 23 Oct 2023 David C. Epstein, Ishan Jain, Oliver Wang, Richard Zhang

With advancements in AI-generated images coming on a continuous basis, it is increasingly difficult to distinguish traditionally-sourced images (e.g., photos, artwork) from AI-generated ones.

Image Inpainting

Evaluating Data Attribution for Text-to-Image Models

1 code implementation ICCV 2023 Sheng-Yu Wang, Alexei A. Efros, Jun-Yan Zhu, Richard Zhang

The problem of data attribution in such models -- which of the images in the training set are most responsible for the appearance of a given generated image -- is a difficult yet important one.

Ablating Concepts in Text-to-Image Diffusion Models

1 code implementation ICCV 2023 Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu

To achieve this goal, we propose an efficient method of ablating concepts in the pretrained model, i.e., preventing the generation of a target concept.

Scaling up GANs for Text-to-Image Synthesis

1 code implementation CVPR 2023 Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park

From a technical standpoint, it also marked a drastic change in the favored architecture to design generative image models.

Text-to-Image Generation

3D-FM GAN: Towards 3D-Controllable Face Manipulation

no code implementations 24 Aug 2022 Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S. Y. Kung

While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality.

Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

1 code implementation CVPR 2022 Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu, Krishna Kumar Singh

We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2.

ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions

1 code implementation 24 May 2022 Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Zhang, Taesung Park, Evangelos Kalogerakis

We present ASSET, a neural architecture for automatically modifying an input high-resolution image according to a user's edits on its semantic segmentation map.

Semantic Segmentation, Vocal Bursts Intensity Prediction

BlobGAN: Spatially Disentangled Scene Representations

no code implementations 5 May 2022 Dave Epstein, Taesung Park, Richard Zhang, Eli Shechtman, Alexei A. Efros

Blobs are differentiably placed onto a feature grid that is decoded into an image by a generative adversarial network.

Generative Adversarial Network

Any-resolution Training for High-resolution Image Synthesis

1 code implementation 14 Apr 2022 Lucy Chai, Michael Gharbi, Eli Shechtman, Phillip Isola, Richard Zhang

To take advantage of varied-size data, we introduce continuous-scale training, a process that samples patches at random scales to train a new generator with variable output resolutions.

2k, Image Generation, +1
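
The continuous-scale sampling step described in this entry could look roughly like the sketch below. It is only an illustration: the function name `sample_patch`, the 256-pixel patch size, and the uniform distribution over scales are assumptions, not details taken from the paper.

```python
import random

def sample_patch(image_w, image_h, patch=256, rng=random):
    """Pick a random scale, then a random patch-sized crop at that scale.

    Hypothetical helper: the scale range and uniform sampling are assumed.
    """
    # Smallest scale shrinks the short side down to the patch size;
    # the largest scale (1.0) keeps the native resolution.
    min_scale = patch / min(image_w, image_h)
    scale = rng.uniform(min_scale, 1.0)
    # Size of the image after resizing by the sampled scale.
    scaled_w = max(patch, round(image_w * scale))
    scaled_h = max(patch, round(image_h * scale))
    # Top-left corner of a crop that fits inside the resized image.
    x = rng.randint(0, scaled_w - patch)
    y = rng.randint(0, scaled_h - patch)
    return scale, (x, y, patch, patch)

scale, box = sample_patch(1024, 768, rng=random.Random(0))
```

Each call returns the sampled scale together with a crop box, so a generator conditioned on scale could be trained on fixed-size patches drawn from images of different native sizes.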

Ensembling Off-the-shelf Models for GAN Training

1 code implementation CVPR 2022 Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Can the collective "knowledge" from a large bank of pretrained vision models be leveraged to improve GAN training?

Image Generation

GAN-Supervised Dense Visual Alignment

1 code implementation CVPR 2022 William Peebles, Jun-Yan Zhu, Richard Zhang, Antonio Torralba, Alexei A. Efros, Eli Shechtman

We propose GAN-Supervised Learning, a framework for learning discriminative models and their GAN-generated training data jointly end-to-end.

Data Augmentation, Dense Pixel Correspondence Estimation

Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization

no code implementations NeurIPS 2021 Jialun Zhang, Salar Fattahi, Richard Zhang

This over-parameterized regime of matrix factorization significantly slows down the convergence of local search algorithms, from a linear rate with $r=r^{\star}$ to a sublinear rate when $r>r^{\star}$.
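
For reference, the two convergence regimes named in this entry are usually distinguished as follows. The symbols $e_k$ (error after $k$ iterations), $C$, $\rho$, and $\alpha$ are notation assumed here, not taken from the listing, and the polynomial form is only a typical example of a sublinear rate.

$$
\text{linear: } e_k \le C\,\rho^{k},\quad 0<\rho<1
\qquad\qquad
\text{sublinear (typical): } e_k \le \frac{C}{k^{\alpha}},\quad \alpha>0
$$

Under a linear rate the number of iterations needed to reach error $\varepsilon$ grows like $\log(1/\varepsilon)$, whereas under the polynomial form it grows like $\varepsilon^{-1/\alpha}$.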

Contrastive Feature Loss for Image Prediction

1 code implementation 12 Nov 2021 Alex Andonian, Taesung Park, Bryan Russell, Phillip Isola, Jun-Yan Zhu, Richard Zhang

Training supervised image synthesis models requires a critic to compare two images: the ground truth to the result.

Image Generation

Editing Conditional Radiance Fields

1 code implementation ICCV 2021 Steven Liu, Xiuming Zhang, Zhoutong Zhang, Richard Zhang, Jun-Yan Zhu, Bryan Russell

In this paper, we explore enabling user editing of a category-level NeRF - also known as a conditional radiance field - trained on a shape category.

Novel View Synthesis

Ensembling with Deep Generative Views

1 code implementation CVPR 2021 Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang

Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.

Image Classification

On Aliased Resizing and Surprising Subtleties in GAN Evaluation

3 code implementations CVPR 2022 Gaurav Parmar, Richard Zhang, Jun-Yan Zhu

Furthermore, we show that if compression is used on real training images, FID can actually improve if the generated images are also subsequently compressed.

Image Generation

The Low-Rank Simplicity Bias in Deep Networks

1 code implementation 18 Mar 2021 Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

We show empirically that our claim holds true on finite width linear and non-linear models on practical learning paradigms and show that on natural data, these are often the solutions that generalize well.

Image Classification

Anycost GANs for Interactive Image Synthesis and Editing

1 code implementation CVPR 2021 Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.

Image Generation

CDPAM: Contrastive learning for perceptual audio similarity

1 code implementation 9 Feb 2021 Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein

The DPAM approach of Manocha et al. learns a full-reference metric trained directly on human judgments, and thus correlates well with human perception.

Contrastive Learning, Speech Enhancement, +1

How many samples is a good initial point worth in Low-rank Matrix Recovery?

no code implementations NeurIPS 2020 Jialun Zhang, Richard Zhang

Optimizing the threshold over regions of the landscape, we see that, for initial points not too close to the ground truth, a linear improvement in the quality of the initial guess amounts to a constant factor improvement in the sample complexity.

Contrastive Learning for Unpaired Image-to-Image Translation

10 code implementations 30 Jul 2020 Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu

Furthermore, we draw negatives from within the input image itself, rather than from the rest of the dataset.

Contrastive Learning, Image-to-Image Translation, +1
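
A minimal sketch of a contrastive loss whose negatives come from other patches of the same image, as this entry describes. The InfoNCE-style formulation, the function name `patch_nce_loss`, the toy 2-D feature vectors, and the temperature of 0.07 are all assumptions made for illustration; the listing does not specify them.

```python
import math

def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def patch_nce_loss(query, positive, negatives, temperature=0.07):
    """Cross-entropy of picking the positive among positive + negatives.

    query:     feature of an output patch
    positive:  feature of the input patch at the same location
    negatives: features of other patches from the SAME input image
    """
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability assigned to the positive (index 0).
    return log_sum - logits[0]

# Toy unit-norm features: the query matches the positive patch.
loss_matched = patch_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
# Here the query matches one of the negatives instead.
loss_mismatched = patch_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
```

The loss is low when the query is most similar to its corresponding patch and high when it resembles a different patch of the same image.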

Swapping Autoencoder for Deep Image Manipulation

4 code implementations NeurIPS 2020 Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei A. Efros, Richard Zhang

Deep generative models have become increasingly effective at producing realistic images from randomly sampled seeds, but using such models for controllable manipulation of existing images remains challenging.

Image Manipulation

Image Morphing with Perceptual Constraints and STN Alignment

1 code implementation 29 Apr 2020 Noa Fish, Richard Zhang, Lilach Perry, Daniel Cohen-Or, Eli Shechtman, Connelly Barnes

In image morphing, a sequence of plausible frames is synthesized and composited together to form a smooth transformation between given instances.

Image Morphing

CNN-generated images are surprisingly easy to spot... for now

4 code implementations CVPR 2020 Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, Alexei A. Efros

In this work we ask whether it is possible to create a "universal" detector for telling apart real images from those generated by a CNN, regardless of architecture or dataset used.

Data Augmentation, Image Generation, +1

Detecting Photoshopped Faces by Scripting Photoshop

2 code implementations ICCV 2019 Sheng-Yu Wang, Oliver Wang, Andrew Owens, Richard Zhang, Alexei A. Efros

Most malicious photo manipulations are created using standard image editing tools, such as Adobe Photoshop.

Image Manipulation Detection

Deep Parametric Shape Predictions using Distance Fields

1 code implementation CVPR 2020 Dmitriy Smirnov, Matthew Fisher, Vladimir G. Kim, Richard Zhang, Justin Solomon

Many tasks in graphics and vision demand machinery for converting shapes into consistent representations with sparse sets of parameters; these representations facilitate rendering, editing, and storage.

Stochastic Adversarial Video Prediction

4 code implementations ICLR 2019 Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.

Ranked #1 on Video Prediction on KTH (Cond metric)

Representation Learning, Video Generation, +1

Self-Supervised Learning of Object Motion Through Adversarial Video Prediction

no code implementations ICLR 2018 Alex X. Lee, Frederik Ebert, Richard Zhang, Chelsea Finn, Pieter Abbeel, Sergey Levine

In this paper, we study the problem of multi-step video prediction, where the goal is to predict a sequence of future frames conditioned on a short context.

Object, Self-Supervised Learning, +1

Colorful Image Colorization

39 code implementations 28 Mar 2016 Richard Zhang, Phillip Isola, Alexei A. Efros

We embrace the underlying uncertainty of the problem by posing it as a classification task and use class-rebalancing at training time to increase the diversity of colors in the result.

Colorization, Image Colorization, +1
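
Class rebalancing of the kind mentioned in this entry can be sketched as per-class loss weights that grow as a class becomes rarer. The listing does not give the weighting formula, so the smoothed inverse-frequency scheme, the function name `rebalancing_weights`, the mixing factor `lam=0.5`, and the toy class frequencies below are assumptions for illustration.

```python
def rebalancing_weights(class_probs, lam=0.5):
    """Return per-class loss weights that up-weight rare classes.

    class_probs: empirical frequency of each color class (sums to 1)
    lam:         how much uniform distribution to mix in (assumed value)
    """
    q = len(class_probs)
    # Mix the empirical distribution with a uniform one, then invert,
    # so rare classes get large weights without blowing up to infinity.
    raw = [1.0 / ((1.0 - lam) * p + lam / q) for p in class_probs]
    # Normalize so the expected weight under class_probs equals 1.
    expected = sum(p * w for p, w in zip(class_probs, raw))
    return [w / expected for w in raw]

# Toy example: one common (desaturated) class and two rarer ones.
probs = [0.7, 0.2, 0.1]
weights = rebalancing_weights(probs)
```

Multiplying each pixel's classification loss by the weight of its ground-truth class keeps the average loss scale unchanged while giving rare colors more influence.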
