Search Results for author: Richard Zhang

Found 51 papers, 37 papers with code

Aligning and Projecting Images to Class-conditional Generative Networks

no code implementations • ECCV 2020 • Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann

We present a method for projecting an input image into the space of a class-conditional generative neural network.

Generative Adversarial Network Translation

Customizing Text-to-Image Diffusion with Camera Viewpoint Control

no code implementations • 18 Apr 2024 • Nupur Kumari, Grace Su, Richard Zhang, Taesung Park, Eli Shechtman, Jun-Yan Zhu

Model customization introduces new concepts to existing text-to-image models, enabling the generation of the new concept in novel contexts.

Object Prompt Engineering

Lazy Diffusion Transformer for Interactive Image Editing

no code implementations • 18 Apr 2024 • Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi

We demonstrate that our approach is competitive with state-of-the-art inpainting methods in terms of quality and fidelity while providing a 10x speedup for typical user interactions, where the editing mask represents 10% of the image.

VideoGigaGAN: Towards Detail-rich Video Super-Resolution

no code implementations • 18 Apr 2024 • Yiran Xu, Taesung Park, Richard Zhang, Yang Zhou, Eli Shechtman, Feng Liu, Jia-Bin Huang, Difan Liu

We introduce VideoGigaGAN, a new generative VSR model that can produce videos with high-frequency details and temporal consistency.

Video Super-Resolution

Jump Cut Smoothing for Talking Heads

no code implementations • 9 Jan 2024 • Xiaojuan Wang, Taesung Park, Yang Zhou, Eli Shechtman, Richard Zhang

We leverage the appearance of the subject from the other source frames in the video, fusing it with a mid-level representation driven by DensePose keypoints and face landmarks.

Customizing Motion in Text-to-Video Diffusion Models

no code implementations • 7 Dec 2023 • Joanna Materzynska, Josef Sivic, Eli Shechtman, Antonio Torralba, Richard Zhang, Bryan Russell

To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos.

Text-to-Video Generation Video Generation

One-step Diffusion with Distribution Matching Distillation

no code implementations • 30 Nov 2023 • Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Fredo Durand, William T. Freeman, Taesung Park

We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator with minimal impact on image quality.

Online Detection of AI-Generated Images

no code implementations • 23 Oct 2023 • David C. Epstein, Ishan Jain, Oliver Wang, Richard Zhang

With advancements in AI-generated images coming on a continuous basis, it is increasingly difficult to distinguish traditionally sourced images (e.g., photos, artwork) from AI-generated ones.

Image Inpainting

DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data

1 code implementation • NeurIPS 2023 • Stephanie Fu, Netanel Tamir, Shobhita Sundaram, Lucy Chai, Richard Zhang, Tali Dekel, Phillip Isola

Furthermore, our metric outperforms both prior learned metrics and recent large vision models on these tasks.

299

Evaluating Data Attribution for Text-to-Image Models

1 code implementation • ICCV 2023 • Sheng-Yu Wang, Alexei A. Efros, Jun-Yan Zhu, Richard Zhang

The problem of data attribution in such models -- which of the images in the training set are most responsible for the appearance of a given generated image -- is a difficult yet important one.

Ablating Concepts in Text-to-Image Diffusion Models

1 code implementation • ICCV 2023 • Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu

To achieve this goal, we propose an efficient method of ablating concepts in the pretrained model, i.e., preventing the generation of a target concept.

134

Scaling up GANs for Text-to-Image Synthesis

1 code implementation • CVPR 2023 • Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park

From a technical standpoint, it also marked a drastic change in the favored architecture to design generative image models.

Ranked #18 on Image Generation on ImageNet 256x256

Text-to-Image Generation

1,581

Zero-shot Image-to-Image Translation

2 code implementations • 6 Feb 2023 • Gaurav Parmar, Krishna Kumar Singh, Richard Zhang, Yijun Li, Jingwan Lu, Jun-Yan Zhu

However, it is still challenging to directly apply these models for editing real images for two reasons.

Ranked #13 on Text-based Image Editing on PIE-Bench

Image-to-Image Translation Text-based Image Editing +1

996

Domain Expansion of Image Generators

1 code implementation • CVPR 2023 • Yotam Nitzan, Michaël Gharbi, Richard Zhang, Taesung Park, Jun-Yan Zhu, Daniel Cohen-Or, Eli Shechtman

First, we note the generator contains a meaningful, pretrained latent space.

27,836

Multi-Concept Customization of Text-to-Image Diffusion

2 code implementations • CVPR 2023 • Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Can we teach a model to quickly acquire a new concept, given a few examples?

1,777

3D-FM GAN: Towards 3D-Controllable Face Manipulation

no code implementations • 24 Aug 2022 • Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S. Y. Kung

While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality.

Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

1 code implementation • CVPR 2022 • Gaurav Parmar, Yijun Li, Jingwan Lu, Richard Zhang, Jun-Yan Zhu, Krishna Kumar Singh

We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2.

168

ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions

1 code implementation • 24 May 2022 • Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Zhang, Taesung Park, Evangelos Kalogerakis

We present ASSET, a neural architecture for automatically modifying an input high-resolution image according to a user's edits on its semantic segmentation map.

Semantic Segmentation Vocal Bursts Intensity Prediction

112

BlobGAN: Spatially Disentangled Scene Representations

no code implementations • 5 May 2022 • Dave Epstein, Taesung Park, Richard Zhang, Eli Shechtman, Alexei A. Efros

Blobs are differentiably placed onto a feature grid that is decoded into an image by a generative adversarial network.

Generative Adversarial Network

Any-resolution Training for High-resolution Image Synthesis

1 code implementation • 14 Apr 2022 • Lucy Chai, Michaël Gharbi, Eli Shechtman, Phillip Isola, Richard Zhang

To take advantage of varied-size data, we introduce continuous-scale training, a process that samples patches at random scales to train a new generator with variable output resolutions.

2k Image Generation +1

238

Ensembling Off-the-shelf Models for GAN Training

1 code implementation • CVPR 2022 • Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Can the collective "knowledge" from a large bank of pretrained vision models be leveraged to improve GAN training?

Ranked #1 on Image Generation on AFHQ Cat

Image Generation

373

GAN-Supervised Dense Visual Alignment

1 code implementation • CVPR 2022 • William Peebles, Jun-Yan Zhu, Richard Zhang, Antonio Torralba, Alexei A. Efros, Eli Shechtman

We propose GAN-Supervised Learning, a framework for learning discriminative models and their GAN-generated training data jointly end-to-end.

Data Augmentation Dense Pixel Correspondence Estimation

1,008

Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization

no code implementations • NeurIPS 2021 • Jialun Zhang, Salar Fattahi, Richard Zhang

This over-parameterized regime of matrix factorization significantly slows down the convergence of local search algorithms, from a linear rate with $r=r^{\star}$ to a sublinear rate when $r>r^{\star}$.

Contrastive Feature Loss for Image Prediction

1 code implementation • 12 Nov 2021 • Alex Andonian, Taesung Park, Bryan Russell, Phillip Isola, Jun-Yan Zhu, Richard Zhang

Training supervised image synthesis models requires a critic to compare two images: the ground truth and the result.

Image Generation

Editing Conditional Radiance Fields

1 code implementation • ICCV 2021 • Steven Liu, Xiuming Zhang, Zhoutong Zhang, Richard Zhang, Jun-Yan Zhu, Bryan Russell

In this paper, we explore enabling user editing of a category-level NeRF, also known as a conditional radiance field, trained on a shape category.

Ranked #1 on Novel View Synthesis on PhotoShape

Novel View Synthesis

253

Ensembling with Deep Generative Views

1 code implementation • CVPR 2021 • Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang

Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.

Image Classification

On Aliased Resizing and Surprising Subtleties in GAN Evaluation

3 code implementations • CVPR 2022 • Gaurav Parmar, Richard Zhang, Jun-Yan Zhu

Furthermore, we show that if compression is used on real training images, FID can actually improve if the generated images are also subsequently compressed.

Image Generation

3,417

Few-shot Image Generation via Cross-domain Correspondence

2 code implementations • CVPR 2021 • Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zhang

Training generative models, such as GANs, on a target domain containing limited examples (e.g., 10) can easily result in overfitting.

Ranked #3 on 10-shot image generation on Babies

10-shot image generation Image Generation

285

The Low-Rank Simplicity Bias in Deep Networks

1 code implementation • 18 Mar 2021 • Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

We show empirically that our claim holds true for finite-width linear and non-linear models under practical learning paradigms, and that on natural data these are often the solutions that generalize well.

Image Classification

Anycost GANs for Interactive Image Synthesis and Editing

1 code implementation • CVPR 2021 • Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.

Image Generation

769

CDPAM: Contrastive learning for perceptual audio similarity

1 code implementation • 9 Feb 2021 • Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein

The DPAM approach of Manocha et al. learns a full-reference metric trained directly on human judgments, and thus correlates well with human perception.

Contrastive Learning Speech Enhancement +1

342

Spatially-Adaptive Pixelwise Networks for Fast Image Translation

1 code implementation • CVPR 2021 • Tamar Rott Shaham, Michaël Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation.

Image-to-Image Translation Inductive Bias +1

119

Few-shot Image Generation with Elastic Weight Consolidation

no code implementations • NeurIPS 2020 • Yijun Li, Richard Zhang, Jingwan Lu, Eli Shechtman

Few-shot image generation seeks to generate more data of a given domain, with only a few available training examples.

Ranked #4 on 10-shot image generation on Babies

10-shot image generation Image Generation

How many samples is a good initial point worth in Low-rank Matrix Recovery?

no code implementations • NeurIPS 2020 • Jialun Zhang, Richard Zhang

Optimizing the threshold over regions of the landscape, we see that, for initial points not too close to the ground truth, a linear improvement in the quality of the initial guess amounts to a constant factor improvement in the sample complexity.

Contrastive Learning for Unpaired Image-to-Image Translation

10 code implementations • 30 Jul 2020 • Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu

Furthermore, we draw negatives from within the input image itself, rather than from the rest of the dataset.

Contrastive Learning Image-to-Image Translation +1

2,102

Swapping Autoencoder for Deep Image Manipulation

4 code implementations • NeurIPS 2020 • Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei A. Efros, Richard Zhang

Deep generative models have become increasingly effective at producing realistic images from randomly sampled seeds, but using such models for controllable manipulation of existing images remains challenging.

Image Manipulation

506

Transforming and Projecting Images into Class-conditional Generative Networks

2 code implementations • 4 May 2020 • Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann

We present a method for projecting an input image into the space of a class-conditional generative neural network.

Generative Adversarial Network Translation

193

Image Morphing with Perceptual Constraints and STN Alignment

1 code implementation • 29 Apr 2020 • Noa Fish, Richard Zhang, Lilach Perry, Daniel Cohen-Or, Eli Shechtman, Connelly Barnes

In image morphing, a sequence of plausible frames is synthesized and composited together to form a smooth transformation between given instances.

Image Morphing

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

1 code implementation • 13 Jan 2020 • Pranay Manocha, Adam Finkelstein, Zeyu Jin, Nicholas J. Bryan, Richard Zhang, Gautham J. Mysore

Assessment of many audio processing tasks relies on subjective evaluation, which is time-consuming and expensive.

Denoising Speech Enhancement

342

CNN-generated images are surprisingly easy to spot... for now

4 code implementations • CVPR 2020 • Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, Alexei A. Efros

In this work we ask whether it is possible to create a "universal" detector for telling apart real images from those generated by a CNN, regardless of architecture or dataset used.

Data Augmentation Image Generation +1

768

Interactive Sketch & Fill: Multiclass Sketch-to-Image Translation

1 code implementation • ICCV 2019 • Arnab Ghosh, Richard Zhang, Puneet K. Dokania, Oliver Wang, Alexei A. Efros, Philip H. S. Torr, Eli Shechtman

We propose an interactive GAN-based sketch-to-image translation method that helps novice users create images of simple objects.

Object Sketch-to-Image Translation +1

192

Detecting Photoshopped Faces by Scripting Photoshop

2 code implementations • ICCV 2019 • Sheng-Yu Wang, Oliver Wang, Andrew Owens, Richard Zhang, Alexei A. Efros

Most malicious photo manipulations are created using standard image editing tools, such as Adobe Photoshop.

Image Manipulation Detection

1,563

Making Convolutional Networks Shift-Invariant Again

7 code implementations • 25 Apr 2019 • Richard Zhang

The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling.

Ranked #26 on Domain Generalization on VizWiz-Classification

Classification Consistency Conditional Image Generation +1

29,735

Deep Parametric Shape Predictions using Distance Fields

1 code implementation • CVPR 2020 • Dmitriy Smirnov, Matthew Fisher, Vladimir G. Kim, Richard Zhang, Justin Solomon

Many tasks in graphics and vision demand machinery for converting shapes into consistent representations with sparse sets of parameters; these representations facilitate rendering, editing, and storage.

Stochastic Adversarial Video Prediction

4 code implementations • ICLR 2019 • Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.

Ranked #1 on Video Prediction on KTH (Cond metric)

Representation Learning Video Generation +1

300

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

24 code implementations • CVPR 2018 • Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, Oliver Wang

We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics.

Ranked #19 on Video Quality Assessment on MSU FR VQA Database

Image Quality Assessment SSIM +1

3,369

Self-Supervised Learning of Object Motion Through Adversarial Video Prediction

no code implementations • ICLR 2018 • Alex X. Lee, Frederik Ebert, Richard Zhang, Chelsea Finn, Pieter Abbeel, Sergey Levine

In this paper, we study the problem of multi-step video prediction, where the goal is to predict a sequence of future frames conditioned on a short context.

Object Self-Supervised Learning +1

Toward Multimodal Image-to-Image Translation

6 code implementations • NeurIPS 2017 • Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, Eli Shechtman

Our proposed method encourages bijective consistency between the latent encoding and output modes.

Ranked #2 on Multimodal Unsupervised Image-To-Image Translation on Edge-to-Shoes

Image-to-Image Translation Translation

15,701

Real-Time User-Guided Image Colorization with Learned Deep Priors

3 code implementations • 8 May 2017 • Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S. Lin, Tianhe Yu, Alexei A. Efros

The system directly maps a grayscale image, along with sparse, local user "hints" to an output colorization with a Convolutional Neural Network (CNN).

Ranked #2 on Point-interactive Image Colorization on Oxford 102 Flowers

Image Colorization Point-interactive Image Colorization

2,670

Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction

2 code implementations • CVPR 2017 • Richard Zhang, Phillip Isola, Alexei A. Efros

We propose split-brain autoencoders, a straightforward modification of the traditional autoencoder architecture, for unsupervised representation learning.

Ranked #127 on Self-Supervised Image Classification on ImageNet

Representation Learning Self-Supervised Image Classification +1

138

Colorful Image Colorization

39 code implementations • 28 Mar 2016 • Richard Zhang, Phillip Isola, Alexei A. Efros

We embrace the underlying uncertainty of the problem by posing it as a classification task and use class-rebalancing at training time to increase the diversity of colors in the result.

Ranked #128 on Self-Supervised Image Classification on ImageNet

Colorization Image Colorization +1

3,279
