1 code implementation • 4 Dec 2023 • Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat
In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).
Ranked #4 on Image Generation on ImageNet 256x256
no code implementations • ICCV 2023 • Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji
Despite tremendous progress in generating high-quality images using diffusion models, synthesizing a sequence of animated frames that are both photorealistic and temporally coherent is still in its infancy.
Ranked #8 on Text-to-Video Generation on UCF-101
no code implementations • NeurIPS 2021 • Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
It is therefore interesting to study how these two tasks can be coupled to benefit each other.
no code implementations • CVPR 2021 • Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro
We describe a cycle consistency loss that encourages model textures to be aligned, so as to encourage sharing.
3 code implementations • ICCV 2021 • Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar
We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision.
1 code implementation • ICCV 2021 • Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry Davis, Mario Fritz
Lastly, we study different attention architectures in the discriminator, and propose a reference attention mechanism.
no code implementations • NeurIPS 2020 • Morteza Mardani, Guilin Liu, Aysegul Dundar, Shiqiu Liu, Andrew Tao, Bryan Catanzaro
Conventional CNNs, recently adopted for synthesis, must be trained and tested on the same set of images and fail to generalize to unseen images.
no code implementations • 14 Jul 2020 • Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A. Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro
Specifically, we directly treat the whole encoded feature map of the input texture as transposed convolution filters and the features' self-similarity map, which captures the auto-correlation information, as input to the transposed convolution.
no code implementations • CVPR 2020 • Aysegul Dundar, Karan Sapra, Guilin Liu, Andrew Tao, Bryan Catanzaro
Conditional image synthesis for generating photorealistic images serves various applications, from content editing to content generation.
6 code implementations • NeurIPS 2019 • Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro
To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging few example images of the target at test time.
Ranked #1 on Video-to-Video Synthesis on YouTube Dancing
1 code implementation • ICCV 2019 • Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
We further introduce a pseudo supervised loss term that enforces the interpolated frames to be consistent with predictions of a pre-trained interpolation model.
Ranked #1 on Video Frame Interpolation on UCF101 (PSNR (sRGB) metric)
no code implementations • ICCV 2019 • Soumyadip Sengupta, Jinwei Gu, Kihwan Kim, Guilin Liu, David W. Jacobs, Jan Kautz
Inverse rendering aims to estimate physical attributes of a scene, e.g., reflectance, geometry, and lighting, from image(s).
4 code implementations • 28 Nov 2018 • Guilin Liu, Kevin J. Shih, Ting-Chun Wang, Fitsum A. Reda, Karan Sapra, Zhiding Yu, Andrew Tao, Bryan Catanzaro
In this paper, we present a simple yet effective padding scheme that can be used as a drop-in module for existing convolutional neural networks.
2 code implementations • 2 Nov 2018 • Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro
We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows.
1 code implementation • ECCV 2018 • Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro
We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows.
Ranked #1 on Video Prediction on YouTube-8M
11 code implementations • NeurIPS 2018 • Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.
60 code implementations • ECCV 2018 • Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels as well as the substitute values in the masked holes (typically the mean value).
no code implementations • ICCV 2017 • Guilin Liu, Duygu Ceylan, Ersin Yumer, Jimei Yang, Jyh-Ming Lien
We propose an end-to-end network architecture that replicates the forward image formation process to accomplish this task.
no code implementations • 20 Apr 2016 • Guilin Liu, Chao Yang, Zimo Li, Duygu Ceylan, Qi-Xing Huang
Due to the abundance of 2D product images from the Internet, developing efficient and scalable algorithms to recover the missing depth information is central to many applications.
no code implementations • CVPR 2015 • Guilin Liu, Yotam Gingold, Jyh-Ming Lien
We say that a point q on the mesh is continuously visible from another point p if there exists a geodesic path connecting p and q that is entirely visible by p. In order to efficiently estimate the continuous visibility for all the vertices in a model, we propose two approaches that use specific CVF properties to avoid exhaustive visibility tests.
no code implementations • CVPR 2014 • Guilin Liu, Zhonghua Xi, Jyh-Ming Lien
In this paper, we propose a new decomposition method, called Dual-space Decomposition that handles complex 2D shapes by recognizing the importance of holes and classifying holes as either topological noise or structurally important features.