1 code implementation • 12 Dec 2023 • Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu
Though diffusion-based video generation has witnessed rapid progress, the inference results of existing models still exhibit unsatisfactory temporal consistency and unnatural dynamics.
no code implementations • 1 Dec 2023 • Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu
In this paper, we study the task of video generation with image prompts, which provide more accurate and direct content control beyond the text prompts.
1 code implementation • 29 Nov 2023 • Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, LiMin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
We will open-source VBench, including all prompts, evaluation methods, generated videos, and human preference annotations, and also include more video generation models in VBench to drive forward the field of video generation.
no code implementations • 13 Nov 2023 • Yuming Jiang, Devasahayam Arokia Balaya Rex, Dina Schuster, Benjamin A. Neely, Germán L. Rosano, Norbert Volkmar, Amanda Momenzadeh, Trenton M. Peters-Clarke, Susan B. Egbert, Simion Kreimer, Emma H. Doud, Oliver M. Crook, Amit Kumar Yadav, Muralidharan Vanuopadath, Martín L. Mayta, Anna G. Duboff, Nicholas M. Riley, Robert L. Moritz, Jesse G. Meyer
We expect this work to serve as a basic resource for new practitioners in the field of shotgun or bottom-up proteomics.
1 code implementation • 30 Sep 2023 • Lin Liu, Xinxin Fan, Haoyang Liu, Chulong Zhang, Weibin Kong, Jingjing Dai, Yuming Jiang, Yaoqin Xie, Xiaokun Liang
Rigid pre-registration involving local-global matching is crucial for scenarios with large deformations.
2 code implementations • 26 Sep 2023 • Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.
Ranked #4 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
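The LaVie entry above outlines a cascaded design: a base text-to-video model, a temporal interpolation model, and a video super-resolution model. Below is a minimal sketch of how such a cascade could be wired together; the stage names and signatures are hypothetical illustrations, not LaVie's actual API.

```python
import torch

# Hypothetical stage interfaces for a cascaded T2V pipeline of the kind
# described above: base generation -> temporal interpolation -> super-resolution.
# None of these names come from the LaVie codebase; they only illustrate the flow.

def generate_video(prompt: str,
                   base_t2v,      # text -> short, low-res clip
                   interpolator,  # increases the frame rate
                   upsampler      # raises spatial resolution
                   ) -> torch.Tensor:
    # Stage 1: sample a short, low-resolution clip conditioned on the text.
    frames = base_t2v(prompt)        # (T, C, H, W)
    # Stage 2: insert intermediate frames for smoother motion.
    frames = interpolator(frames)    # (k*T, C, H, W)
    # Stage 3: upsample every frame.
    frames = upsampler(frames)       # (k*T, C, s*H, s*W)
    return frames
```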
1 code implementation • ICCV 2023 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Wayne Wu, Ziwei Liu
A holistic human dataset inevitably has insufficient and low-resolution information on local parts.
1 code implementation • 20 Sep 2023 • Chenyang Si, Ziqi Huang, Yuming Jiang, Ziwei Liu
In this paper, we uncover the untapped potential of diffusion U-Net, which serves as a "free lunch" that substantially improves the generation quality on the fly.
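The "free lunch" referred to above is a training-free re-weighting of U-Net features at inference time. The following is a hedged sketch of that idea, assuming the re-weighting is applied where a U-Net decoder receives backbone and skip features; the scale factors, channel handling, and exact placement differ in the official implementation.

```python
import torch
import torch.fft

# Sketch of a FreeU-style re-weighting: amplify backbone features (which carry
# the denoising semantics) and attenuate the low-frequency band of skip
# features in Fourier space. b and s are illustrative hyperparameters.

def freeu_reweight(backbone: torch.Tensor,
                   skip: torch.Tensor,
                   b: float = 1.2,
                   s: float = 0.9) -> tuple[torch.Tensor, torch.Tensor]:
    backbone = backbone * b

    # Spectral modulation of the skip features: after fftshift, low
    # frequencies sit in the center of the spectrum, so damp that region.
    freq = torch.fft.fftshift(torch.fft.fft2(skip), dim=(-2, -1))
    h, w = freq.shape[-2:]
    mask = torch.ones_like(freq.real)
    mask[..., h // 4 : 3 * h // 4, w // 4 : 3 * w // 4] = s
    freq = freq * mask
    skip = torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1))).real
    return backbone, skip
```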
1 code implementation • 5 Sep 2023 • Haonan Qiu, Zhaoxi Chen, Yuming Jiang, Hang Zhou, Xiangyu Fan, Lei Yang, Wayne Wu, Ziwei Liu
Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.
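The decomposition above separates reflectance from shading derived from facial normals. For illustration only, here is the textbook Lambertian factorization underlying that kind of relightable portrait model; it is not the paper's implicit, audio-driven formulation.

```python
import torch

# Textbook Lambertian decomposition: image = albedo * shading, where shading
# comes from per-pixel normals and a light direction. Shown only to make the
# reflectance/normal factorization in the abstract concrete.

def lambertian_render(albedo: torch.Tensor,    # (3, H, W) reflectance
                      normals: torch.Tensor,   # (3, H, W) unit normals
                      light: torch.Tensor      # (3,) light direction
                      ) -> torch.Tensor:
    light = light / light.norm()
    # Per-pixel diffuse shading: max(0, n . l).
    shading = (normals * light.view(3, 1, 1)).sum(dim=0).clamp(min=0.0)
    return albedo * shading.unsqueeze(0)       # relit image, (3, H, W)
```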
1 code implementation • ICCV 2023 • Yanyan Huang, Weiqin Zhao, Shujun Wang, Yu Fu, Yuming Jiang, Lequan Yu
In this paper, we propose the first continual learning framework for whole slide image (WSI) analysis, named ConSlide, to tackle the challenges of enormous image size, utilization of hierarchical structure, and catastrophic forgetting by progressive model updating on multiple sequential datasets.
1 code implementation • CVPR 2023 • Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu
In this work, we present Collaborative Diffusion, where pre-trained uni-modal diffusion models collaborate to achieve multi-modal face generation and editing without re-training.
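A minimal sketch of the collaboration idea described above: at each denoising step, several pre-trained uni-modal diffusion models predict noise for the same latent, and their predictions are blended with per-model influence weights. Collaborative Diffusion learns spatially varying weights with dynamic diffusers; in this sketch the weights are a plain tensor, and the model/condition interfaces are hypothetical.

```python
import torch

# Blend noise predictions from several uni-modal eps-predictors. A scheduler
# would then use the fused epsilon to compute x_{t-1}.

def collaborative_step(x_t: torch.Tensor,
                       t: torch.Tensor,
                       models: list,          # uni-modal eps-predictors
                       conditions: list,      # one condition per model
                       weights: torch.Tensor  # (M,), assumed to sum to 1
                       ) -> torch.Tensor:
    eps = torch.stack([m(x_t, t, c) for m, c in zip(models, conditions)])
    return (weights.view(-1, *[1] * (eps.dim() - 1)) * eps).sum(dim=0)
```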
1 code implementation • ICCV 2023 • Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.
2 code implementations • 23 Mar 2023 • Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu
Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior.
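To make the preposition prior concrete, here is a hedged sketch of a relation-steering contrastive loss: a learnable relation embedding is pulled toward preposition embeddings (positives) and pushed away from other word embeddings (negatives). The InfoNCE form, temperature, and sampling here are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

# Steer a learnable relation prompt toward the preposition subspace with an
# InfoNCE-style objective: each preposition is a positive, other words are
# negatives.

def steering_loss(relation: torch.Tensor,      # (D,) learnable relation prompt
                  prepositions: torch.Tensor,  # (P, D) positive basis
                  negatives: torch.Tensor,     # (N, D) non-preposition words
                  tau: float = 0.07) -> torch.Tensor:
    rel = F.normalize(relation, dim=0)
    pos = F.normalize(prepositions, dim=1) @ rel / tau   # (P,)
    neg = F.normalize(negatives, dim=1) @ rel / tau      # (N,)
    # Each positive competes against all negatives; the positive is column 0.
    logits = torch.cat([pos.unsqueeze(1), neg.expand(len(pos), -1)], dim=1)
    labels = torch.zeros(len(pos), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```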
1 code implementation • 19 Dec 2022 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu
To tackle these challenges, we propose C2-Matching in this work, which performs explicit, robust matching across transformation and resolution gaps.
1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu
Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.
no code implementations • 16 Aug 2022 • Chulong Zhang, Yuming Jiang, Na Li, Zhicheng Zhang, Md Tauhidul Islam, Jingjing Dai, Lin Liu, Wenfeng He, Wenjian Qin, Jing Xiong, Yaoqin Xie, Xiaokun Liang
Deformable image registration is a necessary technique for fusing multi-modal pathology slices.
2 code implementations • 31 May 2022 • Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present a text-driven controllable framework, Text2Human, for a high-quality and diverse human generation.
4 code implementations • 25 Apr 2022 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
1 code implementation • ICCV 2021 • Yuming Jiang, Ziqi Huang, Xingang Pan, Chen Change Loy, Ziwei Liu
In this work, we propose Talk-to-Edit, an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.
Ranked #1 on Fine-Grained Facial Editing on CelebA-Dialog
1 code implementation • CVPR 2021 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu
However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR).
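To make the matching problem above concrete, here is a minimal sketch of cross-image correspondence for reference-based super-resolution: features of the LR input and the HR reference are compared by normalized correlation, and each input position takes its best-matching reference position. C2-Matching additionally trains the features to be robust to scale/rotation (the transformation gap) and to the HR/LR domain shift (the resolution gap); this sketch shows only the matching step, with hypothetical feature inputs.

```python
import torch
import torch.nn.functional as F

# Hard correspondence by cosine similarity between per-pixel feature vectors
# of the LR input and the HR reference.

def match_features(feat_lr: torch.Tensor,   # (C, H, W) input features
                   feat_ref: torch.Tensor   # (C, H, W) reference features
                   ) -> torch.Tensor:       # (H*W,) best ref index per pixel
    q = F.normalize(feat_lr.flatten(1), dim=0)    # (C, H*W), unit columns
    k = F.normalize(feat_ref.flatten(1), dim=0)   # (C, H*W)
    corr = q.t() @ k                              # cosine similarity matrix
    return corr.argmax(dim=1)
```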