Search Results for author: Tiankai Hang

Found 6 papers, 6 papers with code

Simplified Diffusion Schrödinger Bridge

1 code implementation • 21 Mar 2024 • Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.

CCA: Collaborative Competitive Agents for Image Editing

1 code implementation • 23 Jan 2024 • Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages multiple Large Language Model (LLM)-based agents to execute complex tasks.

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

1 code implementation • 7 Sep 2023 • Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

Keypoint Detection

333

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations • ICCV 2023 • Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Ranked #1 on Image Generation on ImageNet 256x256

Denoising, Image Generation, +2

163

Language-Guided Face Animation by Recurrent StyleGAN-based Generator

1 code implementation • 11 Aug 2022 • Tiankai Hang, Huan Yang, Bei Liu, Jianlong Fu, Xin Geng, Baining Guo

Specifically, we propose a recurrent motion generator to extract a series of semantic and motion information from the language and feed it along with visual information to a pre-trained StyleGAN to generate high-quality frames.

Image Manipulation

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions

1 code implementation • CVPR 2022 • Hongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu, Huan Yang, Jianlong Fu, Baining Guo

To enable VL pre-training, we jointly optimize the HD-VILA model by a hybrid Transformer that learns rich spatiotemporal features, and a multimodal Transformer that enforces interactions of the learned video features with diversified texts.

Ranked #16 on Video Retrieval on MSR-VTT

Retrieval, Super-Resolution, +4

438
