Search Results for author: Zhaochen Yu

Found 3 papers, 3 papers with code

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing

1 code implementation • 26 Feb 2024 • Ling Yang, Zhilong Zhang, Zhaochen Yu, Jingwei Liu, Minkai Xu, Stefano Ermon, Bin Cui

To address this issue, we propose a novel and general contextualized diffusion model (ContextDiff) by incorporating the cross-modal context encompassing interactions and alignments between text condition and visual sample into forward and reverse processes.

Text-to-Image Generation Text-to-Video Editing +1

Paper
Code

RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models

2 code implementations • 20 Feb 2024 • Xinchen Zhang, Ling Yang, Yaqi Cai, Zhaochen Yu, Jiake Xie, Ye Tian, Minkai Xu, Yong Tang, Yujiu Yang, Bin Cui

In this paper, we propose a new training-free and transferred-friendly text-to-image generation framework, namely RealCompo, which aims to leverage the advantages of text-to-image and layout-to-image models to enhance both realism and compositionality of the generated images.

Denoising Text-to-Image Generation

Paper
Code

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

1 code implementation • 22 Jan 2024 • Ling Yang, Zhaochen Yu, Chenlin Meng, Minkai Xu, Stefano Ermon, Bin Cui

In this paper, we propose a brand new training-free text-to-image generation/editing framework, namely Recaption, Plan and Generate (RPG), harnessing the powerful chain-of-thought reasoning ability of multimodal LLMs to enhance the compositionality of text-to-image diffusion models.

Diffusion Personalization Tuning Free Large Language Model

1,434

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.