no code implementations • 1 Mar 2024 • Mengqi Huang, Zhendong Mao, Mingcong Liu, Qian He, Yongdong Zhang
However, the inherent entangled influence scope of pseudo-words with the given text results in a dual-optimum paradox, i.e., the similarity of the given subjects and the controllability of the given text could not be optimal simultaneously.
no code implementations • 22 Feb 2024 • Hao Li, Mengqi Huang, Lei Zhang, Bo Hu, Yi Liu, Zhendong Mao
GAN-based image attribute editing first leverages GAN inversion to project real images into the latent space of the GAN and then manipulates the corresponding latent codes.
no code implementations • 1 Jul 2023 • Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao
While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centric images, an intractable problem is how to preserve the face identity for conditioned face images.
1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook.
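As a rough illustration of the first stage described above (encoding with a learned codebook), the sketch below maps feature vectors to the indices of their nearest codebook entries and then looks them back up. It is a minimal, generic nearest-neighbour quantizer, not the method of this paper: the function names `quantize` and `dequantize`, the toy codebook, and the toy features are all assumptions made for the example, and the second (autoregressive) stage is not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Replace each feature vector with the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def dequantize(codes: Sequence[int], codebook: Sequence[Vector]) -> List[List[float]]:
    """Look up the codebook vector for each discrete code."""
    return [list(codebook[k]) for k in codes]


# Hypothetical toy data: a 3-entry codebook and three 2-D "image features".
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.4]]

codes = quantize(features, codebook)          # discrete codes for the image
reconstruction = dequantize(codes, codebook)  # quantized features fed to a decoder
```

In a two-stage pipeline of this kind, a sequence model would then be trained on sequences like `codes` rather than on raw pixels.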
no code implementations • 3 Sep 2022 • Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang
Text-to-image generation aims to generate realistic images that are semantically consistent with the given text.