no code implementations • 12 Nov 2023 • Tingfeng Cao, Chengyu Wang, Bingyan Liu, Ziheng Wu, Jinhui Zhu, Jun Huang
Then, to ensure that our generated prompts can generate more beautiful images, we further propose a Reinforcement Learning with Visual AI Feedback technique to fine-tune our model to maximize the reward values of the generated prompts, where the reward values are calculated based on the PickScore and the Aesthetic Scores.
no code implementations • 9 Oct 2023 • Weifeng Lin, Ziheng Wu, Jiayu Chen, Wentao Yang, Mingxin Huang, Jun Huang, Lianwen Jin
Fine-tuning pre-trained Vision Transformers (ViT) has consistently demonstrated promising performance in the realm of visual recognition.
2 code implementations • 7 Oct 2023 • Ziheng Wu, Jiaqi Xu, Xinyi Zou, Kunzhe Huang, Xing Shi, Jun Huang
By training a digital doppelganger of a specific user ID on 5 to 20 relevant images, the fine-tuned model (based on the trained LoRA model) can generate AI photos using arbitrary templates.
no code implementations • 21 Sep 2023 • Zhenzhen Chu, Jiayu Chen, Cen Chen, Chengyu Wang, Ziheng Wu, Jun Huang, Weining Qian
Position-aware global tokens also contain the position information of the image, which makes our model better for vision tasks.
1 code implementation • 28 Aug 2023 • Yang Liu, Cheng Yu, Lei Shang, Yongyi He, Ziheng Wu, Xingjun Wang, Chao Xu, Haoyu Xie, Weida Wang, Yuze Zhao, Lin Zhu, Chen Cheng, Weitao Chen, Yuan YAO, Wenmeng Zhou, Jiaqi Xu, Qiang Wang, Yingda Chen, Xuansong Xie, Baigui Sun
In this paper, we present FaceChain, a personalized portrait generation framework that combines a series of customized image-generation models and a rich set of face-related perceptual understanding models (e.g., face detection, deep face embedding extraction, and facial attribute recognition), to tackle the aforementioned challenges and to generate truthful personalized portraits, with only a handful of portrait images as input.
no code implementations • 7 Aug 2023 • Zhongjie Duan, Lizhou You, Chengyu Wang, Cen Chen, Ziheng Wu, Weining Qian, Jun Huang
In recent years, diffusion models have emerged as the most powerful approach in image synthesis.
1 code implementation • ICCV 2023 • Weifeng Lin, Ziheng Wu, Jiayu Chen, Jun Huang, Lianwen Jin
Specifically, SMT with 11.5M / 2.4GFLOPs and 32M / 7.7GFLOPs can achieve 82.2% and 84.3% top-1 accuracy on ImageNet-1K, respectively.
no code implementations • 4 Apr 2023 • Xinyao Shu, ShiYang Yan, Xu Yang, Ziheng Wu, Zhongfeng Chen, Zhenyu Lu
Unfortunately, language bias is a common problem in VQA, which refers to the model generating answers only by associating with the questions while ignoring the visual content, resulting in biased results.
3 code implementations • 27 Aug 2022 • Ziheng Wu, Xinyi Zou, Wenmeng Zhou, Jun Huang
We develop an all-in-one computer vision toolbox named EasyCV to facilitate the use of various SOTA computer vision methods.
no code implementations • 19 Dec 2021 • Jie Hu, Ziheng Wu, Vince Tan, Zhilin Lu, Mengze Zeng, Enhua Wu
For example, we raise the top-1 accuracy of binarized ResNet26 from 57.9% to 64.0%.