no code implementations • 3 Jun 2024 • Junhao Cheng, Xi Lu, Hanhui Li, Khun Loun Zai, Baiqiao Yin, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang
As cutting-edge Text-to-Image (T2I) generation models already excel at producing remarkable single images, an even more challenging task, i. e., multi-turn interactive image generation begins to attract the attention of related research communities.
1 code implementation • 29 Apr 2024 • Junhao Cheng, Baiqiao Yin, Kaixin Cai, Minbin Huang, Hanhui Li, Yuxin He, Xi Lu, Yue Li, Yifei Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang
To address this issue, we introduce TheaterGen, a training-free framework that integrates large language models (LLMs) and text-to-image (T2I) models to provide the capability of multi-turn image generation.
1 code implementation • 25 Apr 2024 • Jiehui Huang, Xiao Dong, Wenhui Song, Hanhui Li, Jun Zhou, Yuhao Cheng, Shutao Liao, Long Chen, Yiqiang Yan, Shengcai Liao, Xiaodan Liang
ConsistentID comprises two key components: a multimodal facial prompt generator that combines facial features, corresponding facial descriptions and the overall facial context to enhance precision in facial details, and an ID-preservation network optimized through the facial attention localization strategy, aimed at preserving ID consistency in facial regions.
1 code implementation • 2 Jan 2024 • Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang
Subsequently, a feature fusion module that exploits the visibility of query points and mesh vertices is introduced to adaptively merge features of both hands, enabling the recovery of features in unseen areas.
1 code implementation • 26 Dec 2023 • Hanhui Li, Xiaojian Lin, Xuan Huang, Zejun Yang, Zhisheng Wang, Xiaodan Liang
However, due to the fixed hand topology and complex hand poses, current models are hard to generate meshes that are aligned with the image well.
1 code implementation • 25 Nov 2022 • Zaiyu Huang, Hanhui Li, Zhenyu Xie, Michael Kampffmeyer, Qingling Cai, Xiaodan Liang
Existing methods are restricted in this setting as they estimate garment warping flows mainly based on 2D poses and appearance, which omits the geometric prior of the 3D human body shape.
no code implementations • CVPR 2022 • Chaojie Yang, Hanhui Li, Shengjie Wu, Shengkai Zhang, Haonan Yan, Nianhong Jiao, Jie Tang, Runnan Zhou, Xiaodan Liang, Tianxiang Zheng
This is because current methods mainly rely on a single pose/appearance model, which is limited in disentangling various poses and appearance in human images.
no code implementations • 29 May 2019 • Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng
Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm.
no code implementations • 28 Dec 2018 • Fei Wang, Shujin Lin, Hanhui Li, Hefeng Wu, Junkun Jiang, Ruomei Wang, Xiaonan Luo
Traditional sketch segmentation methods mainly rely on handcrafted features and complicate models, and their performance is far from satisfactory due to the abstract representation of sketches.
no code implementations • 5 Mar 2018 • Yue Xi, Jiangbin Zheng, Xiangjian He, Wenjing Jia, Hanhui Li
Tiny face detection aims to find faces with high degrees of variability in scale, resolution and occlusion in cluttered scenes.
no code implementations • 20 Jan 2018 • Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin
In this paper, we aim at tackling the problem of crowd counting in extremely high-density scenes, which contain hundreds, or even thousands of people.