1 code implementation • 15 Apr 2024 • Yaohui Li, Qifeng Zhou, Haoxing Chen, Jianbing Zhang, Xinyu Dai, Hao Zhou
Few-shot learning aims to further enhance the transfer capability of CLIP by providing a few labeled images per class, known as 'few shots'.
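A minimal sketch of the few-shot setting on top of frozen image embeddings (random vectors stand in for CLIP features here; the prototype-based classifier is an illustrative baseline, not this paper's method): average the few shots of each class into a prototype and classify queries by cosine similarity.

```python
import numpy as np

def build_prototypes(support_feats, support_labels, num_classes):
    """Average the few-shot embeddings of each class into one prototype."""
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in range(num_classes)])
    # L2-normalise so cosine similarity reduces to a dot product
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def classify(query_feat, prototypes):
    """Assign the query to the class with the most similar prototype."""
    q = query_feat / np.linalg.norm(query_feat)
    return int(np.argmax(prototypes @ q))

# toy example: 2 classes, 2 shots each, 4-d stand-in features
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(1.0, 0.1, (2, 4)),    # class 0 clusters near +1
                     rng.normal(-1.0, 0.1, (2, 4))])  # class 1 clusters near -1
labels = np.array([0, 0, 1, 1])
protos = build_prototypes(support, labels, num_classes=2)
print(classify(rng.normal(1.0, 0.1, 4), protos))  # → 0
```

With real CLIP features the same prototype step is a common training-free baseline that few-shot methods then improve upon.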
1 code implementation • 15 Apr 2024 • Haoxing Chen, Yaohui Li, Zizheng Huang, Yan Hong, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang
Recent advancements in efficient transfer learning (ETL) have shown remarkable success in fine-tuning VLMs with limited data, introducing only a few parameters to harness task-specific insights from VLMs.
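One common way to add "only a few parameters" on top of a frozen backbone is a bottleneck adapter with a residual connection; the sketch below (dimensions and zero-initialisation are illustrative assumptions, not this paper's design) shows why the trainable budget stays tiny.

```python
import numpy as np

class Adapter:
    """Tiny bottleneck adapter applied to frozen backbone features.
    Only w_down / w_up would be trained; the backbone stays untouched."""
    def __init__(self, dim=512, bottleneck=16, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0, 0.02, (dim, bottleneck))
        self.w_up = np.zeros((bottleneck, dim))  # zero-init: starts as identity

    def __call__(self, x):
        # residual connection keeps the pretrained representation intact
        return x + np.maximum(x @ self.w_down, 0.0) @ self.w_up

    def num_params(self):
        return self.w_down.size + self.w_up.size

adapter = Adapter()
feats = np.ones((1, 512))
out = adapter(feats)
print(adapter.num_params())  # 2 * 512 * 16 = 16384 trainable parameters
```

Zero-initialising the up-projection makes the adapter an identity at the start of training, so fine-tuning begins from the pretrained model's behaviour.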
no code implementations • 20 Dec 2023 • Haoxing Chen, Yaohui Li, Zhangxuan Gu, Zhuoer Xu, Jun Lan, Huaxiong Li
Image harmonization is a crucial technique in image composition that adjusts the foreground of a composite image so that it blends seamlessly with the background.
1 code implementation • 21 Nov 2023 • Haoxing Chen, Yaohui Li, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Huijia Zhu, Weiqiang Wang
Recent methods mainly focus on learning multi-modal features aligned with class names to enhance the generalization ability to unseen categories.
Ranked #1 on GZSL Video Classification on ActivityNet-GZSL (cls)
1 code implementation • NeurIPS 2023 • Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang
Specifically, we build our model on a diffusion model and carefully modify the network structure to enable it to draw multilingual characters with the help of glyph and position information.
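A simple way to inject glyph and position guidance into a denoiser is to concatenate them with the noisy latent along the channel axis; the sketch below illustrates that conditioning pattern only and is not the paper's exact network modification.

```python
import numpy as np

def prepare_denoiser_input(noisy_latent, glyph_map, position_mask):
    """Concatenate glyph and position conditions with the noisy latent
    along the channel axis, one simple way to inject layout guidance.
    (An illustrative sketch, not the paper's exact architecture.)"""
    assert noisy_latent.shape[1:] == glyph_map.shape[1:] == position_mask.shape[1:]
    return np.concatenate([noisy_latent, glyph_map, position_mask], axis=0)

latent = np.zeros((4, 32, 32))    # noisy image latent, 4 channels
glyph = np.ones((1, 32, 32))      # rendered glyph of the target characters
position = np.zeros((1, 32, 32))  # binary mask marking where the text goes
x = prepare_denoiser_input(latent, glyph, position)
print(x.shape)  # (6, 32, 32)
```

The denoiser's first convolution then only needs extra input channels; the rest of the pretrained network can be reused.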
1 code implementation • CVPR 2023 • Zhangxuan Gu, Zhuoer Xu, Haoxing Chen, Jun Lan, Changhua Meng, Weiqiang Wang
Recent object detection approaches rely on pretrained vision-language models for image-text alignment.
2 code implementations • 6 Dec 2022 • Zhangxuan Gu, Haoxing Chen, Zhuoer Xu, Jun Lan, Changhua Meng, Weiqiang Wang
Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models with various backbones, such as ResNet and Swin Transformers.
Ranked #8 on Instance Segmentation on LVIS v1.0 val
1 code implementation • 16 Nov 2022 • Haoxing Chen, Zhangxuan Gu, Yaohui Li, Jun Lan, Changhua Meng, Weiqiang Wang, Huaxiong Li
The MGD applies distinct convolutions to the foreground and background, learning representations of both regions as well as their correlations with the global harmonization, thereby achieving local visual consistency far more efficiently.
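The core idea of applying distinct convolutions to foreground and background can be sketched as a per-pixel kernel switch driven by the composite mask (a naive illustrative loop, not the MGD's actual efficient implementation):

```python
import numpy as np

def masked_dual_conv(image, mask, k_fg, k_bg):
    """Apply one 3x3 kernel to foreground pixels and another to background
    pixels, selected per location by the composite mask (True = foreground)."""
    h, w = image.shape
    pad = np.pad(image, 1)
    out = np.zeros_like(image)
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            k = k_fg if mask[i, j] else k_bg
            out[i, j] = (patch * k).sum()
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
msk = np.zeros((4, 4), dtype=bool)
msk[1:3, 1:3] = True                               # central foreground region
identity = np.zeros((3, 3)); identity[1, 1] = 1.0  # pass-through kernel
box = np.full((3, 3), 1 / 9.0)                     # blur kernel
out = masked_dual_conv(img, msk, k_fg=identity, k_bg=box)
```

Foreground pixels here pass through unchanged while background pixels are blurred, demonstrating how the two regions receive different filtering in a single pass.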
Ranked #2 on Image Harmonization on HAdobe5k (1024×1024)
1 code implementation • 16 Jul 2022 • Zizheng Huang, Haoxing Chen, Ziqi Wen, Chao Zhang, Huaxiong Li, Bo Wang, Chunlin Chen
Contrastive learning (CL) continues to achieve significant breakthroughs across multiple domains.
1 code implementation • 13 Dec 2021 • Haoxing Chen, Huaxiong Li, Yaohui Li, Chunlin Chen
Under the guidance of attribute modality, our method can learn enhanced semantic-aware representation for classification.
1 code implementation • 27 Sep 2021 • Haoxing Chen, Huaxiong Li, Yaohui Li, Chunlin Chen
Finally, we propose an image patch-matching module that calculates the distance between dense local representations, thus determining which support-set category the query image belongs to.
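A patch-matching score between dense local representations can be sketched as follows (a best-match cosine aggregation, illustrative only; the paper's module may aggregate differently):

```python
import numpy as np

def patch_match_score(query_patches, support_patches):
    """For every local descriptor of the query, find its best-matching
    support descriptor and sum the cosine similarities (higher = closer)."""
    q = query_patches / np.linalg.norm(query_patches, axis=1, keepdims=True)
    s = support_patches / np.linalg.norm(support_patches, axis=1, keepdims=True)
    sim = q @ s.T                 # (num_query_patches, num_support_patches)
    return sim.max(axis=1).sum()  # best match per query patch

rng = np.random.default_rng(1)
query = rng.normal(size=(9, 8))               # 9 local patches, 8-d each
same_class = query + rng.normal(0, 0.05, (9, 8))
other_class = rng.normal(size=(9, 8))
print(patch_match_score(query, same_class) >
      patch_match_score(query, other_class))  # True
```

Because matching is per patch rather than per image, the score stays robust when the object appears at a different position in the query.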
Ranked #16 on Few-Shot Image Classification on FC100 5-way (1-shot)
no code implementations • 21 Mar 2021 • Yaohui Li, Huaxiong Li, Haoxing Chen, Chunlin Chen
Few-shot image classification aims at recognizing unseen categories with only a small number of labeled training samples.
no code implementations • 21 Mar 2021 • Haoxing Chen, Huaxiong Li, Yaohui Li, Chunlin Chen
Moreover, a Multi-level Metric Learning (MML) method is proposed, which not only calculates the pixel-level similarity but also considers the similarity of part-level features and global-level features.
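The three granularities can be combined as a weighted similarity over one feature map; the sketch below (2×2 part pooling and equal-ish weights are illustrative assumptions, not the paper's exact MML formulation) shows the pixel-, part-, and global-level terms.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def multi_level_similarity(q_map, s_map, weights=(0.4, 0.3, 0.3)):
    """Compare two (C, H, W) feature maps at three granularities:
    pixel level (per location), part level (2x2 regions), global level."""
    c, h, w = q_map.shape
    # pixel level: mean cosine similarity over all spatial locations
    pix = np.mean([cos(q_map[:, i, j], s_map[:, i, j])
                   for i in range(h) for j in range(w)])
    # part level: average-pool into 2x2 parts, then compare part vectors
    qp = q_map.reshape(c, 2, h // 2, 2, w // 2).mean(axis=(2, 4)).reshape(c, -1)
    sp = s_map.reshape(c, 2, h // 2, 2, w // 2).mean(axis=(2, 4)).reshape(c, -1)
    part = np.mean([cos(qp[:, k], sp[:, k]) for k in range(qp.shape[1])])
    # global level: compare globally pooled vectors
    glob = cos(q_map.mean(axis=(1, 2)), s_map.mean(axis=(1, 2)))
    w1, w2, w3 = weights
    return w1 * pix + w2 * part + w3 * glob

fmap = np.random.default_rng(2).normal(size=(16, 4, 4))
print(round(multi_level_similarity(fmap, fmap), 6))  # identical maps → 1.0
```

Pixel-level terms capture fine local evidence, part-level terms tolerate small misalignments, and the global term keeps holistic appearance in play.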
no code implementations • 30 Nov 2020 • Haoxing Chen, Huaxiong Li, Yaohui Li, Chunlin Chen
Then, an adaptive task attention module is proposed to select the most important local representations (LRs) across the entire task.
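One way to realise task-conditioned attention over local representations is to score each LR by its closest match among all support features of the task and softmax the scores; the sketch below illustrates that idea only and is not the paper's exact module.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def task_attention_weights(local_reps, task_reps, tau=0.1):
    """Score each query local representation (LR) by its closest match
    among all support features of the task, then softmax the scores into
    attention weights so task-irrelevant local regions are down-weighted."""
    l = local_reps / np.linalg.norm(local_reps, axis=1, keepdims=True)
    t = task_reps / np.linalg.norm(task_reps, axis=1, keepdims=True)
    scores = (l @ t.T).max(axis=1)  # best task match per local descriptor
    return softmax(scores / tau)

rng = np.random.default_rng(3)
task = rng.normal(size=(20, 8))              # pooled support features of the task
relevant = task[0] + rng.normal(0, 0.01, 8)  # an LR close to the task
clutter = rng.normal(size=(3, 8))            # background LRs
weights = task_attention_weights(np.vstack([relevant[None], clutter]), task)
print(weights.argmax())  # the task-relevant LR receives the largest weight
```

The temperature `tau` controls how sharply attention concentrates on the best-matching local representations.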