1 code implementation • ICCV 2023 • Shibo Jie, Haoqing Wang, Zhi-Hong Deng
Current state-of-the-art results in computer vision depend in part on fine-tuning large pre-trained vision models.
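A minimal sketch of the fine-tuning setup this snippet refers to, using torchvision's pre-trained ResNet-50 as a stand-in for a large pre-trained vision model; the 10-class head, learning rate, and optimizer choice are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained backbone (a stand-in for a large pre-trained vision model).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Replace the classification head for a hypothetical 10-class downstream task.
model.fc = nn.Linear(model.fc.in_features, 10)

# Fine-tune all parameters with a small learning rate, as is common practice.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```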
1 code implementation • CVPR 2023 • Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han
The lower layers are not explicitly guided, and the interaction among their patches is used only for calculating new activations.
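A sketch of the situation the snippet describes, under assumed shapes and a hypothetical encoder: in a typical masked-image-modeling pipeline, only the output of the final block feeds the reconstruction loss, so the lower blocks receive no explicit target of their own.

```python
import torch
import torch.nn as nn

# Hypothetical encoder: a stack of Transformer blocks over patch tokens.
blocks = nn.ModuleList([
    nn.TransformerEncoderLayer(d_model=192, nhead=3, batch_first=True)
    for _ in range(12)
])
decoder = nn.Linear(192, 16 * 16 * 3)  # predicts raw pixels of a masked patch

def mim_loss(tokens: torch.Tensor, target_pixels: torch.Tensor) -> torch.Tensor:
    h = tokens
    for blk in blocks:
        # Lower blocks carry no loss of their own: their patch interactions
        # are used solely to compute new activations for the next block.
        h = blk(h)
    # Only the final representation is explicitly guided by reconstruction.
    return nn.functional.mse_loss(decoder(h), target_pixels)
```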
1 code implementation • ECCV 2022 • Haoqing Wang, Zhi-Hong Deng
This results in our CPN (Contrastive Prototypical Network) model, which combines the prototypical loss with pairwise contrast and outperforms existing models of this paradigm with a moderately large batch size.
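A minimal sketch of combining a prototypical loss with pairwise contrast, as the snippet describes; the temperature, loss weight, and episode shapes are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cpn_style_loss(support: torch.Tensor,       # [n_way, n_shot, d] embeddings
                   query: torch.Tensor,         # [n_query, d] embeddings
                   query_labels: torch.Tensor,  # [n_query] class indices
                   tau: float = 0.1, lam: float = 1.0) -> torch.Tensor:
    # Prototypical loss: classify queries by distance to class prototypes.
    prototypes = support.mean(dim=1)              # [n_way, d]
    logits = -torch.cdist(query, prototypes)      # negative distances as logits
    proto_loss = F.cross_entropy(logits / tau, query_labels)

    # Pairwise contrast: pull together queries of the same class and push
    # apart queries of different classes (an InfoNCE-style term over pairs).
    q = F.normalize(query, dim=-1)
    sim = q @ q.t() / tau                          # [n_query, n_query]
    sim.fill_diagonal_(float('-inf'))              # exclude self-pairs
    same = query_labels.unsqueeze(0) == query_labels.unsqueeze(1)
    same.fill_diagonal_(False)
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    contrast_loss = -(log_prob[same]).mean()

    return proto_loss + lam * contrast_loss
```

The weighting `lam` between the two terms is a free hyperparameter in this sketch; the actual balance used by CPN is not stated in the snippet.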
1 code implementation • CVPR 2022 • Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu
It significantly improves the performance of several classic contrastive learning models in downstream tasks.
no code implementations • 29 Sep 2021 • Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu
Therefore, we assume the task-relevant information that is not shared between views cannot be ignored, and theoretically prove that the minimal sufficient representation in contrastive learning is not sufficient for the downstream tasks, which causes performance degradation.
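A sketch of the decomposition this claim rests on, under assumed notation: $v_1, v_2$ are the two views, $T$ is the downstream task, and $z_{\min}$ is a minimal sufficient representation of $v_1$.

```latex
% Task-relevant information in view v_1 splits into a part shared with v_2
% and a part unique to v_1 (a definitional identity for co-information):
\[
  I(v_1; T) \;=\; \underbrace{I(v_1; v_2; T)}_{\text{shared between views}}
            \;+\; \underbrace{I(v_1; T \mid v_2)}_{\text{not shared between views}}
\]
% A minimal sufficient representation keeps only the information v_1 shares
% with v_2, so it discards I(v_1; T | v_2); whenever that term is positive,
% z_min misses task-relevant information and cannot be sufficient for T.
```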
1 code implementation • 29 Apr 2021 • Haoqing Wang, Zhi-Hong Deng
However, when there is a domain shift between the training tasks and the test tasks, the obtained inductive bias fails to generalize across domains, which degrades the performance of meta-learning models.
Ranked #2 on Cross-Domain Few-Shot on cars
1 code implementation • 12 Sep 2020 • Haoqing Wang, Zhi-Hong Deng
The performance of meta-learning approaches for few-shot learning generally depends on three aspects: features suitable for comparison, a classifier (base learner) suited to low-data scenarios, and valuable information extracted from the samples to be classified.
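A minimal sketch of a few-shot episode that makes the three aspects concrete, with a nearest-prototype classifier standing in as the base learner; the embedding network and input shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Aspect 1: features suitable for comparison (a hypothetical embedding net
# for flattened 28x28 inputs).
embed = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64))

def episode(support_x, support_y, query_x, n_way: int):
    """Classify query samples in an n_way episode.

    Aspect 2: the base learner must suit low-data regimes; here it is a
    nearest-prototype classifier, which has no parameters to overfit.
    Aspect 3: the query samples themselves carry usable information
    (transductive modules would exploit their joint distribution); this
    sketch only embeds them, which is where such modules would plug in.
    """
    zs, zq = embed(support_x), embed(query_x)
    prototypes = torch.stack([zs[support_y == c].mean(0) for c in range(n_way)])
    return torch.cdist(zq, prototypes).argmin(dim=1)  # predicted class per query
```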
1 code implementation • NeurIPS 2019 • Zhiqing Sun, Zhuohan Li, Haoqing Wang, Zi Lin, Di He, Zhi-Hong Deng
However, these models assume that the decoding process of each token is conditionally independent of others.
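A sketch of the independence assumption the snippet refers to: a non-autoregressive decoder emits a distribution for every target position in a single pass, and each token is chosen independently of the others. The decoder callable and its shapes are hypothetical.

```python
import torch

def non_autoregressive_decode(decoder, src: torch.Tensor, tgt_len: int):
    """One-pass decoding under the conditional-independence assumption.

    The model factorizes p(y | x) = prod_t p(y_t | x): all positions are
    predicted in parallel from the source alone, so the choice at position t
    never sees the tokens chosen at the other positions.
    """
    logits = decoder(src, tgt_len)   # hypothetical output: [tgt_len, vocab]
    return logits.argmax(dim=-1)     # independent argmax at each position

# An autoregressive decoder, by contrast, conditions each step on the tokens
# generated so far: p(y | x) = prod_t p(y_t | y_<t, x).
```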