no code implementations • 1 Feb 2024 • Zhiquan Tan, Chenghai Li, Weiran Huang
This paper investigates the information encoded in the embeddings of large language models (LLMs).
1 code implementation • 30 Jan 2024 • Lai Wei, Zhiquan Tan, Chenghai Li, Jindong Wang, Weiran Huang
Large language models (LLMs) have revolutionized the field of natural language processing, extending their strong capabilities into multi-modal domains.
no code implementations • 11 Nov 2023 • Zhiquan Tan, Weiran Huang
Recently, an interesting phenomenon called grokking has gained much attention, where generalization occurs long after the models have initially overfitted the training data.
no code implementations • 26 Oct 2023 • Zhiquan Tan, Kaipeng Zheng, Weiran Huang
In this paper, we present a new approach called OTMatch, which leverages semantic relationships among classes by employing an optimal transport loss function.
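The paper's exact OT loss is not given here, but the core machinery behind such losses is entropy-regularized optimal transport, typically solved with Sinkhorn iterations. A minimal sketch (illustrative only; the marginals, cost design, and regularization are assumptions, not OTMatch's formulation):

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=100):
    """Approximate OT plan between uniform marginals via Sinkhorn iterations."""
    n, m = cost.shape
    K = np.exp(-cost / eps)      # Gibbs kernel from the cost matrix
    r = np.ones(n) / n           # uniform source marginal
    c = np.ones(m) / m           # uniform target marginal
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):     # alternating marginal projections
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Cheap cross-class assignment stays on the diagonal of the plan.
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
plan = sinkhorn(cost)
```

The resulting `plan` is a soft matching whose rows and columns respect the chosen marginals; a classification loss can then weight class relationships by this plan rather than treating classes as unrelated.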
2 code implementations • 29 Sep 2023 • Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan, Yifan Zhang
In this paper, we provide a comprehensive toolbox for understanding and enhancing self-supervised learning (SSL) methods through the lens of matrix information theory.
3 code implementations • 27 May 2023 • Yifan Zhang, Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan
Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.
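One basic quantity in matrix information theory is the von Neumann entropy of a trace-normalized feature covariance; a uniformity-style objective pushes this entropy up so that features spread across dimensions instead of collapsing. A hedged sketch of that quantity alone (how Matrix-SSL actually composes its loss is not specified here):

```python
import numpy as np

def matrix_entropy(features):
    """Von Neumann entropy -tr(S log S) of the unit-trace covariance S."""
    cov = features.T @ features / features.shape[0]
    S = cov / np.trace(cov)                # normalize to unit trace
    eigvals = np.linalg.eigvalsh(S)
    eigvals = eigvals[eigvals > 1e-12]     # drop numerical zeros
    return float(-np.sum(eigvals * np.log(eigvals)))
```

Perfectly collapsed features (rank-one covariance) give entropy 0, while isotropic features in d dimensions give the maximum log d, which is why maximizing this entropy acts as a uniformity loss.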
Ranked #1 on Contrastive Learning on ImageNet-1K
1 code implementation • 17 May 2023 • Yifan Zhang, Jingqin Yang, Zhiquan Tan, Yang Yuan
Semi-supervised learning has achieved notable success by leveraging very little labeled data and exploiting the wealth of information in unlabeled data.
1 code implementation • 26 Apr 2023 • Zhiquan Tan, ZiHao Wang, Yifan Zhang
Label hierarchy is an important source of external knowledge that can enhance classification performance.
1 code implementation • 27 Mar 2023 • Zhiquan Tan, Yifan Zhang, Jingqin Yang, Yang Yuan
Contrastive learning is a powerful self-supervised learning method, but our theoretical understanding of how and why it works remains limited.
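For concreteness, the standard InfoNCE-style contrastive loss that such analyses typically target can be sketched as follows (the generic formulation, not necessarily the exact loss studied in the paper):

```python
import numpy as np

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss between two batches of paired embeddings (positives on the diagonal)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # L2-normalize
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # cosine similarities
    # Treat each row as a softmax classification of its positive pair.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

Aligned pairs (each row of `z1` matching the same row of `z2`) yield a lower loss than shuffled pairs, which is the alignment-vs-uniformity trade-off theoretical treatments of contrastive learning try to explain.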