Search Results for author: Yibo Jiang

Found 7 papers, 1 paper with code

On the Origins of Linear Representations in Large Language Models

no code implementations • 6 Mar 2024 • Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch

To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of next-token prediction.

Language Modelling • Large Language Model

Direct Acquisition Optimization for Low-Budget Active Learning

no code implementations • 8 Feb 2024 • Zhuokai Zhao, Yibo Jiang, Yuxin Chen

Active Learning (AL) has gained prominence in integrating data-intensive machine learning (ML) models into domains with limited labeled data.

Active Learning

Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints

no code implementations • 28 Sep 2023 • Chaoqi Wang, Yibo Jiang, Chenghao Yang, Han Liu, Yuxin Chen

The increasing capabilities of large language models (LLMs) raise opportunities for artificial general intelligence but concurrently amplify safety concerns, such as potential misuse of AI systems, necessitating effective AI alignment.

Invariant and Transportable Representations for Anti-Causal Domain Shifts

1 code implementation • 4 Jul 2022 • Yibo Jiang, Victor Veitch

In this paper, we study representation learning under a particular notion of domain shift that both respects causal invariance and naturally handles the "anti-causal" structure.

Representation Learning

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders

no code implementations • ICML 2020 • Yibo Jiang, Cengiz Pehlevan

Recent work showed that overparameterized autoencoders can be trained to implement associative memory via iterative maps, when the trained input-output Jacobian of the network has all of its eigenvalue norms strictly below one.

Learning Theory • Regression
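The snippet above describes attractor dynamics: when the network's input-output Jacobian at a stored pattern has all eigenvalue norms strictly below one, iterating the network map pulls nearby inputs back to that pattern. A minimal NumPy sketch of this fixed-point principle — using a hand-constructed contractive map rather than a trained sigmoid autoencoder, so every name here is illustrative, not the paper's model:

```python
import numpy as np

# Illustrative sketch (not the paper's trained autoencoder): a smooth map
# with a built-in fixed point x_star whose Jacobian there has spectral
# norm 0.8 < 1, so iterating the map retrieves the "stored" pattern.
rng = np.random.default_rng(0)
x_star = rng.normal(size=3)            # stored pattern (a fixed point of f)
A = rng.normal(size=(3, 3))
A *= 0.8 / np.linalg.norm(A, 2)        # scale so spectral norm (hence all eigenvalue norms) is 0.8

def f(x):
    # f(x_star) = x_star, and the Jacobian of f at x_star is exactly A
    return x_star + A @ np.tanh(x - x_star)

x = x_star + 0.3 * rng.normal(size=3)  # noisy query near the stored pattern
for _ in range(100):                   # apply the iterated map
    x = f(x)

print(np.allclose(x, x_star, atol=1e-6))  # True: iteration recovers x_star
```

Because |tanh(u)| ≤ |u| componentwise, each iteration shrinks the distance to x_star by at least the factor 0.8, which is the same mechanism the eigenvalue condition in the abstract guarantees locally for a trained network.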

Meta-Learning to Cluster

no code implementations • 30 Oct 2019 • Yibo Jiang, Nakul Verma

By providing multiple types of training datasets as inputs, our model gains the ability to generalize well to unseen datasets (new clustering tasks).

Clustering • Meta-Learning

Model-Agnostic Meta-Learning using Runge-Kutta Methods

no code implementations • 16 Oct 2019 • Daniel Jiwoong Im, Yibo Jiang, Nakul Verma

By leveraging this refined control, we demonstrate that there are multiple principled ways to update MAML, and show that the classic MAML optimization is simply a special case of a second-order Runge-Kutta method that mainly focuses on fast adaptation.

Meta-Learning
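The Runge-Kutta view in the snippet above can be made concrete on a toy gradient flow θ'(t) = -∇L(θ): a plain gradient step is the explicit Euler discretization, while a second-order Runge-Kutta scheme tracks the flow more accurately per step. This sketch uses Heun's method on a quadratic loss purely for illustration; the paper's actual MAML update and choice of RK2 scheme may differ:

```python
import numpy as np

def grad_L(theta):
    # toy quadratic loss L(theta) = 0.5 * ||theta||^2, so grad L = theta
    return theta

def euler_step(theta, h):
    # plain gradient descent step = explicit Euler on theta' = -grad L
    return theta - h * grad_L(theta)

def heun_step(theta, h):
    # Heun's method, a second-order Runge-Kutta scheme on the same flow
    k1 = -grad_L(theta)
    k2 = -grad_L(theta + h * k1)
    return theta + 0.5 * h * (k1 + k2)

theta0 = np.array([1.0, -2.0])
h = 0.1
exact = np.exp(-h) * theta0  # exact flow for this loss: theta(h) = e^{-h} theta0
err_euler = np.linalg.norm(euler_step(theta0, h) - exact)
err_heun = np.linalg.norm(heun_step(theta0, h) - exact)
print(err_heun < err_euler)  # True: the RK2 step follows the flow more closely
```

The gap widens as the step size h grows, which is the kind of discretization control the ODE perspective on MAML exposes.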
