Search Results for author: Fahim Tajwar

Found 7 papers, 7 papers with code

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

1 code implementation • 22 Apr 2024 • Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar

Our main finding is that, in general, approaches that use on-policy sampling or attempt to push down the likelihood on certain responses (i.e., employ a "negative gradient") outperform offline and maximum likelihood objectives.

Contrastive Learning • Reinforcement Learning (RL)


Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias

1 code implementation • 12 Oct 2023 • Max Sobol Mark, Archit Sharma, Fahim Tajwar, Rafael Rafailov, Sergey Levine, Chelsea Finn

Can we leverage offline RL to recover better policies from online interaction?

D4RL • Offline RL • +2


Conservative Prediction via Data-Driven Confidence Minimization

1 code implementation • 8 Jun 2023 • Caroline Choi, Fahim Tajwar, Yoonho Lee, Huaxiu Yao, Ananya Kumar, Chelsea Finn

Taking inspiration from this result, we present data-driven confidence minimization (DCM), which minimizes confidence on an uncertainty dataset containing examples that the model is likely to misclassify at test time.


Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

1 code implementation • 20 Oct 2022 • Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn

A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task.

Transfer Learning


When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

1 code implementation • 19 Oct 2022 • Annie Xie, Fahim Tajwar, Archit Sharma, Chelsea Finn

A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.

Continuous Control • reinforcement-learning • +1


Do Deep Networks Transfer Invariances Across Classes?

1 code implementation • ICLR 2022 • Allan Zhou, Fahim Tajwar, Alexander Robey, Tom Knowles, George J. Pappas, Hamed Hassani, Chelsea Finn

Based on this analysis, we show how a generative approach for learning the nuisance transformations can help transfer invariances across classes and improve performance on a set of imbalanced image classification benchmarks.

Ranked #21 on Long-tail Learning on CIFAR-10-LT (ρ=100)

Image Classification • Long-tail Learning


No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets

1 code implementation • 12 Sep 2021 • Fahim Tajwar, Ananya Kumar, Sang Michael Xie, Percy Liang

Out-of-distribution detection is an important component of reliable ML systems.

Out-of-Distribution Detection • Out of Distribution (OOD) Detection

