1 code implementation • 10 May 2024 • Seungwook Han, Idan Shenfeld, Akash Srivastava, Yoon Kim, Pulkit Agrawal
Aligning Large Language Models (LLMs) to cater to different human preferences, learn new skills, and unlearn harmful behavior is an important problem.
1 code implementation • 4 Apr 2024 • Lars Ankile, Anthony Simeonov, Idan Shenfeld, Pulkit Agrawal
While learning from demonstrations is powerful for acquiring visuomotor policies, high-performance imitation without large demonstration datasets remains challenging for tasks requiring precise, long-horizon manipulation.
1 code implementation • 29 Feb 2024 • Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James Glass, Akash Srivastava, Pulkit Agrawal
To probe when an LLM generates unwanted content, the current paradigm is to recruit a red team of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs.
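
Below is a minimal Python sketch of the red-teaming loop this abstract describes: candidate prompts are sent to the model under test and its responses are screened for unwanted content. The helpers query_target_llm and is_undesirable are hypothetical placeholders standing in for the LLM under test and the human (or classifier) judgment, not components from the paper.

def query_target_llm(prompt: str) -> str:
    """Placeholder for a call to the LLM under test (assumption)."""
    return "model response to: " + prompt

def is_undesirable(response: str) -> bool:
    """Placeholder for a human judgment or toxicity classifier (assumption)."""
    return "harmful" in response.lower()

def red_team(test_cases: list[str]) -> list[str]:
    """Return the test-case prompts that elicited unwanted content."""
    failures = []
    for prompt in test_cases:
        response = query_target_llm(prompt)
        if is_undesirable(response):
            failures.append(prompt)
    return failures

if __name__ == "__main__":
    prompts = ["How do I bake bread?", "Tell me something harmful."]
    print(red_team(prompts))

The paper's contribution is replacing the hand-designed test cases above with automatically generated ones; this sketch only fixes the evaluation loop they plug into.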
no code implementations • 6 Jul 2023 • Idan Shenfeld, Zhang-Wei Hong, Aviv Tamar, Pulkit Agrawal
To combine the benefits of these different forms of learning, it is common to train a policy to maximize a combination of reinforcement and teacher-student learning objectives.
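
As a rough illustration of such a combined objective, the sketch below mixes a policy-gradient term with a teacher-matching (distillation) term. The mixing weight alpha and the specific loss forms are illustrative assumptions, not the paper's formulation.

import torch
import torch.nn.functional as F

def combined_loss(policy_logits: torch.Tensor,
                  teacher_logits: torch.Tensor,
                  log_probs_taken: torch.Tensor,
                  advantages: torch.Tensor,
                  alpha: float = 0.5) -> torch.Tensor:
    # Reinforcement term: policy gradient weighted by advantage estimates.
    rl_loss = -(log_probs_taken * advantages).mean()
    # Teacher-student term: KL divergence from the teacher's action distribution.
    distill_loss = F.kl_div(
        F.log_softmax(policy_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    # Fixed convex combination of the two objectives (the assumption the
    # abstract questions: one static weighting for both learning signals).
    return alpha * rl_loss + (1 - alpha) * distill_loss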
1 code implementation • NeurIPS 2021 • Ron Dorfman, Idan Shenfeld, Aviv Tamar
Consider the following instance of the Offline Meta Reinforcement Learning (OMRL) problem: given the complete training logs of $N$ conventional RL agents, trained on $N$ different tasks, design a meta-agent that can quickly maximize reward in a new, unseen task from the same task distribution.
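
To make the setup concrete, here is a hedged sketch of the offline data a meta-agent would consume under this formulation. The Transition container and build_meta_dataset helper are hypothetical names introduced for illustration, not the paper's code.

from dataclasses import dataclass

@dataclass
class Transition:
    state: list
    action: int
    reward: float
    next_state: list

# One task's complete training log: every transition a conventional RL
# agent collected while learning that task (assumed representation).
TaskLog = list  # list[Transition]

def build_meta_dataset(task_logs: list) -> list:
    """Flatten the N per-task logs into (task_id, transition) pairs,
    the offline dataset the meta-agent is trained on."""
    return [(task_id, t)
            for task_id, log in enumerate(task_logs)
            for t in log]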