Search Results for author: Guy Tennenholtz

Found 24 papers, 4 papers with code

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

no code implementations • 25 Feb 2024 • Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier

We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates.

Continuous Control • Meta Reinforcement Learning

Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation

1 code implementation • 29 Oct 2023 • Li Ding, Masrour Zoghi, Guy Tennenholtz, Maryam Karimzadehgan

We introduce EV3, a novel meta-optimization framework designed to efficiently train scalable machine learning models through an intuitive explore-assess-adapt protocol.

Evolutionary Algorithms • Knowledge Distillation +2
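
The explore-assess-adapt protocol is only named in the snippet above; as a rough, hypothetical sketch of what one such iteration could look like (not the authors' implementation; propose_fn, assess_fn, and adapt_fn are placeholder callables):

```python
# Hypothetical explore-assess-adapt iteration, loosely inspired by the EV3 description.
def explore_assess_adapt_step(model, propose_fn, assess_fn, adapt_fn):
    candidates = propose_fn(model)                 # Explore: generate candidate updates
    scores = [assess_fn(c) for c in candidates]    # Assess: score each candidate
    best = candidates[scores.index(max(scores))]   # keep the highest-scoring candidate
    return adapt_fn(model, best)                   # Adapt: commit the chosen update
```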

Factual and Personalized Recommendations using Language Models and Reinforcement Learning

no code implementations • 9 Oct 2023 • Jihwan Jeong, Yinlam Chow, Guy Tennenholtz, Chih-Wei Hsu, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier

Recommender systems (RSs) play a central role in connecting users to content, products, and services, matching candidate items to users based on their preferences.

Language Modelling • Recommendation Systems +1

Demystifying Embedding Spaces using Large Language Models

no code implementations • 6 Oct 2023 • Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format.

Dimensionality Reduction • Recommendation Systems

Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models

no code implementations • 8 Sep 2023 • Craig Boutilier, Martin Mladenov, Guy Tennenholtz

Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors.

Recommendation Systems

Bayesian Regret Minimization in Offline Bandits

no code implementations • 2 Jun 2023 • Mohammad Ghavamzadeh, Marek Petrik, Guy Tennenholtz

We study how to make decisions that minimize Bayesian regret in offline linear bandits.
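
The snippet does not state the objective; as a reference point (the paper's exact formulation may differ), Bayesian regret of a data-dependent decision in a linear bandit averages the optimality gap over the prior on the parameter and the offline dataset:

```latex
% Bayesian regret of a decision rule \hat{a}(\cdot) learned from an offline dataset D,
% with the expectation taken over the prior on \theta and the data-generating process.
\mathrm{BR}(\hat{a}) \;=\; \mathbb{E}_{\theta,\,D}\Big[\max_{a \in \mathcal{A}} \langle \theta, a\rangle \;-\; \langle \theta, \hat{a}(D)\rangle\Big]
```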

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

no code implementations • 1 Jun 2023 • Alizée Pace, Hugo Yèche, Bernhard Schölkopf, Gunnar Rätsch, Guy Tennenholtz

A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes.

Management • Offline RL +2

Ranking with Popularity Bias: User Welfare under Self-Amplification Dynamics

no code implementations • 24 May 2023 • Guy Tennenholtz, Martin Mladenov, Nadav Merlis, Robert L. Axtell, Craig Boutilier

We highlight the importance of exploration, not to eliminate popularity bias, but to mitigate its negative impact on welfare.

Reinforcement Learning with History-Dependent Dynamic Contexts

no code implementations • 4 Feb 2023 • Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time.

reinforcement-learning • Reinforcement Learning (RL)

Reinforcement Learning with a Terminator

1 code implementation • 30 May 2022 • Guy Tennenholtz, Nadav Merlis, Lior Shani, Shie Mannor, Uri Shalit, Gal Chechik, Assaf Hallak, Gal Dalal

We learn the parameters of the TerMDP (termination MDP) and leverage the structure of the estimation problem to provide state-wise confidence bounds.

Autonomous Driving • reinforcement-learning +1

On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning

no code implementations • ICLR 2022 • Guy Tennenholtz, Assaf Hallak, Gal Dalal, Shie Mannor, Gal Chechik, Uri Shalit

We analyze the limitations of learning from such data with and without external reward, and propose an adjustment of standard imitation learning algorithms to fit this setup.

Imitation Learning • Recommendation Systems +2

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning

no code implementations • 22 Sep 2021 • Roy Zohar, Shie Mannor, Guy Tennenholtz

Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.

Multi-agent Reinforcement Learning • reinforcement-learning +1
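
The scalability issue mentioned above is plain combinatorics: with n agents each choosing among k individual actions, the joint action space has k^n elements. A two-line check with illustrative numbers (not taken from the paper):

```python
# Joint action space size grows exponentially in the number of agents.
n_agents, n_actions = 10, 5            # illustrative values only
print(n_actions ** n_agents)           # 9765625 joint actions for just 10 agents
```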

Maximum Entropy Reinforcement Learning with Mixture Policies

no code implementations • 18 Mar 2021 • Nir Baram, Guy Tennenholtz, Shie Mannor

However, using mixture policies in the Maximum Entropy (MaxEnt) framework is not straightforward.

Continuous Control • reinforcement-learning +1
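
One concrete reason the MaxEnt objective is awkward with mixtures (stated here as background; the paper develops its own treatment) is that the entropy of a mixture policy has no simple closed form, and only bracketing bounds hold in general:

```latex
% For a mixture policy \pi(\cdot|s) = \sum_i w_i \pi_i(\cdot|s), the entropy is not the
% weighted sum of component entropies; only the standard bounds below hold in general.
\sum_i w_i\, \mathcal{H}\big(\pi_i(\cdot \mid s)\big)
\;\le\;
\mathcal{H}\Big(\sum_i w_i\, \pi_i(\cdot \mid s)\Big)
\;\le\;
\sum_i w_i\, \mathcal{H}\big(\pi_i(\cdot \mid s)\big) + \mathcal{H}(w)
```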

Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning

no code implementations • 22 Feb 2021 • Guy Tennenholtz, Shie Mannor

In this work, we combine parametric and nonparametric methods for uncertainty estimation through a novel latent space based metric.

Autonomous Driving • Continuous Control +3

Action Redundancy in Reinforcement Learning

no code implementations • 22 Feb 2021 • Nir Baram, Guy Tennenholtz, Shie Mannor

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm which seeks to maximize return under entropy regularization.

reinforcement-learning • Reinforcement Learning (RL)
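
For context, the entropy-regularized return referred to above is the standard MaxEnt RL objective with temperature α (generic notation, not necessarily the paper's):

```latex
% Generic maximum-entropy RL objective with discount \gamma and temperature \alpha.
J(\pi) \;=\; \mathbb{E}_{\pi}\Big[\sum_{t \ge 0} \gamma^{t}\big(r(s_t, a_t) + \alpha\,\mathcal{H}\big(\pi(\cdot \mid s_t)\big)\big)\Big]
```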

Bandits with Partially Observable Confounded Data

no code implementations • 11 Jun 2020 • Guy Tennenholtz, Uri Shalit, Shie Mannor, Yonathan Efroni

We construct a linear bandit algorithm that takes advantage of the projected information, and prove regret bounds.

Multi-Armed Bandits

Never Worse, Mostly Better: Stable Policy Improvement in Deep Reinforcement Learning

no code implementations • 2 Oct 2019 • Pranav Khanna, Guy Tennenholtz, Nadav Merlis, Shie Mannor, Chen Tessler

In recent years, there has been significant progress in applying deep reinforcement learning (RL) for solving challenging problems across a wide variety of domains.

Continuous Control • reinforcement-learning +1

Off-Policy Evaluation in Partially Observable Environments

no code implementations • 9 Sep 2019 • Guy Tennenholtz, Shie Mannor, Uri Shalit

This work studies the problem of batch off-policy evaluation for Reinforcement Learning in partially observable environments.

Off-policy evaluation

Distributional Policy Optimization: An Alternative Approach for Continuous Control

3 code implementations • NeurIPS 2019 • Chen Tessler, Guy Tennenholtz, Shie Mannor

We show that optimizing over such sets results in local movement in the action space and thus convergence to sub-optimal solutions.

Continuous Control • Policy Gradient Methods

The Natural Language of Actions

1 code implementation • 4 Feb 2019 • Guy Tennenholtz, Shie Mannor

We introduce Act2Vec, a general framework for learning context-based action representation for Reinforcement Learning.

reinforcement-learning • Reinforcement Learning (RL) +2
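
A context-based action representation can be illustrated with a word2vec-style model trained on action sequences; the sketch below is a hypothetical analogy with toy action tokens (gensim >= 4 assumed), not the Act2Vec implementation:

```python
# Hypothetical skip-gram over toy action sequences, in the spirit of context-based
# action embeddings; an illustration only, not the Act2Vec code.
from gensim.models import Word2Vec

trajectories = [                         # toy "sentences" of action tokens
    ["up", "up", "right", "shoot"],
    ["up", "right", "right", "shoot"],
    ["down", "left", "left", "shield"],
]
model = Word2Vec(sentences=trajectories, vector_size=16, window=2,
                 min_count=1, sg=1, epochs=50)   # sg=1 selects skip-gram
print(model.wv.most_similar("up"))               # actions used in similar contexts
```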

Train on Validation: Squeezing the Data Lemon

no code implementations • 16 Feb 2018 • Guy Tennenholtz, Tom Zahavy, Shie Mannor

We define an on-average-validation-stable algorithm as one for which training on small portions of the validation data does not overfit the model selection process.

Model Selection

The Stochastic Firefighter Problem

no code implementations • 22 Nov 2017 • Guy Tennenholtz, Constantine Caramanis, Shie Mannor

We devise a simple policy that only vaccinates neighbors of infected nodes and is optimal on regular trees and on general graphs for a sufficiently large budget.
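The vaccinate-neighbors-of-infected idea can be sketched in a few lines; the snippet below is an illustrative rendering on a random regular graph (networkx assumed), not the paper's code:

```python
# Illustrative neighbor-vaccination step: spend the per-round budget on susceptible
# neighbors of currently infected nodes. Not the paper's implementation.
import random
import networkx as nx

def vaccinate_neighbors(G, infected, vaccinated, budget):
    # Candidate set: un-infected, un-vaccinated neighbors of infected nodes.
    frontier = {v for u in infected for v in G.neighbors(u)
                if v not in infected and v not in vaccinated}
    chosen = random.sample(sorted(frontier), min(budget, len(frontier)))
    return vaccinated | set(chosen)

G = nx.random_regular_graph(d=3, n=30, seed=0)
print(vaccinate_neighbors(G, infected={0}, vaccinated=set(), budget=2))
```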
