Search Results for author: Andi Nika

Found 6 papers, 2 papers with code

Corruption-Robust Offline Two-Player Zero-Sum Markov Games

no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Adish Singla, Goran Radanović

We note that we are the first to provide such a characterization of the problem of learning approximate Nash Equilibrium policies in offline two-player zero-sum Markov games under data corruption.

Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences

no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanović, Adish Singla

Moreover, we extend our analysis to the approximate optimization setting and derive exponentially decaying convergence rates for both RLHF and DPO.

Corruption Robust Offline Reinforcement Learning with Human Feedback

no code implementations • 9 Feb 2024 • Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović

We aim to design algorithms that identify a near-optimal policy from the corrupted data, with provable guarantees.

Adversarial Attack • reinforcement-learning

Contextual Combinatorial Bandits with Changing Action Sets via Gaussian Processes

no code implementations • 5 Oct 2021 • Andi Nika, Sepehr Elahi, Cem Tekin

We consider a contextual bandit problem with a combinatorial action set and time-varying base arm availability.

Gaussian Processes

Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization

1 code implementation • 28 Aug 2020 • Andi Nika, Sepehr Elahi, Cem Tekin

We consider the contextual combinatorial volatile multi-armed bandit (CCV-MAB) problem, in which, at each round, the learner observes a set of available base arms and their contexts and then selects a super arm containing $K$ base arms in order to maximize its cumulative reward.

Pareto Active Learning with Gaussian Processes and Adaptive Discretization

1 code implementation • 24 Jun 2020 • Andi Nika, Kerem Bozgan, Sepehr Elahi, Çağın Ararat, Cem Tekin

We consider the problem of optimizing a vector-valued objective function $\boldsymbol{f}$ sampled from a Gaussian Process (GP) whose index set is a well-behaved, compact metric space $({\cal X}, d)$ of designs.

Active Learning • Gaussian Processes
