Search Results for author: Nirjhar Das

Found 4 papers, 1 paper with code

Optimal Regret with Limited Adaptivity for Generalized Linear Contextual Bandits

1 code implementation • 10 Apr 2024 • Ayush Sawarni, Nirjhar Das, Siddharth Barman, Gaurav Sinha

For our batch learning algorithm B-GLinCB, with $\Omega\left( \log{\log T} \right)$ batches, the regret scales as $\tilde{O}(\sqrt{T})$.

Multi-Armed Bandits

Provably Sample Efficient RLHF via Active Preference Optimization

no code implementations • 16 Feb 2024 • Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

Experimental evaluations on a human preference dataset validate APO's efficacy as a sample-efficient and practical solution to data collection for RLHF, facilitating alignment of LLMs with human preferences in a cost-effective and scalable manner.

Inverse Reinforcement Learning With Constraint Recovery

no code implementations • 14 May 2023 • Nirjhar Das, Arpan Chattopadhyay

In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems.

reinforcement-learning

A View Independent Classification Framework for Yoga Postures

no code implementations • 27 Jun 2022 • Mustafa Chasmai, Nirjhar Das, Aman Bhardwaj, Rahul Garg

We argue that for most of the applications, validation accuracies on unseen subjects and unseen camera angles would be most important.

Classification Pose Estimation +1
