Search Results for author: Nirjhar Das

Found 4 papers, 1 paper with code

Optimal Regret with Limited Adaptivity for Generalized Linear Contextual Bandits

1 code implementation · 10 Apr 2024 · Ayush Sawarni, Nirjhar Das, Siddharth Barman, Gaurav Sinha

For our batch learning algorithm B-GLinCB, with $\Omega\left( \log{\log T} \right)$ batches, the regret scales as $\tilde{O}(\sqrt{T})$.

Multi-Armed Bandits

Provably Sample Efficient RLHF via Active Preference Optimization

no code implementations · 16 Feb 2024 · Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

Experimental evaluations on a human preference dataset validate APO's efficacy as a sample-efficient and practical solution to data collection for RLHF, facilitating alignment of LLMs with human preferences in a cost-effective and scalable manner.

Inverse Reinforcement Learning With Constraint Recovery

no code implementations · 14 May 2023 · Nirjhar Das, Arpan Chattopadhyay

In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems.

reinforcement-learning

A View Independent Classification Framework for Yoga Postures

no code implementations · 27 Jun 2022 · Mustafa Chasmai, Nirjhar Das, Aman Bhardwaj, Rahul Garg

We argue that for most of the applications, validation accuracies on unseen subjects and unseen camera angles would be most important.

Classification, Pose Estimation (+1 more)
