no code implementations • 22 Apr 2024 • Lili Wu, Ben Evans, Riashat Islam, Raihan Seraj, Yonathan Efroni, Alex Lamb
In this work, we consider the problem of discovering the agent-centric state in the more challenging high-dimensional non-Markovian setting, when the state can be decoded from a sequence of past observations.
no code implementations • 6 Nov 2023 • Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb
Goal-conditioned planning benefits from learned low-dimensional representations of rich, high-dimensional observations.
no code implementations • 3 Oct 2023 • Qi Yan, Raihan Seraj, JiaWei He, Lili Meng, Tristan Sylvain
Following this, the chosen articles are subjected to zero-shot summarization to attain succinct context.
1 code implementation • 4 Feb 2022 • Raihan Seraj, Jivitesh Sharma, Ole-Christoffer Granmo
This paper introduces an interpretable contextual bandit algorithm using Tsetlin Machines, which solves complex pattern recognition tasks using propositional logic.
1 code implementation • 17 Oct 2020 • Jayakumar Subramanian, Amit Sinha, Raihan Seraj, Aditya Mahajan
Our key result is to show that if a function of the history (called approximate information state (AIS)) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program.
no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Samin Yeasar Arnob, Doina Precup
Furthermore, when the reward function is stochastic, which can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.
no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup
In this work, we propose exploration in policy gradient methods based on maximizing the entropy of the discounted future state distribution.
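To make the quantity named in the last abstract concrete, here is a minimal sketch, not the paper's method: it evaluates the entropy of the discounted future state distribution for a fixed policy in a small tabular setting, using the standard definition d(s) = (1 - gamma) * sum_t gamma^t * Pr(s_t = s). The transition matrices `P_STAY` and `P_UNIFORM`, the start distribution `MU0`, and the truncation `horizon` are hypothetical illustration values, and no policy gradient step is performed.

```python
import math


def discounted_state_distribution(P, mu0, gamma, horizon=2000):
    """Truncated d(s) = (1 - gamma) * sum_t gamma^t * Pr(s_t = s).

    P[s][s2] is the state-to-state transition probability already induced
    by a fixed policy (assumption: tabular, finite state space).
    """
    n = len(mu0)
    d = [0.0] * n
    p_t = list(mu0)          # state distribution at time t
    weight = 1.0 - gamma     # (1 - gamma) * gamma^t
    for _ in range(horizon):
        for s in range(n):
            d[s] += weight * p_t[s]
        # Push the distribution one step forward: p_{t+1} = p_t P
        p_t = [sum(p_t[s] * P[s][s2] for s in range(n)) for s2 in range(n)]
        weight *= gamma
    return d


def entropy(dist):
    """Shannon entropy in nats, skipping zero-probability states."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


# Hypothetical 3-state chains: one mostly stays put, one mixes uniformly.
P_STAY = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]
P_UNIFORM = [[1 / 3, 1 / 3, 1 / 3] for _ in range(3)]
MU0 = [1.0, 0.0, 0.0]

d_stay = discounted_state_distribution(P_STAY, MU0, gamma=0.9)
d_uniform = discounted_state_distribution(P_UNIFORM, MU0, gamma=0.9)
print("entropy (stay):", entropy(d_stay))
print("entropy (uniform):", entropy(d_uniform))
```

In this toy setting the chain that spreads probability mass across states has the higher entropy, which is the kind of behavior an entropy-maximizing exploration objective would favor.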