no code implementations • 1 Dec 2023 • Sacha Morin, Somjit Nath, Samira Ebrahimi Kahou, Guy Wolf
This work is concerned with the temporal contrastive learning (TCL) setting, where the sequential structure of the data is used to define positive pairs, an approach more commonly used in RL and robotics contexts.
1 code implementation • 19 Jun 2023 • Nikunj Gupta, Somjit Nath, Samira Ebrahimi Kahou
Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system.
1 code implementation • 27 Apr 2023 • Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou
Deep reinforcement learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit using hand-crafted auxiliary tasks and pseudo-rewards.
no code implementations • 20 Sep 2022 • Somjit Nath, Rushiv Arora, Samira Ebrahimi Kahou
This encourages the representations to be driven not only by value/policy learning but also by an additional loss that keeps the representations from over-fitting to the value loss.
no code implementations • 2 Mar 2022 • Hardik Meisheri, Somjit Nath, Mayank Baranwal, Harshad Khadilkar
Through empirical evaluations, it is further shown that inventory management with uncertain lead times is not only equivalent to delay in information sharing across multiple echelons (observation delay), but also that a model trained to handle one kind of delay is capable of handling delays of another kind without being retrained.
no code implementations • 2 Mar 2022 • Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar
Exploration methods have been used to sample better trajectories in large environments, while auxiliary tasks have been incorporated where the reward is sparse.
1 code implementation • 17 Aug 2021 • Somjit Nath, Mayank Baranwal, Harshad Khadilkar
Several real-world scenarios, such as remote control and sensing, involve action and observation delays.
1 code implementation • 7 Jun 2020 • Nazneen N Sultana, Hardik Meisheri, Vinita Baniwal, Somjit Nath, Balaraman Ravindran, Harshad Khadilkar
This paper describes the application of reinforcement learning (RL) to multi-product inventory management in supply chains.
no code implementations • ICLR 2020 • Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White
Recurrent neural networks (RNNs) allow an agent to construct a state representation from a stream of experience, which is essential in partially observable problems.
no code implementations • 21 Apr 2020 • Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar
We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE.
no code implementations • ICLR 2019 • Wesley Chung, Somjit Nath, Ajin Joseph, Martha White
A key component of many reinforcement learning agents is learning a value function, either for policy evaluation or control.