Search Results for author: Omkar Shelke

Found 5 papers, 0 papers with code

Multi-Agent Learning of Efficient Fulfilment and Routing Strategies in E-Commerce

no code implementations • 20 Nov 2023 • Omkar Shelke, Pranavi Pathakota, Anandsingh Chauhan, Harshad Khadilkar, Hardik Meisheri, Balaraman Ravindran

This paper presents an integrated algorithmic framework for minimising product delivery costs in e-commerce (known as the cost-to-serve or C2S).

Decision Making

Using General Value Functions to Learn Domain-Backed Inventory Management Policies

no code implementations • 3 Nov 2023 • Durgesh Kalwar, Omkar Shelke, Harshad Khadilkar

We consider the inventory management problem, where the goal is to balance conflicting objectives such as availability and wastage of a large range of products in a store.

Decision Making Management +1

Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

no code implementations • 2 Mar 2022 • Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar

Exploration methods have been used to sample better trajectories in large environments, while auxiliary tasks have been incorporated where the reward is sparse.

Reinforcement Learning (RL)

School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget

no code implementations • 23 Feb 2021 • Omkar Shelke, Hardik Meisheri, Harshad Khadilkar

In this paper, we focus on developing a curriculum for learning a robust and promising policy within a constrained computational budget of 100,000 games, starting from a fixed base policy (which is itself trained to imitate a noisy expert policy).

Reinforcement Learning (RL)

Accelerating Training in Pommerman with Imitation and Reinforcement Learning

no code implementations • 12 Nov 2019 • Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar

Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by reinforcement learning with proximal policy optimization (PPO).

Imitation Learning reinforcement-learning +1
