Search Results for author: Harshad Khadilkar

Found 26 papers, 7 papers with code

Multi-Agent Learning of Efficient Fulfilment and Routing Strategies in E-Commerce

no code implementations • 20 Nov 2023 • Omkar Shelke, Pranavi Pathakota, Anandsingh Chauhan, Harshad Khadilkar, Hardik Meisheri, Balaraman Ravindran

This paper presents an integrated algorithmic framework for minimising product delivery costs in e-commerce (known as the cost-to-serve or C2S).

Decision Making

Using General Value Functions to Learn Domain-Backed Inventory Management Policies

no code implementations • 3 Nov 2023 • Durgesh Kalwar, Omkar Shelke, Harshad Khadilkar

We consider the inventory management problem, where the goal is to balance conflicting objectives such as availability and wastage of a large range of products in a store.

Decision Making Management +1

Using Linear Regression for Iteratively Training Neural Networks

no code implementations • 11 Jul 2023 • Harshad Khadilkar

The key idea is the observation that the input to every neuron in a neural network is a linear combination of the activations of neurons in the previous layer, as well as the parameters (weights and biases) of the layer.

regression
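
The observation in the snippet above can be illustrated with a small sketch. This is not the paper's training algorithm; it only shows, under stated assumptions, that when the previous-layer activations are held fixed, a neuron's input is linear in its weights and bias, so those parameters can be recovered by ordinary least squares. All names here (`pre_activation`, `fit_neuron`, `solve_linear_system`) and the synthetic data are hypothetical.

```python
import random


def pre_activation(weights, bias, activations):
    """Input to one neuron: weighted sum of previous-layer activations plus a bias."""
    return sum(w * a for w, a in zip(weights, activations)) + bias


def solve_linear_system(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_neuron(activation_rows, target_inputs):
    """Least-squares fit of (weights, bias) via the normal equations X^T X p = X^T z."""
    X = [row + [1.0] for row in activation_rows]  # trailing 1.0 column models the bias
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xtz = [sum(r[i] * z for r, z in zip(X, target_inputs)) for i in range(k)]
    params = solve_linear_system(XtX, Xtz)
    return params[:-1], params[-1]


# Synthetic example (assumed values): 20 samples of 3 previous-layer activations.
rng = random.Random(0)
true_w, true_b = [0.5, -1.5, 2.0], 0.25
rows = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
targets = [pre_activation(true_w, true_b, r) for r in rows]

fit_w, fit_b = fit_neuron(rows, targets)
```

With noise-free targets, `fit_w` and `fit_b` match `true_w` and `true_b` up to floating-point error. How desired neuron inputs would be chosen during training is left open here, since the snippet does not say.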

DCT: Dual Channel Training of Action Embeddings for Reinforcement Learning with Large Discrete Action Spaces

no code implementations • 28 Jun 2023 • Pranavi Pathakota, Hardik Meisheri, Harshad Khadilkar

The ability to learn robust policies while generalizing over large discrete action spaces is an open challenge for intelligent systems, especially in noisy environments that face the curse of dimensionality.

Product Recommendation

Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas

no code implementations • 10 May 2023 • Harshad Khadilkar

We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL), through the use of evolutionary operators.

reinforcement-learning Reinforcement Learning (RL)

Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

no code implementations • 2 Mar 2022 • Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar

Exploration methods have been used to sample better trajectories in large environments while auxiliary tasks have been incorporated where the reward is sparse.

reinforcement-learning Reinforcement Learning (RL)

A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management

no code implementations • 2 Mar 2022 • Hardik Meisheri, Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Through empirical evaluations, it is further shown that inventory management with uncertain lead times is not only equivalent to that of delay in information sharing across multiple echelons (observation delay), but also that a model trained to handle one kind of delay is capable of handling delays of another kind without needing to be retrained.

Management Q-Learning

A simulation driven optimization algorithm for scheduling sorting center operations

no code implementations • 7 Dec 2021 • Supratim Ghosh, Aritra Pal, Prashant Kumar, Ankush Ojha, Aditya Paranjape, Souvik Barat, Harshad Khadilkar

Parcel sorting operations in logistics enterprises aim to achieve a high throughput of parcels through sorting centers.

Scheduling

Fast Approximate Solutions using Reinforcement Learning for Dynamic Capacitated Vehicle Routing with Time Windows

no code implementations • 24 Feb 2021 • Nazneen N Sultana, Vinita Baniwal, Ansuma Basumatary, Piyush Mittal, Supratim Ghosh, Harshad Khadilkar

This paper develops an inherently parallelised, fast, approximate learning-based solution to the generic class of Capacitated Vehicle Routing Problems with Time Windows and Dynamic Routing (CVRP-TWDR).

Reinforcement Learning (RL)

School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget

no code implementations • 23 Feb 2021 • Omkar Shelke, Hardik Meisheri, Harshad Khadilkar

In this paper, we focus on developing a curriculum for learning a robust and promising policy within a constrained computational budget of 100,000 games, starting from a fixed base policy (which is itself trained to imitate a noisy expert policy).

Reinforcement Learning (RL)

Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication

no code implementations • 1 Nov 2020 • Hardik Meisheri, Harshad Khadilkar

We describe our solution approach for Pommerman TeamRadio, a competition environment associated with NeurIPS 2019.

Imitation Learning

A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing

no code implementations • 1 Jul 2020 • Richa Verma, Aniruddha Singhal, Harshad Khadilkar, Ansuma Basumatary, Siddharth Nayak, Harsh Vardhan Singh, Swagat Kumar, Rajesh Sinha

We propose a Deep Reinforcement Learning (Deep RL) algorithm for solving the online 3D bin packing problem for an arbitrary number of bins and any bin size.

3D Bin Packing reinforcement-learning +1

SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning

no code implementations • 21 Apr 2020 • Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar

We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE.

reinforcement-learning Reinforcement Learning (RL)

Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning

1 code implementation • 31 Mar 2020 • Harshad Khadilkar, Tanuja Ganu, Deva P Seetharam

In the context of the ongoing Covid-19 pandemic, several reports and studies have attempted to model and predict the spread of the disease.

reinforcement-learning Reinforcement Learning (RL)

Accelerating Training in Pommerman with Imitation and Reinforcement Learning

no code implementations • 12 Nov 2019 • Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar

Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by reinforcement learning with proximal policy optimization (PPO).

Imitation Learning reinforcement-learning +1

Reinforcement Learning for Multi-Objective Optimization of Online Decisions in High-Dimensional Systems

no code implementations • 1 Oct 2019 • Hardik Meisheri, Vinita Baniwal, Nazneen N Sultana, Balaraman Ravindran, Harshad Khadilkar

This paper describes a purely data-driven solution to a class of sequential decision-making problems with a large number of concurrent online decisions, with applications to computing systems and operations research.

Decision Making Management +2
