Search Results for author: Dhruv Madeka

Found 12 papers, 1 paper with code

A Study on the Calibration of In-context Learning

no code implementations • 7 Dec 2023 • Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade

Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs).

In-Context Learning, Natural Language Understanding, +1

Learning an Inventory Control Policy with General Inventory Arrival Dynamics

no code implementations • 26 Oct 2023 • Sohrab Andaz, Carson Eisenach, Dhruv Madeka, Kari Torkkola, Randy Jia, Dean Foster, Sham Kakade

In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics -- which we term as a quantity-over-time arrivals model (QOT).

Contextual Bandits for Evaluating and Improving Inventory Control Policies

no code implementations • 24 Oct 2023 • Dean Foster, Randy Jia, Dhruv Madeka

Solutions to address the periodic review inventory control problem with nonstationary random demand, lost sales, and stochastic vendor lead times typically involve making strong assumptions on the dynamics for either approximation or simulation, and applying methods such as optimization, dynamic programming, or reinforcement learning.

Multi-Armed Bandits

Scaling Laws for Imitation Learning in Single-Agent Games

no code implementations • 18 Jul 2023 • Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade

Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly more capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.

Atari Games, Imitation Learning, +1

Linear Reinforcement Learning with Ball Structure Action Space

no code implementations • 14 Nov 2022 • Zeyu Jia, Randy Jia, Dhruv Madeka, Dean P. Foster

We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e., assuming the optimal action-value function is linear in a known $d$-dimensional feature mapping.

Reinforcement Learning (RL)

Deep Inventory Management

no code implementations • 6 Oct 2022 • Dhruv Madeka, Kari Torkkola, Carson Eisenach, Anna Luo, Dean P. Foster, Sham M. Kakade

This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching.

Management, Model-based Reinforcement Learning, +2

MQRetNN: Multi-Horizon Time Series Forecasting with Retrieval Augmentation

no code implementations • 21 Jul 2022 • Sitan Yang, Carson Eisenach, Dhruv Madeka

For example, MQTransformer - an improvement of MQCNN - has shown the state-of-the-art performance in probabilistic demand forecasting.

Probabilistic Time Series Forecasting, Retrieval, +1

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

no code implementations • 18 Jul 2022 • Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Towards establishing the minimal amount of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially-bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.

Imitation Learning, Reinforcement Learning (RL)

Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data

no code implementations • 14 Dec 2021 • Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan

The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground truth "label" on a portion of the RCT, to test the performance of an estimator trained on the other portion.

MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention

no code implementations • 30 Sep 2020 • Carson Eisenach, Yagna Patel, Dhruv Madeka

In this work, we propose novel improvements to the current state of the art by incorporating changes inspired by recent advances in Transformer architectures for Natural Language Processing.

Scatteract: Automated extraction of data from scatter plots

no code implementations • 21 Apr 2017 • Mathieu Cliche, David Rosenberg, Dhruv Madeka, Connie Yee

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points.

Optical Character Recognition (OCR)
