Search Results for author: Dhruv Madeka

Found 12 papers, 1 paper with code

A Study on the Calibration of In-context Learning

no code implementations • 7 Dec 2023 • Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade

Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs).

In-Context Learning Natural Language Understanding +1

Learning an Inventory Control Policy with General Inventory Arrival Dynamics

no code implementations • 26 Oct 2023 • Sohrab Andaz, Carson Eisenach, Dhruv Madeka, Kari Torkkola, Randy Jia, Dean Foster, Sham Kakade

In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics, which we term a quantity-over-time (QOT) arrivals model.

Contextual Bandits for Evaluating and Improving Inventory Control Policies

no code implementations • 24 Oct 2023 • Dean Foster, Randy Jia, Dhruv Madeka

Solutions to address the periodic review inventory control problem with nonstationary random demand, lost sales, and stochastic vendor lead times typically involve making strong assumptions on the dynamics for either approximation or simulation, and applying methods such as optimization, dynamic programming, or reinforcement learning.

Multi-Armed Bandits

Scaling Laws for Imitation Learning in Single-Agent Games

no code implementations • 18 Jul 2023 • Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade

Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.

Atari Games Imitation Learning +1

Linear Reinforcement Learning with Ball Structure Action Space

no code implementations • 14 Nov 2022 • Zeyu Jia, Randy Jia, Dhruv Madeka, Dean P. Foster

We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e., assuming the optimal action-value function is linear in a known $d$-dimensional feature mapping.

Reinforcement Learning (RL)

Deep Inventory Management

no code implementations • 6 Oct 2022 • Dhruv Madeka, Kari Torkkola, Carson Eisenach, Anna Luo, Dean P. Foster, Sham M. Kakade

This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching.

Management Model-based Reinforcement Learning +2

MQRetNN: Multi-Horizon Time Series Forecasting with Retrieval Augmentation

no code implementations • 21 Jul 2022 • Sitan Yang, Carson Eisenach, Dhruv Madeka

For example, MQTransformer, an improvement on MQCNN, has shown state-of-the-art performance in probabilistic demand forecasting.

Probabilistic Time Series Forecasting Retrieval +1

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

no code implementations • 18 Jul 2022 • Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Towards establishing the minimal amount of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially-bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.

Imitation Learning Reinforcement Learning (RL)

Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data

no code implementations • 14 Dec 2021 • Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan

The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground truth "label" on a portion of the RCT, to test the performance of an estimator trained on the other portion.
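
The split-and-score idea in this excerpt can be illustrated with a minimal sketch. It is not the authors' procedure: the random half split, the squared-error score, and the names `heldout_squared_error` and `median_difference` are all assumptions made for illustration.

```python
import random
from statistics import mean, median


def difference_of_means(treated, control):
    """Noisy but unbiased estimate of the average treatment effect."""
    return mean(treated) - mean(control)


def median_difference(treated, control):
    """Hypothetical candidate estimator: difference of medians."""
    return median(treated) - median(control)


def heldout_squared_error(estimator, treated, control, seed=0):
    """Score `estimator` against a difference-of-means label on held-out data.

    Each arm is shuffled and split in half (an assumed split rule). The
    estimator sees only the first halves; the second halves supply the
    noisy ground-truth "label".
    """
    rng = random.Random(seed)
    treated, control = list(treated), list(control)
    rng.shuffle(treated)
    rng.shuffle(control)
    t_mid, c_mid = len(treated) // 2, len(control) // 2

    estimate = estimator(treated[:t_mid], control[:c_mid])
    label = difference_of_means(treated[t_mid:], control[c_mid:])
    return (estimate - label) ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic heavy-tailed responses (Pareto draws), true effect unknown here.
    control = [rng.paretovariate(3.0) for _ in range(2000)]
    treated = [rng.paretovariate(3.0) + 0.5 for _ in range(2000)]
    for name, est in [("mean", difference_of_means), ("median", median_difference)]:
        print(name, heldout_squared_error(est, treated, control))
```

Because the held-out label is itself noisy, a single split gives a noisy score; averaging over several seeds is one simple way to stabilise the comparison between candidate estimators.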

MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention

no code implementations • 30 Sep 2020 • Carson Eisenach, Yagna Patel, Dhruv Madeka

In this work, we propose novel improvements to the current state of the art by incorporating changes inspired by recent advances in Transformer architectures for Natural Language Processing.

A Multi-Horizon Quantile Recurrent Forecaster

5 code implementations • 29 Nov 2017 • Ruofeng Wen, Kari Torkkola, Balakrishnan Narayanaswamy, Dhruv Madeka

We propose a framework for general probabilistic multi-step time series regression.
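
The excerpt does not say which training objective the framework uses. As a hedged illustration only, the sketch below shows the pinball (quantile) loss, the standard objective for quantile regression, averaged over several forecast horizons. The function names and the list-of-dicts forecast layout are assumptions, not part of the paper.

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss of one prediction at quantile level q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def mean_pinball_loss(targets, forecasts):
    """Average pinball loss over all horizons and quantile levels.

    targets:   one observed value per horizon
    forecasts: one dict per horizon mapping quantile level -> predicted value
               (an assumed layout chosen for this sketch)
    """
    losses = [
        pinball_loss(y, pred, q)
        for y, by_quantile in zip(targets, forecasts)
        for q, pred in by_quantile.items()
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    observed = [10.0, 12.0, 9.0]  # three forecast horizons
    predicted = [
        {0.5: 10.5, 0.9: 13.0},
        {0.5: 11.0, 0.9: 14.0},
        {0.5: 9.5, 0.9: 12.0},
    ]
    print(mean_pinball_loss(observed, predicted))
```

Minimising this loss at level q pushes the prediction toward the q-th conditional quantile, which is why a set of levels (for example 0.5 and 0.9) yields a probabilistic forecast rather than a single point estimate.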

regression Time Series +1

Scatteract: Automated extraction of data from scatter plots

no code implementations • 21 Apr 2017 • Mathieu Cliche, David Rosenberg, Dhruv Madeka, Connie Yee

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points.

Optical Character Recognition (OCR)
