Search Results for author: Yunfan Zhao

Found 8 papers, 4 papers with code

A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

no code implementations • 22 Feb 2024 • Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

Efforts to reduce maternal mortality rate, a key UN Sustainable Development target (SDG Target 3.1), rely largely on preventative care programs to spread critical health information to high-risk populations.

Language Modelling

Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

no code implementations • 23 Oct 2023 • Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective.

Multi-agent Reinforcement Learning • Multi-Armed Bandits • +1

Scalable Neural Network Kernels

1 code implementation • 20 Oct 2023 • Arijit Sehanobish, Krzysztof Choromanski, Yunfan Zhao, Avinava Dubey, Valerii Likhosherstov

We introduce the concept of scalable neural network kernels (SNNKs), replacements for regular feedforward layers (FFLs) that are capable of approximating the latter, but with favorable computational properties.

Balanced Off-Policy Evaluation for Personalized Pricing

2 code implementations • 24 Feb 2023 • Adam N. Elmachtoub, Vishal Gupta, Yunfan Zhao

We consider a personalized pricing problem in which we have data consisting of feature information, historical pricing decisions, and binary realized demand.

Off-policy evaluation

Implicit Two-Tower Policies

no code implementations • 2 Aug 2022 • Yunfan Zhao, Qingkai Pan, Krzysztof Choromanski, Deepali Jain, Vikas Sindhwani

We present a new class of structured reinforcement learning policy-architectures, Implicit Two-Tower (ITT) policies, where the actions are chosen based on the attention scores of their learnable latent representations with those of the input states.

OpenAI Gym • Vocal Bursts Valence Prediction

Nuances in Margin Conditions Determine Gains in Active Learning

no code implementations • 16 Oct 2021 • Samory Kpotufe, Gan Yuan, Yunfan Zhao

We consider nonparametric classification with smooth regression functions, where it is well known that notions of margin in $E[Y|X]$ determine fast or slow rates in both active and passive learning.

Active Learning
