no code implementations • ICML 2020 • Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu
We consider the task of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.
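For context, the objective in this adversarial setting is to minimize regret against the best fixed policy in hindsight; in a standard formulation (our notation, not quoted from the paper),

\[
\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \ell_t(\pi_t) \;-\; \min_{\pi} \sum_{t=1}^{T} \ell_t(\pi),
\]

where $\ell_t(\pi)$ denotes the expected loss of policy $\pi$ under the adversarially chosen loss function of episode $t$, and $\pi_t$ is the learner's policy in that episode.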
no code implementations • 19 Jun 2022 • Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang
Second, we show that regularized counterfactual regret minimization (\texttt{Reg-CFR}), with a variant of the optimistic mirror descent algorithm as its regret minimizer, achieves $O(1/T^{1/4})$ best-iterate and $O(1/T^{3/4})$ average-iterate convergence rates for finding NE in EFGs.
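As a sketch of the regret minimizer involved: optimistic multiplicative weights is one instance of optimistic mirror descent with an entropy regularizer. The game, the step size, and the self-play loop below are illustrative choices for a sanity check, not the paper's Reg-CFR construction.

```python
import numpy as np

def omw_step(x, g_curr, g_prev, eta):
    """Optimistic hedge in recursive form: x_{t+1} ∝ x_t * exp(-eta*(2 g_t - g_{t-1})).

    The extra (g_t - g_{t-1}) term is the optimistic prediction of the next
    loss, which is what yields faster-than-O(1/sqrt(T)) rates in games.
    """
    x_new = x * np.exp(-eta * (2.0 * g_curr - g_prev))
    return x_new / x_new.sum()

# Self-play sanity check on matching pennies (a 2x2 zero-sum game).
G = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = np.ones(2) / 2, np.ones(2) / 2
gx_prev, gy_prev = np.zeros(2), np.zeros(2)
eta = 0.1
for _ in range(5000):
    gx, gy = G @ y, -G.T @ x          # loss gradients for the two players
    x = omw_step(x, gx, gx_prev, eta)
    y = omw_step(y, gy, gy_prev, eta)
    gx_prev, gy_prev = gx, gy
print(x, y)  # both iterates approach the uniform Nash equilibrium (0.5, 0.5)
```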
no code implementations • 30 May 2022 • Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu
A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs).
no code implementations • 3 Feb 2022 • Yu Bai, Chi Jin, Song Mei, Tiancheng Yu
This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$, and matches the information-theoretic lower bound up to logarithmic factors.
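A quick sanity check on the claimed factor, assuming (our reading of the result) that the improved bound is $\widetilde{\mathcal{O}}((XA+YB)/\varepsilon^2)$:

\[
\frac{X^2A + Y^2B}{XA + YB} \;\le\; \frac{\max\{X,Y\}\,(XA + YB)}{XA + YB} \;=\; \max\{X,Y\},
\]

so the earlier bound exceeds the new one by at most a $\max\{X,Y\}$ factor, consistent with the stated improvement up to logarithmic factors.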
no code implementations • 27 Oct 2021 • Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu
We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with $\max_{i\in[m]} A_i$, where $A_i$ is the number of actions for the $i^{\rm th}$ player.
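A minimal single-agent sketch of a V-learning-style update (our simplification: the paper's exploration bonus and exact bandit subroutine are elided). The point the code illustrates is that each player maintains a bandit over only its own $A_i$ actions at every visited state, so nothing scales with the joint action space.

```python
import numpy as np

H, S, A = 5, 10, 4   # horizon, states, own actions (illustrative sizes)
eta = 0.1            # bandit learning rate (illustrative)

V = np.full((H + 1, S), float(H))    # optimistic value estimates
V[H] = 0.0                           # no value after the final step
weights = np.ones((H, S, A))         # exponential-weights bandit per (h, s)
counts = np.zeros((H, S), dtype=int)

def act(h, s):
    """Sample an action from this state's own-action bandit distribution."""
    p = weights[h, s] / weights[h, s].sum()
    return np.random.choice(A, p=p), p

def update(h, s, a, p, r, s_next):
    """Incremental value update after observing reward r and next state s'."""
    counts[h, s] += 1
    t = counts[h, s]
    alpha = (H + 1) / (H + t)         # the paper's learning-rate schedule
    target = r + V[h + 1, s_next]     # exploration bonus elided for brevity
    V[h, s] = min(float(H), (1 - alpha) * V[h, s] + alpha * target)
    # Feed an importance-weighted loss for the played action to the bandit.
    loss = (H - target) / H           # rescaled to roughly [0, 1]
    weights[h, s, a] *= np.exp(-eta * loss / p[a])
```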
no code implementations • 7 Jun 2021 • Chi Jin, Qinghua Liu, Tiancheng Yu
Modern reinforcement learning (RL) is commonly applied to practical problems with large state spaces, where function approximation must be deployed to approximate either the value function or the policy.
no code implementations • 5 Feb 2021 • Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra
To our knowledge, this work provides the first provably efficient algorithms for vector-valued Markov games and our theoretical guarantees are near-optimal.
no code implementations • 28 Oct 2020 • Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra
We study online learning in unknown Markov games, a problem that arises in episodic multi-agent reinforcement learning where the actions of the opponents are unobservable.
no code implementations • 4 Oct 2020 • Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin
However, for multi-agent reinforcement learning in Markov games, the best known sample complexity for model-based algorithms remains suboptimal, comparing unfavorably with recent model-free approaches.
no code implementations • NeurIPS 2020 • Yu Bai, Chi Jin, Tiancheng Yu
This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games.
no code implementations • 11 Jun 2020 • Chi-Ning Chou, Juspreet Singh Sandhu, Mien Brabeeba Wang, Tiancheng Yu
In this work, we present a streamlined three-step recipe to tackle the "chicken and egg" problem and give a general framework for analyzing stochastic dynamics in learning algorithms.
no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu
We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.
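For intuition, reward-free exploration follows a two-phase template: explore without observing any reward, then plan for arbitrary rewards on the learned model. A minimal tabular sketch (uniform exploration and Laplace smoothing stand in for the paper's deliberate exploration scheme):

```python
import numpy as np

S, A, H = 6, 3, 5  # illustrative sizes

def explore(num_episodes, step, s0=0):
    """Phase 1: collect transition counts with a uniform policy.

    `step(s, a)` is a user-supplied environment transition; no rewards
    are observed or needed in this phase.
    """
    counts = np.zeros((S, A, S))
    for _ in range(num_episodes):
        s = s0
        for _ in range(H):
            a = np.random.randint(A)
            s_next = step(s, a)
            counts[s, a, s_next] += 1
            s = s_next
    return (counts + 1) / (counts.sum(axis=2, keepdims=True) + S)  # smoothed model

def plan(P_hat, reward):
    """Phase 2: value iteration on the learned model for ANY reward[h][s][a]."""
    V = np.zeros(S)
    pi = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = reward[h] + P_hat @ V     # Q[s, a] = r_h(s, a) + sum_s' P_hat(s'|s,a) V(s')
        pi[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi
```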
no code implementations • 3 Dec 2019 • Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu
We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.
no code implementations • 22 Jul 2019 • Tiancheng Yu, Suvrit Sra
A Markov Decision Process (MDP) is a popular model for reinforcement learning.
no code implementations • 26 Jun 2019 • Tiancheng Yu, Xiyu Zhai, Suvrit Sra
The performance of a machine learning system is usually evaluated using i.i.d.\ observations with true labels.
no code implementations • NeurIPS 2018 • Yanjun Han, Jiantao Jiao, Chuan-Zheng Lee, Tsachy Weissman, Yihong Wu, Tiancheng Yu
For estimating the Shannon entropy of a distribution on $S$ elements with independent samples, [Paninski, 2004] showed that the sample complexity is sublinear in $S$, and [Valiant and Valiant, 2011] showed that consistent estimation of Shannon entropy is possible if and only if the sample size $n$ far exceeds $\frac{S}{\log S}$.
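For contrast with these rates: the naive plug-in (maximum-likelihood) estimator below is consistent only when $n \gg S$, which is exactly the gap the refined estimators close (an illustrative sketch, not the estimator from the paper):

```python
import numpy as np

def plugin_entropy(samples, S):
    """Plug-in Shannon entropy (in nats) of the empirical distribution."""
    counts = np.bincount(samples, minlength=S)
    p = counts / counts.sum()
    p = p[p > 0]                      # 0 * log 0 = 0 by convention
    return -np.sum(p * np.log(p))

samples = np.random.randint(0, 1000, size=5000)   # uniform over S = 1000
print(plugin_entropy(samples, 1000))  # below log(1000) ~ 6.91: biased low at small n
```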