Search Results for author: Tiancheng Yu

Found 16 papers, 0 papers with code

Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition

no code implementations ICML 2020 Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the task of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.
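
The bandit-feedback challenge is that only the loss along the trajectory actually taken is observed. One standard ingredient for this (shown below in plain multi-armed-bandit form, illustrative only; the paper's algorithm instead runs mirror descent over occupancy measures with confidence sets for the unknown transition) is the EXP3-style importance-weighted loss estimate:

```python
import numpy as np

def exp3_step(weights, observe_loss, eta, rng):
    """One round of EXP3: sample an action, observe ONLY that action's
    loss (bandit feedback), and apply an unbiased importance-weighted
    exponential-weights update."""
    probs = weights / weights.sum()
    a = int(rng.choice(len(weights), p=probs))
    loss = observe_loss(a)          # scalar loss in [0, 1]
    loss_hat = loss / probs[a]      # unbiased estimate of the loss entry
    weights[a] *= np.exp(-eta * loss_hat)
    return a

# Hypothetical usage: 5 actions against adversarially chosen losses.
rng = np.random.default_rng(0)
w = np.ones(5)
for _ in range(100):
    exp3_step(w, lambda a: rng.uniform(), eta=0.05, rng=rng)
```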

The Power of Regularization in Solving Extensive-Form Games

no code implementations 19 Jun 2022 Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang

Second, we show that regularized counterfactual regret minimization (\texttt{Reg-CFR}), with a variant of the optimistic mirror descent algorithm as its regret minimizer, achieves $O(1/T^{1/4})$ best-iterate and $O(1/T^{3/4})$ average-iterate convergence rates for finding Nash equilibria (NE) in extensive-form games (EFGs).
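
For orientation, here is a minimal sketch of the optimistic mirror descent core on a single simplex with the entropy mirror map, reusing the last gradient as the prediction; the regularization and the counterfactual decomposition across information sets that make it \texttt{Reg-CFR} are omitted:

```python
import numpy as np

def optimistic_omd(grads, eta):
    """Optimistic mirror descent with the entropy mirror map on the simplex:
    iterate t is built from the running gradient sum plus the most recent
    gradient, reused as a prediction of the next one."""
    cum = np.zeros(len(grads[0]))   # sum of observed gradients
    pred = np.zeros_like(cum)       # prediction m_t (here: last gradient)
    iterates = []
    for g in grads:
        logits = -eta * (cum + pred)
        x = np.exp(logits - logits.max())   # numerically stable softmax
        iterates.append(x / x.sum())
        cum += g
        pred = g
    return iterates
```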

Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent

no code implementations 30 May 2022 Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs).
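
The caveat motivating a direct treatment of the EFG is blow-up: a pure NFG strategy commits to an action at every information set, so the number of pure strategies is the product of action counts over a player's information sets, exponential in the size of the EFG. A toy computation (numbers hypothetical):

```python
from math import prod

# Hypothetical player: 10 information sets with 3 actions each.
actions_per_infoset = [3] * 10

# Each pure NFG strategy fixes one action per information set.
print(prod(actions_per_infoset))   # 3**10 = 59049 pure strategies
```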

Near-Optimal Learning of Extensive-Form Games with Imperfect Information

no code implementations 3 Feb 2022 Yu Bai, Chi Jin, Song Mei, Tiancheng Yu

This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$, and matches the information-theoretic lower bound up to logarithmic factors.
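
Written out, the new bound is the previous one divided by the stated factor (with $X, Y$ the information-set counts and $A, B$ the action counts of the two players):

```latex
\frac{\widetilde{\mathcal{O}}\big((X^2 A + Y^2 B)/\varepsilon^2\big)}
     {\widetilde{\mathcal{O}}\big(\max\{X, Y\}\big)}
  = \widetilde{\mathcal{O}}\!\left(\frac{XA + YB}{\varepsilon^2}\right),
\quad\text{since}\quad
\frac{X^2 A}{\max\{X, Y\}} \le XA,
\qquad
\frac{Y^2 B}{\max\{X, Y\}} \le YB.
```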

V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL

no code implementations 27 Oct 2021 Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu

We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with $\max_{i\in[m]} A_i$, where $A_i$ is the number of actions for the $i^{\rm th}$ player.
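
A stripped-down, single-agent view of the mechanism, assuming rewards and value targets scaled into $[0, 1]$ (the class name and learning rates below are hypothetical simplifications; the paper's optimism bonuses and tuned $\alpha_t$ schedule are omitted):

```python
import numpy as np

class VLearnerSketch:
    """Each state keeps a value estimate plus an adversarial-bandit learner
    over this agent's OWN actions, so per-state memory is O(A_i) rather
    than the size of the joint action space."""

    def __init__(self, n_states, n_actions, eta=0.1, seed=0):
        self.V = np.zeros(n_states)
        self.weights = np.ones((n_states, n_actions))  # bandit weights
        self.visits = np.zeros(n_states, dtype=int)
        self.eta = eta
        self.rng = np.random.default_rng(seed)

    def act(self, s):
        probs = self.weights[s] / self.weights[s].sum()
        a = int(self.rng.choice(len(probs), p=probs))
        return a, probs

    def update(self, s, a, probs, reward, v_next):
        # Incremental value update (simple averaging in place of the
        # paper's alpha_t schedule).
        self.visits[s] += 1
        target = reward + v_next
        self.V[s] += (target - self.V[s]) / self.visits[s]
        # EXP3-style importance-weighted update over own actions only;
        # assumes target lies in [0, 1] so the loss is bounded.
        loss_hat = (1.0 - target) / probs[a]
        self.weights[s, a] *= np.exp(-self.eta * loss_hat)
```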

Q-Learning

The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces

no code implementations 7 Jun 2021 Chi Jin, Qinghua Liu, Tiancheng Yu

Modern reinforcement learning (RL) commonly tackles practical problems with large state spaces, where function approximation must be deployed to approximate either the value function or the policy.

Reinforcement Learning (RL)

Provably Efficient Algorithms for Multi-Objective Competitive RL

no code implementations 5 Feb 2021 Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

To our knowledge, this work provides the first provably efficient algorithms for vector-valued Markov games, and our theoretical guarantees are near-optimal.

Multi-Objective Reinforcement Learning

Online Learning in Unknown Markov Games

no code implementations 28 Oct 2020 Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra

We study online learning in unknown Markov games, a problem that arises in episodic multi-agent reinforcement learning where the actions of the opponents are unobservable.

Multi-agent Reinforcement Learning

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

no code implementations 4 Oct 2020 Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

However, for multi-agent reinforcement learning in Markov games, the current best known sample complexity for model-based algorithms is rather suboptimal and compares unfavorably against recent model-free approaches.

Model-based Reinforcement Learning Multi-agent Reinforcement Learning

Near-Optimal Reinforcement Learning with Self-Play

no code implementations NeurIPS 2020 Yu Bai, Chi Jin, Tiancheng Yu

This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games.
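
A recurring subroutine in this setting is computing a minimax (Nash) strategy of a single matrix game; a minimal LP sketch of that step using scipy (illustrative only, not the paper's algorithm):

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(G):
    """Row player's Nash strategy and the game value for payoff matrix G
    (row player maximizes) via the classic linear program:
    maximize v subject to G^T x >= v, x in the simplex."""
    m, n = G.shape
    c = np.zeros(m + 1); c[-1] = -1.0          # variables (x, v); minimize -v
    A_ub = np.hstack([-G.T, np.ones((n, 1))])  # v - (G^T x)_j <= 0 for all j
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum(x) = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching pennies: uniform strategy, value 0.
x, v = solve_zero_sum(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```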

Q-Learning reinforcement-learning

A General Framework for Analyzing Stochastic Dynamics in Learning Algorithms

no code implementations 11 Jun 2020 Chi-Ning Chou, Juspreet Singh Sandhu, Mien Brabeeba Wang, Tiancheng Yu

In this work, we present a streamlined three-step recipe to tackle the "chicken and egg" problem and give a general framework for analyzing stochastic dynamics in learning algorithms.

Reward-Free Exploration for Reinforcement Learning

no code implementations ICML 2020 Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.
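
The paradigm separates a reward-free exploration phase from reward-specific planning. A minimal sketch of the planning half, assuming exploration has already produced transition counts (the exploration phase, the paper's main contribution, is omitted):

```python
import numpy as np

def plan_from_counts(counts, reward, H):
    """Value iteration on the empirical model built from exploration data.
    counts[s, a, s2] are visit counts collected without any reward signal;
    reward[h, s, a] can be specified arbitrarily afterwards."""
    S, A, _ = counts.shape
    n_sa = counts.sum(axis=2, keepdims=True)
    P_hat = np.where(n_sa > 0, counts / np.maximum(n_sa, 1), 1.0 / S)
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = reward[h] + P_hat @ V      # (S, A) Bellman backup
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy, V
```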

Reinforcement Learning (RL)

Learning Adversarial MDPs with Bandit Feedback and Unknown Transition

no code implementations 3 Dec 2019 Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.

Near Optimal Stratified Sampling

no code implementations 26 Jun 2019 Tiancheng Yu, Xiyu Zhai, Suvrit Sra

The performance of a machine learning system is usually evaluated using i.i.d. observations with true labels.
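
For reference, when the per-stratum standard deviations are known, the variance-minimizing design is the classical Neyman allocation, $n_k \propto w_k \sigma_k$; a minimal sketch (function name and strata hypothetical):

```python
import numpy as np

def neyman_allocation(weights, sigmas, budget):
    """Split a labeling budget across strata in proportion to
    stratum weight times stratum standard deviation."""
    scores = np.asarray(weights) * np.asarray(sigmas)
    alloc = budget * scores / scores.sum()
    return np.maximum(1, np.round(alloc).astype(int))

# Hypothetical: three strata covering 50% / 30% / 20% of the population.
print(neyman_allocation([0.5, 0.3, 0.2], [1.0, 2.0, 0.5], budget=1000))
```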

Entropy Rate Estimation for Markov Chains with Large State Space

no code implementations NeurIPS 2018 Yanjun Han, Jiantao Jiao, Chuan-Zheng Lee, Tsachy Weissman, Yihong Wu, Tiancheng Yu

For estimating the Shannon entropy of a distribution on $S$ elements with independent samples, [Paninski2004] showed that the sample complexity is sublinear in $S$, and [Valiant--Valiant2011] showed that consistent estimation of Shannon entropy is possible if and only if the sample size $n$ far exceeds $\frac{S}{\log S}$.
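
As a reference point for these rates, the naive plug-in estimator is sketched below; the cited results concern how far the sample size can be pushed below the support size with more careful estimators:

```python
import numpy as np

def plugin_entropy(samples, S):
    """Plug-in Shannon entropy (in nats) of the empirical distribution.
    This naive estimator needs a sample size on the order of S, whereas
    bias-corrected estimators remain consistent once n >> S / log S."""
    counts = np.bincount(samples, minlength=S)
    p = counts[counts > 0] / len(samples)
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
print(plugin_entropy(rng.integers(0, 100, size=10_000), S=100))
```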

Language Modelling
