Search Results for author: Yang Cai

Found 15 papers, 1 paper with code

Tractable Local Equilibria in Non-Concave Games

no code implementations • 13 Mar 2024 • Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng

While Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when the utilities are non-concave, a situation that is common in machine learning applications where the agents' strategies are parameterized by deep neural networks, or the agents' utilities are computed by a neural network, or both.

Multi-Scale Semantic Segmentation with Modified MBConv Blocks

no code implementations • 7 Feb 2024 • Xi Chen, Yang Cai, Yuan Wu, Bo Xiong, Taesung Park

Recently, MBConv blocks, initially designed for efficiency in resource-limited settings and later adapted for cutting-edge image classification performance, have demonstrated significant potential in image classification tasks.

Classification, Image Classification

Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

no code implementations • 26 Jan 2024 • Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng

In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium.

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

no code implementations • 1 Nov 2023 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$), and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice.

Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds

no code implementations • 29 Jun 2023 • Yang Cai, Michael I. Jordan, Tianyi Lin, Argyris Oikonomou, Emmanouil-Vasileios Vlatakis-Gkaragkounis

Numerous applications in machine learning and data analytics can be formulated as equilibrium computation over Riemannian manifolds.

User Response in Ad Auctions: An MDP Formulation of Long-Term Revenue Optimization

no code implementations • 16 Feb 2023 • Yang Cai, Zhe Feng, Christopher Liaw, Aranyak Mehta

We propose a new Markov Decision Process (MDP) model for ad auctions to capture the user response to the quality of ads, with the objective of maximizing the long-term discounted revenue.

Doubly Optimal No-Regret Learning in Monotone Games

1 code implementation • 30 Jan 2023 • Yang Cai, Weiqiang Zheng

We propose the accelerated optimistic gradient (AOG) algorithm, the first doubly optimal no-regret learning algorithm for smooth monotone games.

Accelerated Single-Call Methods for Constrained Min-Max Optimization

no code implementations • 6 Oct 2022 • Yang Cai, Weiqiang Zheng

Finally, we show that the Reflected Gradient (RG) method, another single-call single-projection algorithm, has an $O(\frac{1}{\sqrt{T}})$ last-iterate convergence rate for constrained convex-concave min-max optimization, answering an open problem of [Hsieh et al., 2019].

Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion

no code implementations • 10 Jun 2022 • Yang Cai, Argyris Oikonomou, Weiqiang Zheng

In our first contribution, we extend the Extra Anchored Gradient (EAG) algorithm, originally proposed by Yoon and Ryu (2021) for unconstrained min-max optimization, to constrained comonotone min-max optimization and comonotone inclusion, achieving an optimal convergence rate of $O\left(\frac{1}{T}\right)$ among all first-order methods.

Tight Last-Iterate Convergence of the Extragradient and the Optimistic Gradient Descent-Ascent Algorithm for Constrained Monotone Variational Inequalities

no code implementations • 20 Apr 2022 • Yang Cai, Argyris Oikonomou, Weiqiang Zheng

We use the tangent residual (or a slight variation of the tangent residual) as the potential function in our analysis of the extragradient algorithm (or the optimistic gradient descent-ascent algorithm) and prove that it is non-increasing between two consecutive iterates.

Recommender Systems meet Mechanism Design

no code implementations • 25 Oct 2021 • Yang Cai, Constantinos Daskalakis

We propose a mechanism design framework for this setting, building on a recent robustification framework by Brustle et al., which disentangles the statistical challenge of estimating a multi-dimensional prior from the task of designing a good mechanism for it, and robustifies the performance of the latter against the estimation error of the former.

Recommendation Systems, Topic Models

Multi-Item Mechanisms without Item-Independence: Learnability via Robustness

no code implementations • 6 Nov 2019 • Johannes Brustle, Yang Cai, Constantinos Daskalakis

When item values are sampled from more general graphical models, we combine our robustness theorem with novel sample complexity results for learning Markov Random Fields or Bayesian Networks in Prokhorov distance, which may be of independent interest.

Learning Safe Policies with Expert Guidance

no code implementations • NeurIPS 2018 • Jessie Huang, Fa Wu, Doina Precup, Yang Cai

We propose a framework for ensuring safe behavior of a reinforcement learning agent when the reward function may be difficult to specify.

Reinforcement Learning (RL)

Learning Multi-item Auctions with (or without) Samples

no code implementations • 1 Sep 2017 • Yang Cai, Constantinos Daskalakis

The second is a more general max-min learning setting that we introduce, where we are given "approximate distributions," and we seek to compute an auction whose revenue is approximately optimal simultaneously for all "true distributions" that are close to the given ones.

Optimum Statistical Estimation with Strategic Data Sources

no code implementations • 11 Aug 2014 • Yang Cai, Constantinos Daskalakis, Christos H. Papadimitriou

We propose an optimum mechanism for providing monetary incentives to the data sources of a statistical estimator such as linear regression, so that high-quality data is provided at low cost, in the sense that the sum of payments and estimation error is minimized.

regression
