Search Results for author: Yan Dai

Found 11 papers, 1 paper with code

Refined Sample Complexity for Markov Games with Independent Linear Function Approximation

no code implementations • 11 Feb 2024 • Yan Dai, Qiwen Cui, Simon S. Du

The Markov Game (MG) is an important model for Multi-Agent Reinforcement Learning (MARL).

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise

no code implementations • 2 Feb 2024 • Kwangjun Ahn, ZhiYu Zhang, Yunbum Kook, Yan Dai

In this work, we provide a different perspective based on online learning that underscores the importance of Adam's algorithmic components.

Refined Regret for Adversarial MDPs with Linear Function Approximation

no code implementations • 30 Jan 2023 • Yan Dai, Haipeng Luo, Chen-Yu Wei, Julian Zimmert

This analysis allows the loss estimators to be arbitrarily negative and might be of independent interest.

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

no code implementations • 25 Jan 2023 • Jiatai Huang, Yan Dai, Longbo Huang

\texttt{Banker-OMD} leads to the first delayed scale-free adversarial MAB algorithm achieving $\widetilde{\mathcal O}(\sqrt{K}L(\sqrt T+\sqrt D))$ regret and the first delayed adversarial linear bandit algorithm achieving $\widetilde{\mathcal O}(\text{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret.

Multi-Armed Bandits

Skeleton-based Action Recognition via Adaptive Cross-Form Learning

1 code implementation • 30 Jun 2022 • Xuanhan Wang, Yan Dai, Lianli Gao, Jingkuan Song

Specifically, each GCN model in ACFL not only learns action representation from the single-form skeletons, but also adaptively mimics useful representations derived from other forms of skeletons.

Action Recognition • Skeleton Based Action Recognition

Variance-Aware Sparse Linear Bandits

no code implementations • 26 May 2022 • Yan Dai, Ruosong Wang, Simon S. Du

On the other hand, in the benign setting where there is no noise and the action set is the unit sphere, one can use divide-and-conquer to achieve $\widetilde{\mathcal O}(1)$ regret, which is (nearly) independent of $d$ and $T$.

Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

no code implementations • 26 May 2022 • Yan Dai, Haipeng Luo, Liyu Chen

More importantly, we then find two significant applications: First, the analysis of FTPL turns out to be readily generalizable to delayed bandit feedback with order-optimal regret, while OMD methods exhibit extra difficulties (Jin et al., 2022).

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits

no code implementations • 28 Jan 2022 • Jiatai Huang, Yan Dai, Longbo Huang

Specifically, we design an algorithm \texttt{HTINF}: when the heavy-tail parameters $\alpha$ and $\sigma$ are known to the agent, \texttt{HTINF} simultaneously achieves the optimal regret for both stochastic and adversarial environments, without knowing the actual environment type a priori.

Multi-Armed Bandits

Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays

no code implementations • 26 Oct 2021 • Jiatai Huang, Yan Dai, Longbo Huang

We consider the Scale-Free Adversarial Multi-Armed Bandit (MAB) problem with unrestricted feedback delays.

From Knowledge Map to Mind Map: Artificial Imagination

no code implementations • 4 Mar 2019 • Ruixue Liu, Baoyang Chen, Xiaoyu Guo, Yan Dai, Meng Chen, Zhijie Qiu, Xiaodong He

Imagination is one of the most important factors which makes an artistic painting unique and impressive.

Deploy Large-Scale Deep Neural Networks in Resource Constrained IoT Devices with Local Quantization Region

no code implementations • 24 May 2018 • Yi Yang, Andy Chen, Xiaoming Chen, Jiang Ji, Zhenyang Chen, Yan Dai

Implementing large-scale deep neural networks with high computational complexity on low-cost IoT devices is inevitably constrained by limited computational resources, making it hard for the devices to respond in real time.

Quantization
