Search Results for author: Yan Dai

Found 11 papers, 1 paper with code

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise

no code implementations • 2 Feb 2024 • Kwangjun Ahn, ZhiYu Zhang, Yunbum Kook, Yan Dai

In this work, we provide a different perspective based on online learning that underscores the importance of Adam's algorithmic components.

Refined Regret for Adversarial MDPs with Linear Function Approximation

no code implementations • 30 Jan 2023 • Yan Dai, Haipeng Luo, Chen-Yu Wei, Julian Zimmert

This analysis allows the loss estimators to be arbitrarily negative and might be of independent interest.

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

no code implementations • 25 Jan 2023 • Jiatai Huang, Yan Dai, Longbo Huang

\texttt{Banker-OMD} leads to the first delayed scale-free adversarial MAB algorithm achieving $\widetilde{\mathcal O}(\sqrt{K}L(\sqrt T+\sqrt D))$ regret and the first delayed adversarial linear bandit algorithm achieving $\widetilde{\mathcal O}(\text{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret.

Multi-Armed Bandits

Skeleton-based Action Recognition via Adaptive Cross-Form Learning

1 code implementation • 30 Jun 2022 • Xuanhan Wang, Yan Dai, Lianli Gao, Jingkuan Song

Specifically, each GCN model in ACFL not only learns action representation from the single-form skeletons, but also adaptively mimics useful representations derived from other forms of skeletons.

Action Recognition, Skeleton Based Action Recognition

Variance-Aware Sparse Linear Bandits

no code implementations • 26 May 2022 • Yan Dai, Ruosong Wang, Simon S. Du

On the other hand, in the benign setting where there is no noise and the action set is the unit sphere, one can use divide-and-conquer to achieve $\widetilde{\mathcal O}(1)$ regret, which is (nearly) independent of $d$ and $T$.

Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

no code implementations • 26 May 2022 • Yan Dai, Haipeng Luo, Liyu Chen

More importantly, we then find two significant applications: First, the analysis of FTPL turns out to be readily generalizable to delayed bandit feedback with order-optimal regret, while OMD methods exhibit extra difficulties (Jin et al., 2022).

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits

no code implementations • 28 Jan 2022 • Jiatai Huang, Yan Dai, Longbo Huang

Specifically, we design an algorithm, \texttt{HTINF}. When the heavy-tail parameters $\alpha$ and $\sigma$ are known to the agent, \texttt{HTINF} simultaneously achieves the optimal regret for both stochastic and adversarial environments, without knowing the actual environment type a priori.

Multi-Armed Bandits

Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays

no code implementations • 26 Oct 2021 • Jiatai Huang, Yan Dai, Longbo Huang

We consider the Scale-Free Adversarial Multi-Armed Bandit (MAB) problem with unrestricted feedback delays.

From Knowledge Map to Mind Map: Artificial Imagination

no code implementations • 4 Mar 2019 • Ruixue Liu, Baoyang Chen, Xiaoyu Guo, Yan Dai, Meng Chen, Zhijie Qiu, Xiaodong He

Imagination is one of the most important factors that make an artistic painting unique and impressive.

Deploy Large-Scale Deep Neural Networks in Resource Constrained IoT Devices with Local Quantization Region

no code implementations • 24 May 2018 • Yi Yang, Andy Chen, Xiaoming Chen, Jiang Ji, Zhenyang Chen, Yan Dai

Implementing large-scale deep neural networks with high computational complexity on low-cost IoT devices may inevitably be constrained by limited computational resources, making it hard for the devices to respond in real time.

Quantization
