no code implementations • 9 Jan 2024 • Qinyi Luo, Penghan Wang, Wei zhang, Fan Lai, Jiachen Mao, Xiaohan Wei, Jun Song, Wei-Yu Tsai, Shuai Yang, Yuxi Hu, Xuehai Qian
Huge embedding tables in modern Deep Learning Recommender Models (DLRM) require prohibitively large memory during training and inference.
no code implementations • 31 May 2023 • Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanović
We examine online safe multi-agent reinforcement learning using constrained Markov games in which agents compete by maximizing their expected total rewards under a constraint on expected total utilities.
Multi-agent Reinforcement Learning reinforcement-learning +1
no code implementations • 25 Jul 2022 • Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization in a non-stationary environment.
no code implementations • 11 Mar 2022 • Buyun Zhang, Liang Luo, Xi Liu, Jay Li, Zeliang Chen, Weilin Zhang, Xiaohan Wei, Yuchen Hao, Michael Tsang, Wenjun Wang, Yang Liu, Huayu Li, Yasmine Badr, Jongsoo Park, Jiyan Yang, Dheevatsa Mudigere, Ellie Wen
To overcome the challenge brought by DHEN's deeper and multi-layer structure in training, we propose a novel co-designed training system that can further improve the training efficiency of DHEN.
no code implementations • ICLR 2022 • Yan Li, Dhruv Choudhary, Xiaohan Wei, Baichuan Yuan, Bhargav Bhushanam, Tuo Zhao, Guanghui Lan
We show that incorporating frequency information of tokens in the embedding learning problems leads to provably efficient algorithms, and demonstrate that common adaptive algorithms implicitly exploit the frequency information to a large extent.
no code implementations • 26 May 2021 • Zhaoxia, Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie, Yang, Hector Yuen, Jianyu Huang, Daya Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Satish Nadathur, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy
We share in this paper our search strategies to adapt reference recommendation models to low-precision hardware, our optimization of low-precision compute kernels, and the design and development of tool chain so as to maintain our models' accuracy throughout their lifespan during which topic trends and users' interests inevitably evolve.
no code implementations • 18 Oct 2020 • Vipul Gupta, Dhruv Choudhary, Ping Tak Peter Tang, Xiaohan Wei, Xing Wang, Yuzhen Huang, Arun Kejariwal, Kannan Ramchandran, Michael W. Mahoney
This is done by identifying and updating only the most relevant neurons of the neural network for each training sample in the data.
no code implementations • 23 Aug 2020 • Shuang Qiu, Zhuoran Yang, Xiaohan Wei, Jieping Ye, Zhaoran Wang
Existing approaches for this problem are based on two-timescale or double-loop stochastic gradient algorithms, which may also require sampling large-batch data.
no code implementations • 22 Jun 2020 • Shuang Qiu, Xiaohan Wei, Mladen Kolar
We study online convex optimization with constraints consisting of multiple functional constraints and a relatively simple constraint set, such as a Euclidean ball.
no code implementations • NeurIPS 2020 • Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang
In particular, we prove that the proposed algorithm achieves $\widetilde{\mathcal{O}}(L|\mathcal{S}|\sqrt{|\mathcal{A}|T})$ upper bounds of both the regret and the constraint violation, where $L$ is the length of each episode.
no code implementations • 1 Mar 2020 • Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanović
To this end, we present an \underline{O}ptimistic \underline{P}rimal-\underline{D}ual Proximal Policy \underline{OP}timization (OPDOP) algorithm where the value function is estimated by combining the least-squares policy evaluation and an additional bonus term for safe exploration.
no code implementations • NeurIPS Workshop Deep_Invers 2019 • Shuang Qiu, Xiaohan Wei, Zhuoran Yang
In this paper, we consider a new framework for the one-bit sensing problem where the sparsity is implicitly enforced via mapping a low dimensional representation $x_0$ through a known $n$-layer ReLU generative network $G:\mathbb{R}^k\rightarrow\mathbb{R}^d$.
no code implementations • ICML 2020 • Shuang Qiu, Xiaohan Wei, Zhuoran Yang
Specifically, we consider a new framework for this problem where the sparsity is implicitly enforced via mapping a low dimensional representation $x_0 \in \mathbb{R}^k$ through a known $n$-layer ReLU generative network $G:\mathbb{R}^k\rightarrow\mathbb{R}^d$ such that $\theta_0 = G(x_0)$.
no code implementations • 7 Aug 2019 • Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanović
We study the policy evaluation problem in multi-agent reinforcement learning where a group of agents, with jointly observed states and private local actions and rewards, collaborate to learn the value function of a given policy via local computation and communication over a connected undirected network.
no code implementations • NeurIPS 2018 • Xiaohan Wei, Hao Yu, Qing Ling, Michael Neely
In this paper, we show that by leveraging a local error bound condition on the dual function, the proposed algorithm can achieve a better primal convergence time of $\mathcal{O}\l(\varepsilon^{-2/(2+\beta)}\log_2(\varepsilon^{-1})\r)$, where $\beta\in(0, 1]$ is a local error bound parameter.
no code implementations • NeurIPS 2017 • Hao Yu, Michael J. Neely, Xiaohan Wei
This paper considers online convex optimization (OCO) with stochastic constraints, which generalizes Zinkevich's OCO over a known simple fixed set by introducing multiple stochastic functional constraints that are i. i. d.