no code implementations • ICML 2020 • Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
We highlight that the MF-FQI algorithm enjoys a "blessing of many agents" property, in the sense that a larger number of observed agents improves its performance.
Multi-agent Reinforcement Learning • reinforcement-learning
no code implementations • 29 Nov 2023 • Kumar Kshitij Patel, Lingxiao Wang, Aadirupa Saha, Nati Srebro
Furthermore, we delve into the more challenging setting of federated online optimization with bandit (zeroth-order) feedback, where the machines can only access values of the cost functions at the queried points.
no code implementations • 28 Nov 2023 • Minbiao Han, Kumar Kshitij Patel, Han Shao, Lingxiao Wang
Federated learning is a machine learning protocol that enables a large population of agents to collaborate over multiple rounds to produce a single consensus model.
no code implementations • 6 Nov 2023 • Lingxiao Wang, Gert Aarts, Kai Zhou
This study delves into the connection between machine learning and lattice field theory by linking generative diffusion models (DMs) with stochastic quantization, from a stochastic differential equation perspective.
no code implementations • 29 Sep 2023 • Lingxiao Wang, Gert Aarts, Kai Zhou
In this work, we establish a direct connection between generative diffusion models (DMs) and stochastic quantization (SQ).
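As a toy illustration of the stochastic-quantization side of this connection (our own zero-dimensional sketch, not the papers' lattice setting; the free-field action $S(\phi)=\phi^2/2$ and the name `langevin_sample` are assumptions), an Euler–Maruyama discretization of the Langevin equation drifts down the action gradient plus noise and equilibrates to $\exp(-S)$:

```python
import numpy as np

def langevin_sample(drift, eps, steps, n_chains, rng):
    """Euler-Maruyama discretization of the stochastic-quantization
    Langevin equation dphi = -dS/dphi dtau + dW, run as parallel chains."""
    phi = np.zeros(n_chains)
    for _ in range(steps):
        phi = phi - eps * drift(phi) + np.sqrt(2 * eps) * rng.normal(size=n_chains)
    return phi

# Free-field toy action S(phi) = phi^2 / 2, so the drift is phi and the
# stationary distribution exp(-S) is the unit Gaussian.
rng = np.random.default_rng(0)
samples = langevin_sample(lambda p: p, eps=0.01, steps=2000, n_chains=20000, rng=rng)
```

With the quadratic action the chain's stationary law is the unit Gaussian, which the sampled mean and variance reflect up to discretization bias of order eps.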
1 code implementation • 29 May 2023 • Haoran He, Chenjia Bai, Hang Lai, Lingxiao Wang, Weinan Zhang
In this paper, we propose a novel single-stage privileged knowledge distillation method called the Historical Information Bottleneck (HIB) to narrow the sim-to-real gap.
no code implementations • 17 Feb 2023 • Shuai Han, Lukas Stelz, Horst Stoecker, Lingxiao Wang, Kai Zhou
A physics-informed neural network (PINN) embedded with the susceptible-infected-removed (SIR) model is devised to understand the temporal evolution dynamics of infectious diseases.
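A minimal sketch of the physics constraint such a PINN enforces, assuming generic SIR parameters `beta` (transmission rate) and `gamma` (recovery rate) rather than the paper's fitted values: the ODE residual below is the quantity a physics-informed loss drives toward zero.

```python
import numpy as np

def sir_step(s, i, r, beta, gamma, dt):
    """One explicit-Euler step of the SIR equations:
    ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i, dr/dt = gamma*i."""
    ds = -beta * s * i
    di = beta * s * i - gamma * i
    dr = gamma * i
    return s + dt * ds, i + dt * di, r + dt * dr

def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    traj = [(s0, i0, r0)]
    for _ in range(steps):
        traj.append(sir_step(*traj[-1], beta, gamma, dt))
    return np.array(traj)

def physics_residual(traj, beta, gamma, dt):
    """Mean squared violation of the SIR ODEs along a trajectory,
    estimated with forward differences -- the quantity a PINN's
    physics loss drives toward zero."""
    s, i, r = traj[:, 0], traj[:, 1], traj[:, 2]
    ds, di, dr = np.diff(s) / dt, np.diff(i) / dt, np.diff(r) / dt
    res = ((ds + beta * s[:-1] * i[:-1]) ** 2
           + (di - beta * s[:-1] * i[:-1] + gamma * i[:-1]) ** 2
           + (dr - gamma * i[:-1]) ** 2)
    return float(res.mean())

traj = simulate_sir(0.99, 0.01, 0.0, beta=0.3, gamma=0.1, dt=0.1, steps=200)
```

Because the toy trajectory is itself generated by forward Euler, its forward-difference residual is zero; a PINN applies the same residual to a network's output and minimizes it alongside a data-fitting term. Note also that s + i + r is conserved along the dynamics.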
no code implementations • 9 Jan 2023 • Lingxiao Wang, Ping Li
We further extend our theory to generalized function approximation and identify conditions for reward randomization to attain provably efficient exploration.
no code implementations • 30 Dec 2022 • Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang
In particular, such a representation instantiates the posterior distribution of the latent variable given input tokens, which plays a central role in predicting output labels and solving downstream tasks.
1 code implementation • 29 Jul 2022 • Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang
Moreover, under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
no code implementations • 26 May 2022 • Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
For a class of POMDPs with a low-rank structure in the transition kernel, ETC attains an $O(1/\epsilon^2)$ sample complexity that scales polynomially with the horizon and the intrinsic dimension (that is, the rank).
1 code implementation • ICLR 2022 • Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang
We show that such OOD sampling and pessimistic bootstrapping yield a provable uncertainty quantifier in linear MDPs, thus providing the theoretical underpinning for PBRL.
no code implementations • 14 Feb 2022 • Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Lexing Huang, Qinji Yu, Sifan Song, Xinxing Xu, Yanyu Xu, Wensai Wang, Lingxiao Wang, Shuai Lu, Huiqi Li, Shihua Huang, Zhichao Lu, Chubin Ou, Xifei Wei, Bingyuan Liu, Riadh Kobbi, Xiaoying Tang, Li Lin, Qiang Zhou, Qiang Hu, Hrvoje Bogunovic, José Ignacio Orlando, Xiulan Zhang, Yanwu Xu
However, although numerous algorithms have been proposed for computer-aided diagnosis based on fundus images or OCT volumes, few methods leverage both modalities for glaucoma assessment.
1 code implementation • 28 Dec 2021 • Boxin Zhao, Lingxiao Wang, Mladen Kolar, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Chaochao Chen
As a result, client sampling plays an important role in FL systems as it affects the convergence rate of optimization algorithms used to train machine learning models.
no code implementations • 12 Dec 2021 • Lingxiao Wang, Shuzhe Shi, Kai Zhou
Reconstructing spectral functions from Euclidean Green's functions is an important inverse problem in physics.
1 code implementation • 29 Nov 2021 • Lingxiao Wang, Shuzhe Shi, Kai Zhou
Exploiting neural networks' implicit regularization as a non-local smoothness regulator of the spectral function, we represent spectral functions by neural networks and use the propagator's reconstruction error to optimize the network parameters in an unsupervised manner.
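A rough sketch of this unsupervised objective, with the neural network replaced by a hypothetical one-parameter Breit-Wigner family for brevity: candidates are scored only by how well their propagator, obtained through a discretized Källén-Lehmann kernel, reproduces the data, and no spectral-function labels are ever used.

```python
import numpy as np

def kl_propagator(rho, omega, p):
    """Discretized Kallen-Lehmann map D(p) = (1/pi) * int dw w*rho(w)/(w^2+p^2)."""
    dw = omega[1] - omega[0]
    return np.array([(omega * rho / (omega**2 + pk**2)).sum() * dw / np.pi
                     for pk in p])

def breit_wigner(omega, m, gamma):
    """A simple resonance-shaped spectral function."""
    return (4 * gamma * omega) / ((m**2 + gamma**2 - omega**2)**2
                                  + 4 * gamma**2 * omega**2)

omega = np.linspace(1e-3, 10.0, 2000)
p = np.linspace(0.0, 5.0, 20)

# "Ground-truth" spectral function and its mock propagator data.
rho_true = breit_wigner(omega, m=2.0, gamma=0.5)
d_obs = kl_propagator(rho_true, omega, p)

# Unsupervised fit: scan the one-parameter family and keep the candidate
# whose *propagator* best matches the data.
widths = np.linspace(0.1, 1.0, 19)
losses = [np.sum((kl_propagator(breit_wigner(omega, 2.0, g), omega, p) - d_obs)**2)
          for g in widths]
best = widths[int(np.argmin(losses))]
```

In the papers the grid scan is replaced by gradient descent on network parameters, but the loss, the distance between the reconstructed and observed propagator, is the same idea.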
1 code implementation • 24 Oct 2021 • Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang
Offline reinforcement learning (RL) harnesses the power of massive datasets for resolving sequential decision problems.
1 code implementation • NeurIPS 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang
Exploration methods based on pseudo-count of transitions or curiosity of dynamics have achieved promising results in solving reinforcement learning with sparse rewards.
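A count-based version of such an exploration bonus can be sketched in a few lines (a generic beta/sqrt(N(s)) bonus of our own, not the paper's pseudo-count or curiosity model):

```python
from collections import Counter

def count_bonus(counts: Counter, state, beta: float = 1.0) -> float:
    """Intrinsic reward beta / sqrt(N(s)): large for rarely visited states,
    vanishing as the visitation count grows."""
    n = counts[state]
    return beta / ((n + 1) ** 0.5)

visits = Counter()
rewards = []
for _ in range(4):
    rewards.append(count_bonus(visits, "s0"))
    visits["s0"] += 1
```

The bonus is added to the (possibly sparse) extrinsic reward, so repeated visits to the same state yield a strictly decreasing incentive.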
no code implementations • 14 Oct 2021 • Xiaoxia Wu, Lingxiao Wang, Irina Cristali, Quanquan Gu, Rebecca Willett
We propose an adaptive (stochastic) gradient perturbation method for differentially private empirical risk minimization.
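The generic gradient-perturbation template behind such methods can be sketched as follows; the adaptive calibration of the noise is the paper's contribution and is not reproduced here, so the fixed `clip`, `sigma`, and learning rate below are illustrative assumptions.

```python
import numpy as np

def noisy_gradient_step(w, grad, lr, clip, sigma, rng):
    """One DP-SGD-style update: clip the gradient to norm `clip`,
    add Gaussian noise scaled by sigma * clip, then take a step."""
    norm = np.linalg.norm(grad)
    g = grad * min(1.0, clip / max(norm, 1e-12))
    g = g + rng.normal(0.0, sigma * clip, size=g.shape)
    return w - lr * g

# Toy quadratic objective ||w||^2 with gradient 2w.
rng = np.random.default_rng(0)
w = np.array([5.0, -3.0])
for _ in range(300):
    w = noisy_gradient_step(w, 2 * w, lr=0.05, clip=1.0, sigma=0.1, rng=rng)
```

Clipping bounds each example's influence (the sensitivity), and the Gaussian noise is what yields the differential-privacy guarantee; the iterate still reaches a neighborhood of the optimum.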
no code implementations • 29 Sep 2021 • Yan Li, Lingxiao Wang, Jiachen Yang, Ethan Wang, Zhaoran Wang, Tuo Zhao, Hongyuan Zha
To exploit the permutation invariance therein, we propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
no code implementations • NAACL 2021 • Lingxiao Wang, Kevin Huang, Tengyu Ma, Quanquan Gu, Jing Huang
The core of our algorithm is to introduce a novel variance reduction term to the gradient estimation when performing the task adaptation.
no code implementations • 18 May 2021 • Yan Li, Lingxiao Wang, Jiachen Yang, Ethan Wang, Zhaoran Wang, Tuo Zhao, Hongyuan Zha
To exploit the permutation invariance therein, we propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
1 code implementation • 13 May 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).
no code implementations • 1 Jan 2021 • Chenjia Bai, Lingxiao Wang, Peng Liu, Zhaoran Wang, Jianye Hao, Yingnan Zhao
However, such an approach is challenging in developing practical exploration algorithms for Deep Reinforcement Learning (DRL).
no code implementations • 30 Nov 2020 • Lingxiao Wang, Tian Xu, Till Hannes Stoecker, Horst Stoecker, Yin Jiang, Kai Zhou
As the COVID-19 pandemic continues to ravage the world, it is of critical significance to provide timely, multi-level risk predictions of COVID-19.
no code implementations • 17 Oct 2020 • Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han
Efficient exploration remains a challenging problem in reinforcement learning, especially for tasks where extrinsic rewards from the environment are sparse or even entirely absent.
no code implementations • ICML 2020 • Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior, while the outer level searches for the optimal shared prior by optimizing its aggregated performance over all the subtasks.
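For quadratic subtask losses this bilevel structure can be worked out in closed form; the sketch below (a toy of ours, not the paper's setting) takes one inner gradient step per task and differentiates the post-adaptation loss through it.

```python
import numpy as np

def maml_step(theta, tasks, alpha, beta):
    """One MAML update for quadratic tasks L_i(t) = 0.5*||t - c_i||^2.
    Inner level: one gradient step per task; outer level: gradient of the
    post-adaptation loss through the inner step (here in closed form)."""
    outer_grad = np.zeros_like(theta)
    for c in tasks:
        adapted = theta - alpha * (theta - c)        # inner gradient step
        outer_grad += (1 - alpha) * (adapted - c)    # chain rule through it
    return theta - beta * outer_grad / len(tasks)

tasks = [np.array([1.0, 0.0]), np.array([3.0, 2.0]), np.array([2.0, 4.0])]
theta = np.zeros(2)
for _ in range(200):
    theta = maml_step(theta, tasks, alpha=0.5, beta=0.5)
```

For these quadratics the outer iteration contracts toward the mean of the task optima, here (2, 2), which makes the shared prior easy to verify.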
no code implementations • NeurIPS 2021 • Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
Empowered by expressive function approximators such as neural networks, deep reinforcement learning (DRL) achieves tremendous empirical successes.
no code implementations • 21 Jun 2020 • Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
We highlight that the MF-FQI algorithm enjoys a "blessing of many agents" property, in the sense that a larger number of observed agents improves its performance.
Multi-agent Reinforcement Learning • reinforcement-learning
1 code implementation • 21 May 2020 • Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer, Quanquan Gu, David Evans
Since previous inference attacks fail in the imbalanced-prior setting, we develop a new inference attack based on the intuition that inputs corresponding to training-set members will be near a local minimum of the loss function, and show that combining this with thresholds on the per-instance loss can achieve high PPV even in settings where other attacks appear to be ineffective.
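The loss-threshold component of the attack can be illustrated on toy numbers (the helper names and the losses below are made up for illustration; the paper additionally tests proximity to a local minimum):

```python
def membership_attack(per_example_loss, threshold):
    """Predict 'member' when the per-instance loss is below the threshold,
    reflecting the intuition that training points sit near a loss minimum."""
    return [loss < threshold for loss in per_example_loss]

def positive_predictive_value(preds, labels):
    """PPV = true positives / all positive predictions."""
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    return tp / max(tp + fp, 1)

# Toy losses: members (label True) tend to have lower loss than non-members.
losses = [0.02, 0.05, 0.01, 0.9, 1.2, 0.8, 0.03, 1.1]
labels = [True, True, True, False, False, False, True, False]
preds = membership_attack(losses, threshold=0.1)
ppv = positive_predictive_value(preds, labels)
```

On this separable toy data the threshold attack attains perfect PPV; the paper's point is that PPV, unlike accuracy, remains meaningful when the prior over membership is imbalanced.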
no code implementations • ICLR 2020 • Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu
Recent Transformer-based models such as Transformer-XL and BERT have achieved huge success on various natural language processing tasks.
no code implementations • NeurIPS 2019 • Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
Using the statistical query model to characterize the computational cost of an algorithm, we show that when $\mathrm{Cov}(Y, X^\top\beta^*)=0$ and $\mathrm{Cov}(Y,(X^\top\beta^*)^2)>0$, no computationally tractable algorithms can achieve the information-theoretic limit of the minimax risk.
no code implementations • 30 Oct 2019 • Lingxiao Wang, Bargav Jayaraman, David Evans, Quanquan Gu
While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains a challenge.
no code implementations • 13 Sep 2019 • Lingxiao Wang, Quanquan Gu
We study the problem of estimating high dimensional models with underlying sparse structures while preserving the privacy of each training example.
no code implementations • ICLR 2020 • Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
In detail, we prove that neural natural policy gradient converges to a globally optimal policy at a sublinear rate.
1 code implementation • NeurIPS 2018 • Bargav Jayaraman, Lingxiao Wang, David Evans, Quanquan Gu
We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting.
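Output perturbation, one of the two methods, can be sketched with the standard Gaussian-mechanism calibration $\sigma = \Delta\sqrt{2\ln(1.25/\delta)}/\epsilon$; the constants and helper name below are the textbook ones, not necessarily those of the paper's distributed variant.

```python
import numpy as np

def output_perturbation(w_opt, sensitivity, epsilon, delta, rng):
    """Gaussian-mechanism output perturbation: release the exact ERM
    minimizer plus noise calibrated to its L2 sensitivity and the budget."""
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return w_opt + rng.normal(0.0, sigma, size=w_opt.shape)

w_opt = np.array([0.4, -1.1, 2.3])   # pretend this is the exact minimizer
loose = output_perturbation(w_opt, sensitivity=0.01, epsilon=0.5,
                            delta=1e-5, rng=np.random.default_rng(0))
tight = output_perturbation(w_opt, sensitivity=0.01, epsilon=2.0,
                            delta=1e-5, rng=np.random.default_rng(0))
```

With the same noise seed, the released models differ from the minimizer by an amount exactly proportional to $1/\epsilon$, making the privacy-utility trade-off explicit; gradient perturbation instead injects noise into every optimization step.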
no code implementations • ICML 2018 • Xiao Zhang, Lingxiao Wang, Yaodong Yu, Quanquan Gu
We propose a primal-dual based framework for analyzing the global optimality of nonconvex low-rank matrix recovery.
no code implementations • ICML 2018 • Jinghui Chen, Pan Xu, Lingxiao Wang, Jian Ma, Quanquan Gu
We propose a nonconvex estimator for the covariate adjusted precision matrix estimation problem in the high dimensional regime, under sparsity constraints.
no code implementations • 20 Jun 2018 • Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu
We study the problem of learning one-hidden-layer neural networks with Rectified Linear Unit (ReLU) activation function, where the inputs are sampled from standard Gaussian distribution and the outputs are generated from a noisy teacher network.
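The data-generating process described here is easy to reproduce in a few lines; the dimensions and noise level below are illustrative choices of ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 5, 3, 1000                      # input dim, hidden width, samples

# Noisy teacher network y = v^T relu(W x) + noise, with Gaussian inputs.
w_true = rng.normal(size=(k, d))
v_true = rng.normal(size=k)
x = rng.normal(size=(n, d))               # inputs from standard Gaussian
noise = 0.1 * rng.normal(size=n)

hidden = np.maximum(x @ w_true.T, 0.0)    # (n, k) ReLU activations
y = hidden @ v_true + noise               # teacher outputs with label noise
```

The learning problem is then to recover (W, v) from the pairs (x, y), which is nonconvex because of the ReLU.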
no code implementations • 27 Nov 2017 • Lingxiao Wang, Ya-Li Li, Shengjin Wang
Comprehensive experiments demonstrate that our proposed method can handle various blur kernels and achieves state-of-the-art results for restoring small, blurry face images.
no code implementations • ICML 2017 • Rongda Zhu, Lingxiao Wang, ChengXiang Zhai, Quanquan Gu
We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments.
no code implementations • ICML 2017 • Lingxiao Wang, Quanquan Gu
In particular, we show that provided that the number of corrupted samples $n_2$ for each variable satisfies $n_2 \lesssim \sqrt{n}/\sqrt{\log d}$, where $n$ is the sample size and $d$ is the number of variables, the proposed robust precision matrix estimator attains the same statistical rate as the standard estimator for Gaussian graphical models.
no code implementations • ICML 2017 • Lingxiao Wang, Xiao Zhang, Quanquan Gu
We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery.
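The key estimator in any SVRG-style method replaces the plain stochastic gradient with grad_i(w) - grad_i(snapshot) + full_grad(snapshot). A least-squares toy (our own, simpler than the paper's nonconvex matrix-recovery setting) shows that both estimators are unbiased while the variance-reduced one has far smaller spread near the snapshot:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(200, 4))
w_star = np.array([1.0, -2.0, 0.5, 3.0])
b = a @ w_star

# Per-sample and full gradients of the loss 0.5 * (a_i . w - b_i)^2.
grad_i = lambda w, i: (a[i] @ w - b[i]) * a[i]
full_grad = lambda w: a.T @ (a @ w - b) / len(b)

snap = w_star + 0.5   # snapshot iterate, where the full gradient is cached
w = snap + 0.01       # current iterate, close to the snapshot

plain = np.array([grad_i(w, i) for i in range(len(b))])
reduced = np.array([grad_i(w, i) - grad_i(snap, i) + full_grad(snap)
                    for i in range(len(b))])
```

Both per-sample estimators average to the full gradient, but the variance-reduced one fluctuates only with the small displacement w - snap, which is what allows larger step sizes and faster convergence.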
no code implementations • 20 Apr 2017 • Jinghui Chen, Lingxiao Wang, Xiao Zhang, Quanquan Gu
We consider the robust phase retrieval problem of recovering the unknown signal from the magnitude-only measurements, where the measurements can be contaminated by both sparse arbitrary corruption and bounded random noise.
no code implementations • 21 Feb 2017 • Xiao Zhang, Lingxiao Wang, Quanquan Gu
We propose a unified framework to solve general low-rank plus sparse matrix recovery problems based on matrix factorization, which covers a broad family of objective functions satisfying the restricted strong convexity and smoothness conditions.
no code implementations • 9 Jan 2017 • Lingxiao Wang, Xiao Zhang, Quanquan Gu
We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery.
no code implementations • 2 Jan 2017 • Xiao Zhang, Lingxiao Wang, Quanquan Gu
In the noiseless setting, our algorithm is guaranteed to converge linearly to the unknown low-rank matrix and achieves exact recovery with optimal sample complexity.
no code implementations • WS 2016 • Ruslan Kalitvianski, Lingxiao Wang, Valérie Bellynck, Christian Boitet
This paper describes a corpus of nearly 10K French-Chinese aligned segments, produced by post-editing machine translated computer science courseware.
no code implementations • 17 Oct 2016 • Lingxiao Wang, Xiao Zhang, Quanquan Gu
In the general case with noisy observations, we show that our algorithm is guaranteed to linearly converge to the unknown low-rank matrix up to minimax optimal statistical error, provided an appropriate initial estimator.