Search Results for author: Yingbin Liang

Found 99 papers, 12 papers with code

Transformers Provably Learn Feature-Position Correlations in Masked Image Modeling

no code implementations 4 Mar 2024 Yu Huang, Zixin Wen, Yuejie Chi, Yingbin Liang

Masked image modeling (MIM), which predicts randomly masked patches from unmasked ones, has emerged as a promising approach in self-supervised vision pretraining.

Position
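
The masked-prediction objective described above is easy to state in code. A minimal NumPy sketch, assuming a toy patch grid, a 75% mask ratio, and a stand-in linear predictor (none of which come from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(num_patches, mask_ratio=0.75):
    """Split patch indices into a masked set (to be predicted) and a visible set."""
    num_masked = int(mask_ratio * num_patches)
    perm = rng.permutation(num_patches)
    return perm[:num_masked], perm[num_masked:]

patches = rng.normal(size=(16, 48))                   # toy image: 16 patches, 48 dims each
masked_idx, visible_idx = random_mask(len(patches))

# Stand-in predictor: summarize the visible patches and map them through a random linear layer;
# a real MIM model would use a learned encoder/decoder here.
W = 0.1 * rng.normal(size=(48, 48))
pred = patches[visible_idx].mean(axis=0) @ W
loss = np.mean((patches[masked_idx] - pred) ** 2)     # reconstruction loss on masked patches only
print(f"masked patches: {len(masked_idx)}, reconstruction MSE: {loss:.3f}")
```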

Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization

1 code implementation 22 Feb 2024 Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang

Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.
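
The retention idea can be illustrated with a simple quantile filter over per-sample losses; the 50th-90th percentile band below is a hypothetical choice for illustration, not the paper's actual selection or reweighting rule:

```python
import numpy as np

def select_moderate_loss(per_sample_loss, low_q=0.5, high_q=0.9):
    """Keep samples whose loss falls in a 'moderately high' quantile band.

    The 50th-90th percentile band is a hypothetical choice: the intent is to drop
    easy samples and extreme outliers while retaining the informative middle-high band.
    """
    lo, hi = np.quantile(per_sample_loss, [low_q, high_q])
    keep = (per_sample_loss >= lo) & (per_sample_loss <= hi)
    return np.where(keep)[0]

losses = np.random.default_rng(1).lognormal(size=1000)   # stand-in per-sample losses
kept = select_moderate_loss(losses)
print(f"retained {kept.size} of {losses.size} samples for continual training")
```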

Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate

no code implementations 21 Feb 2024 Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff

In this paper, we establish the convergence guarantee for substantially larger classes of distributions under discrete-time diffusion models and further improve the convergence rate for distributions with bounded support.

Denoising

Sample Complexity Characterization for Linear Contextual MDPs

no code implementations 5 Feb 2024 Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang

Our result for the second model is the first-known result for such a type of function approximation models.

Rethinking PGD Attack: Is Sign Function Necessary?

1 code implementation 3 Dec 2023 Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang

Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
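
The contrast with standard PGD is visible in a single update: PGD steps along the sign of the gradient, while the raw gradient descent idea drops the sign function. A hedged sketch on a toy quadratic attack objective (the loss, step size, budget, and per-step projection are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
W, target = rng.normal(size=(5, 10)), rng.normal(size=5)
x0 = rng.normal(size=10)
eps, alpha, steps = 0.3, 0.05, 20

def attack_grad(x):
    """Gradient of a toy attack objective 0.5*||Wx - target||^2 that the attacker ascends."""
    return W.T @ (W @ x - target)

def project(x):
    """Keep the perturbation inside an L_inf ball of radius eps around x0."""
    return x0 + np.clip(x - x0, -eps, eps)

x_pgd, x_rgd = x0.copy(), x0.copy()
for _ in range(steps):
    x_pgd = project(x_pgd + alpha * np.sign(attack_grad(x_pgd)))  # classic PGD: step along the sign
    x_rgd = project(x_rgd + alpha * attack_grad(x_rgd))           # raw-gradient step: no sign function

print("objective after PGD steps:", 0.5 * np.sum((W @ x_pgd - target) ** 2))
print("objective after raw-gradient steps:", 0.5 * np.sum((W @ x_rgd - target) ** 2))
```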

Meta ControlNet: Enhancing Task Adaptation via Meta Learning

1 code implementation 3 Dec 2023 Junjie Yang, Jinze Zhao, Peihao Wang, Zhangyang Wang, Yingbin Liang

However, vanilla ControlNet generally requires extensive training of around 5000 steps to achieve a desirable control for a single task.

Edge Detection, Image Generation +1

Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes

no code implementations 20 Oct 2023 Ruiquan Huang, Yuan Cheng, Jing Yang, Vincent Tan, Yingbin Liang

To this end, we posit a joint model class for tasks and use the notion of $\eta$-bracketing number to quantify its complexity; this number also serves as a general metric to capture the similarity of tasks and thus determines the benefit of multi-task over single-task RL.

Decision Making, Multi-Task Learning +1

In-Context Convergence of Transformers

no code implementations 8 Oct 2023 Yu Huang, Yuan Cheng, Yingbin Liang

For data with balanced features, we establish the finite-time convergence guarantee with near-zero prediction error by navigating our analysis over two phases of the training dynamics of the attention map.

In-Context Learning

Model-Free Algorithm with Improved Sample Efficiency for Zero-Sum Markov Games

no code implementations 17 Aug 2023 Songtao Feng, Ming Yin, Yu-Xiang Wang, Jing Yang, Yingbin Liang

In this work, we propose a model-free stage-based Q-learning algorithm and show that it achieves the same sample complexity as the best model-based algorithm, and hence for the first time demonstrate that model-free algorithms can enjoy the same optimality in the $H$ dependence as model-based algorithms.

Multi-agent Reinforcement Learning, Q-Learning +1

Doubly Robust Instance-Reweighted Adversarial Training

no code implementations 1 Aug 2023 Daouda Sow, Sen Lin, Zhangyang Wang, Yingbin Liang

Experiments on standard classification datasets demonstrate that our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance, and at the same time improves the robustness against attacks on the weakest data points.

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations

no code implementations 1 Jul 2023 Ruiquan Huang, Yingbin Liang, Jing Yang

The general sequential decision-making problem, which includes Markov decision processes (MDPs) and partially observable MDPs (POMDPs) as special cases, aims at maximizing a cumulative reward by making a sequence of decisions based on a history of observations and actions over time.

Computational Efficiency, Decision Making

Theoretical Hardness and Tractability of POMDPs in RL with Partial Online State Information

no code implementations 14 Jun 2023 Ming Shi, Yingbin Liang, Ness Shroff

However, existing theoretical results have shown that learning in POMDPs is intractable in the worst case, where the main challenge lies in the lack of latent state information.

Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes

no code implementations 8 Jun 2023 Peizhong Ju, Sen Lin, Mark S. Squillante, Yingbin Liang, Ness B. Shroff

For example, when the total number of features in the source task's learning model is fixed, we show that it is more advantageous to allocate a greater number of redundant features to the task-specific part rather than the common part.

Transfer Learning

Non-stationary Reinforcement Learning under General Function Approximation

no code implementations 1 Jun 2023 Songtao Feng, Ming Yin, Ruiquan Huang, Yu-Xiang Wang, Jing Yang, Yingbin Liang

To the best of our knowledge, this is the first dynamic regret analysis in non-stationary MDPs with general function approximation.

reinforcement-learning, Reinforcement Learning (RL)

Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning

no code implementations 9 Apr 2023 Peizhong Ju, Yingbin Liang, Ness B. Shroff

However, due to the uniqueness of meta-learning such as task-specific gradient descent inner training and the diversity/fluctuation of the ground-truth signals among training tasks, we find new and interesting properties that do not exist in single-task linear regression.

Meta-Learning, regression

Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs

no code implementations 20 Mar 2023 Yuan Cheng, Ruiquan Huang, Jing Yang, Yingbin Liang

In this work, we first provide the first known sample complexity lower bound that holds for any algorithm under low-rank MDPs.

reinforcement-learning, Reinforcement Learning (RL) +1

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

1 code implementation 28 Feb 2023 Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang

This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same "task distribution".

Learning to Generalize Provably in Learning to Optimize

1 code implementation 22 Feb 2023 Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.

Theory on Forgetting and Generalization of Continual Learning

no code implementations 12 Feb 2023 Sen Lin, Peizhong Ju, Yingbin Liang, Ness Shroff

In particular, there is a lack of understanding on what factors are important and how they affect "catastrophic forgetting" and generalization performance.

Continual Learning

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

no code implementations 8 Feb 2023 Ming Shi, Yingbin Liang, Ness Shroff

In many applications of Reinforcement Learning (RL), it is critically important that the algorithm performs safely, such that instantaneous hard constraints are satisfied at each step, and unsafe states and actions are avoided.

reinforcement-learning, Reinforcement Learning (RL) +1

Near-Optimal Adversarial Reinforcement Learning with Switching Costs

no code implementations 8 Feb 2023 Ming Shi, Yingbin Liang, Ness Shroff

Our lower bound indicates that, due to the fundamental challenge of switching costs in adversarial RL, the best achieved regret (whose dependency on $T$ is $\tilde{O}(\sqrt{T})$) in static RL with switching costs (as well as adversarial RL without switching costs) is no longer achievable.

reinforcement-learning, Reinforcement Learning (RL)

Algorithm Design for Online Meta-Learning with Task Boundary Detection

no code implementations 2 Feb 2023 Daouda Sow, Sen Lin, Yingbin Liang, Junshan Zhang

More specifically, we first propose two simple but effective detection mechanisms for task switches and distribution shift, based on empirical observations, which serve as a key building block for more elegant online model updates in our algorithm: the task switch detection mechanism allows reuse of the best model available for the current task at hand, and the distribution shift detection mechanism differentiates the meta model update in order to preserve the knowledge for in-distribution tasks and quickly learn the new knowledge for out-of-distribution tasks.

Boundary Detection, Meta-Learning
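
As a rough illustration of a loss-based switch detector, the z-score rule below is a hypothetical stand-in, not the paper's detection mechanism:

```python
import numpy as np

def detect_task_switch(loss_history, window=20, z_thresh=3.0):
    """Flag a possible task switch when the newest loss is a large outlier
    relative to the recent window (a hypothetical z-score rule)."""
    if len(loss_history) <= window:
        return False
    recent = np.asarray(loss_history[-window - 1:-1])
    mu, sigma = recent.mean(), recent.std() + 1e-8
    return (loss_history[-1] - mu) / sigma > z_thresh

rng = np.random.default_rng(0)
losses = list(1.0 + 0.05 * rng.normal(size=50)) + [3.0]   # a sudden jump in the online loss
print(detect_task_switch(losses))   # True: treat it as a new task and switch/adapt the model
```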

Pruning Before Training May Improve Generalization, Provably

no code implementations 1 Jan 2023 Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang

It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero and the network exhibits good generalization performance.

Network Pruning

Convergence and Generalization of Wide Neural Networks with Large Bias

no code implementations 1 Jan 2023 Hongru Yang, Ziyu Jiang, Ruizhe Zhang, Zhangyang Wang, Yingbin Liang

This work studies training one-hidden-layer overparameterized ReLU networks via gradient descent in the neural tangent kernel (NTK) regime, where the networks' biases are initialized to some constant rather than zero.

Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator

no code implementations 18 Aug 2022 Xuyang Chen, Jingliang Duan, Yingbin Liang, Lin Zhao

To our knowledge, this is the first finite-time convergence analysis for the single sample two-timescale AC for solving LQR with global optimality.

Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL

no code implementations 28 Jun 2022 Ruiquan Huang, Jing Yang, Yingbin Liang

In particular, we consider the scenario where a safe baseline policy is known beforehand, and propose a unified Safe reWard-frEe ExploraTion (SWEET) framework.

Safe Exploration

Provable Generalization of Overparameterized Meta-learning Trained with SGD

no code implementations 18 Jun 2022 Yu Huang, Yingbin Liang, Longbo Huang

Despite the superior empirical success of deep meta-learning, theoretical understanding of overparameterized meta-learning is still limited.

Generalization Bounds, Meta-Learning

Provable Benefit of Multitask Representation Learning in Reinforcement Learning

no code implementations 13 Jun 2022 Yuan Cheng, Songtao Feng, Jing Yang, Hong Zhang, Yingbin Liang

To the best of our knowledge, this is the first theoretical study that characterizes the benefit of representation learning in exploration-based reward-free multitask RL for both upstream and downstream tasks.

Offline RL, reinforcement-learning +2

Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward

no code implementations 13 Jun 2022 Tengyu Xu, Yue Wang, Shaofeng Zou, Yingbin Liang

The remarkable success of reinforcement learning (RL) heavily relies on observing the reward of every visited state-action pair.

Offline RL, reinforcement-learning +1

Will Bilevel Optimizers Benefit from Loops

no code implementations 27 May 2022 Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying

Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.

Bilevel Optimization, Computational Efficiency

Data Sampling Affects the Complexity of Online SGD over Dependent Data

no code implementations 31 Mar 2022 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data.

Stochastic Optimization

A Primal-Dual Approach to Bilevel Optimization with Multiple Inner Minima

no code implementations 1 Mar 2022 Daouda Sow, Kaiyi Ji, Ziwei Guan, Yingbin Liang

Existing algorithms designed for such a problem are applicable only to restricted situations and do not come with a full guarantee of convergence.

Bilevel Optimization, Hyperparameter Optimization +2

Model-Based Offline Meta-Reinforcement Learning with Regularization

no code implementations ICLR 2022 Sen Lin, Jialin Wan, Tengyu Xu, Yingbin Liang, Junshan Zhang

In particular, we devise a new meta-Regularized model-based Actor-Critic (RAC) method for within-task policy optimization, as a key building block of MerPO, using conservative policy evaluation and regularized policy improvement; and the intrinsic tradeoff therein is achieved via striking the right balance between two regularizers, one based on the behavior policy and the other on the meta-policy.

Meta Reinforcement Learning, reinforcement-learning +2

Faster Non-asymptotic Convergence for Double Q-learning

no code implementations NeurIPS 2021 Lin Zhao, Huaqing Xiong, Yingbin Liang

This paper tackles the more challenging case of a constant learning rate, and develops new analytical tools that improve the existing convergence rate by orders of magnitude.

Q-Learning

Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

no code implementations 20 Oct 2021 Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

Despite the challenge of the nonconcave objective subject to nonconcave constraints, the proposed approach is shown to converge to the global optimum with a complexity of $\tilde{\mathcal O}(1/\epsilon)$ in terms of the optimality gap and the constraint violation, which improves the complexity of the existing primal-dual approach by a factor of $\mathcal O(1/\epsilon)$ (Ding et al., 2020; Paternain et al., 2019).

PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method

no code implementations ICLR 2022 Ziwei Guan, Tengyu Xu, Yingbin Liang

Although ETD has been shown to converge asymptotically to a desirable value function, it is well-known that ETD often encounters a large variance so that its sample complexity can increase exponentially fast with the number of iterations.

How to Improve Sample Complexity of SGD over Highly Dependent Data?

no code implementations 29 Sep 2021 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Specifically, with a $\phi$-mixing model that captures both exponential and polynomial decay of the data dependence over time, we show that SGD with periodic data-subsampling achieves an improved sample complexity over the standard SGD in the full spectrum of the $\phi$-mixing data dependence.

Stochastic Optimization
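
The periodic data-subsampling idea is simple to state in code: update only on every k-th sample of the dependent stream so that consecutive updates are less correlated. A sketch on a toy AR(1) stream with a scalar mean-estimation objective (both illustrative; the paper's contribution is the sample-complexity analysis, not this particular example):

```python
import numpy as np

rng = np.random.default_rng(0)

def dependent_stream(n, rho=0.9):
    """AR(1) stream: consecutive samples are highly correlated when rho is close to 1."""
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = rho * z[t - 1] + np.sqrt(1 - rho ** 2) * rng.normal()
    return z

def sgd_mean(stream, gap=1, lr=0.05):
    """SGD on 0.5*(x - z_t)^2 (true minimizer is the stream mean, 0), using every gap-th sample."""
    x = 5.0
    for t in range(0, len(stream), gap):
        x -= lr * (x - stream[t])
    return x

z = dependent_stream(20000)
print("SGD on every sample   :", sgd_mean(z, gap=1))
print("SGD on subsampled data:", sgd_mean(z, gap=20))  # fewer, but much less correlated, updates
```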

Generalizable Learning to Optimize into Wide Valleys

no code implementations 29 Sep 2021 Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.

ES-Based Jacobian Enables Faster Bilevel Optimization

no code implementations 29 Sep 2021 Daouda Sow, Kaiyi Ji, Yingbin Liang

Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.

Bilevel Optimization, Meta-Learning

A Unified Off-Policy Evaluation Approach for General Value Function

no code implementations 6 Jul 2021 Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang

We further show that unlike GTD, the learned GVFs by GenTD are guaranteed to converge to the ground truth GVFs as long as the function approximation power is sufficiently large.

Anomaly Detection, Off-policy evaluation

Provably Faster Algorithms for Bilevel Optimization

1 code implementation NeurIPS 2021 Junjie Yang, Kaiyi Ji, Yingbin Liang

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning.

Bilevel Optimization, Hyperparameter Optimization +1

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

no code implementations 23 Feb 2021 Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang

We also show that the overall convergence of DR-Off-PAC is doubly robust to the approximation errors that depend only on the expressive power of approximation functions.

Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry

no code implementations ICLR 2021 Ziyi Chen, Yi Zhou, Tengyu Xu, Yingbin Liang

By leveraging this Lyapunov function and the KŁ geometry that parameterizes the local geometries of general nonconvex functions, we formally establish the variable convergence of proximal-GDA to a critical point $x^*$, i.e., $x_t\to x^*, y_t\to y^*(x^*)$.
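
For a regularized minimax problem $\min_x \max_y f(x, y) + g(x)$, one proximal-GDA iteration couples a proximal gradient step on $x$ with a gradient ascent step on $y$. A sketch on a toy bilinear-strongly-concave objective with an $\ell_1$ regularizer (the objective, step size, and iteration count are illustrative, not the paper's setting):

```python
import numpy as np

rng = np.random.default_rng(0)
A = 0.3 * rng.normal(size=(5, 5))   # f(x, y) = y^T A x - 0.5*||y||^2, g(x) = lam*||x||_1

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x, y = rng.normal(size=5), rng.normal(size=5)
beta, lam = 0.1, 0.05
for _ in range(500):
    x = prox_l1(x - beta * (A.T @ y), beta * lam)  # proximal gradient descent step on x
    y = y + beta * (A @ x - y)                     # gradient ascent step on y

print("x:", np.round(x, 3))   # for this toy problem the solution is x* = 0, y* = 0
print("y:", np.round(y, 3))
```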

Lower Bounds and Accelerated Algorithms for Bilevel Optimization

no code implementations 7 Feb 2021 Kaiyi Ji, Yingbin Liang

Bilevel optimization has recently attracted growing interest due to its wide applications in modern machine learning problems.

Bilevel Optimization

Double Q-learning: New Analysis and Sharper Finite-time Bound

no code implementations 1 Jan 2021 Lin Zhao, Huaqing Xiong, Yingbin Liang, Wei Zhang

Double Q-learning (Hasselt 2010) has gained significant success in practice due to its effectiveness in overcoming the overestimation issue of Q-learning.

Q-Learning

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

no code implementations 11 Nov 2020 Tengyu Xu, Yingbin Liang, Guanghui Lan

To demonstrate the theoretical performance of CRPO, we adopt natural policy gradient (NPG) for each policy update step and show that CRPO achieves an $\mathcal{O}(1/\sqrt{T})$ convergence rate to the global optimal policy in the constrained policy set and an $\mathcal{O}(1/\sqrt{T})$ error bound on constraint satisfaction.

reinforcement-learning, Reinforcement Learning (RL) +1

Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms

no code implementations 10 Nov 2020 Tengyu Xu, Yingbin Liang

For linear TDC, we provide a novel non-asymptotic analysis and show that it attains an $\epsilon$-accurate solution with the optimal sample complexity of $\mathcal{O}(\epsilon^{-1}\log(1/\epsilon))$ under a constant stepsize.

reinforcement-learning, Reinforcement Learning (RL) +1

Bilevel Optimization: Convergence Analysis and Enhanced Design

2 code implementations 15 Oct 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization, Hyperparameter Optimization +1

Finite-Time Analysis for Double Q-learning

no code implementations NeurIPS 2020 Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

Although Q-learning is one of the most successful algorithms for finding the best action-value function (and thus the optimal policy) in reinforcement learning, its implementation often suffers from large overestimation of Q-function values incurred by random sampling.

Q-Learning
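
The double Q-learning update (Hasselt, 2010) analyzed in the two entries above keeps two estimators and decouples action selection from action evaluation. A tabular sketch of one update (the toy table sizes are arbitrary):

```python
import numpy as np

def double_q_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.99, rng=np.random):
    """One tabular double Q-learning step (Hasselt, 2010).

    With probability 1/2, update QA using QB to evaluate QA's greedy next action,
    otherwise update QB symmetrically; decoupling selection from evaluation
    mitigates the overestimation of standard Q-learning.
    """
    if rng.random() < 0.5:
        a_star = np.argmax(QA[s_next])
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])
    else:
        b_star = np.argmax(QB[s_next])
        QB[s, a] += alpha * (r + gamma * QA[s_next, b_star] - QB[s, a])

QA, QB = np.zeros((4, 2)), np.zeros((4, 2))      # toy table: 4 states, 2 actions
double_q_update(QA, QB, s=0, a=1, r=1.0, s_next=2)
print(QA[0, 1], QB[0, 1])                        # exactly one of the two tables was updated
```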

Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning

no code implementations 28 Sep 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous finite-time convergence analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization, Hyperparameter Optimization +1

Enhanced First and Zeroth Order Variance Reduced Algorithms for Min-Max Optimization

no code implementations 28 Sep 2020 Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor

Specifically, a novel variance reduction algorithm SREDA was proposed recently by (Luo et al. 2020) to solve such a problem, and was shown to achieve the optimal complexity dependence on the required accuracy level $\epsilon$.

A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis

no code implementations 28 Sep 2020 Tengyu Xu, Yingbin Liang, Guanghui Lan

To demonstrate the theoretical performance of CRPO, we adopt natural policy gradient (NPG) for each policy update step and show that CRPO achieves an $\mathcal{O}(1/\sqrt{T})$ convergence rate to the global optimal policy in the constrained policy set and an $\mathcal{O}(1/\sqrt{T})$ error bound on constraint satisfaction.

Safe Reinforcement Learning

Spectral Algorithms for Community Detection in Directed Networks

no code implementations 9 Aug 2020 Zhe Wang, Yingbin Liang, Pengsheng Ji

Community detection in large social networks is affected by degree heterogeneity of nodes.

Clustering, Community Detection

Momentum Q-learning with Finite-Sample Convergence Guarantee

no code implementations 30 Jul 2020 Bowen Weng, Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling.

Q-Learning

Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent

no code implementations 15 Jul 2020 Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang

In this paper, we first characterize the convergence rate for Q-AMSGrad, which is the Q-learning algorithm with AMSGrad update (a commonly adopted alternative of Adam for theoretical analysis).

Atari Games, Q-Learning

When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence

no code implementations 24 Jun 2020 Ziwei Guan, Tengyu Xu, Yingbin Liang

Generative adversarial imitation learning (GAIL) is a popular inverse reinforcement learning approach for jointly optimizing policy and reward from expert trajectories.

Imitation Learning

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

no code implementations NeurIPS 2020 Kaiyi Ji, Jason D. Lee, Yingbin Liang, H. Vincent Poor

Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational cost because it updates all model parameters over both the inner loop of task-specific adaptation and the outer-loop of meta initialization training.

Meta-Learning
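
A hedged sketch of the partial-adaptation idea: only the task-specific head is updated in the inner loop, while the shared representation is updated in the outer loop. The linear model, toy regression tasks, single inner step, and first-order outer gradient below are all simplifying assumptions, not the paper's algorithm or analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 4, 32                      # input dim, representation dim, samples per task
B = 0.1 * rng.normal(size=(d, k))        # shared representation, updated only in the outer loop

def sample_task():
    """Toy linear-regression task (illustrative generative model)."""
    w = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    return X, X @ w + 0.1 * rng.normal(size=n)

def grads(B, h, X, y):
    """Analytic gradients of 0.5*||X B h - y||^2 / n w.r.t. B and h."""
    r = X @ B @ h - y
    return X.T @ np.outer(r, h) / n, B.T @ X.T @ r / n

inner_lr, outer_lr = 0.1, 0.01
for _ in range(100):
    X, y = sample_task()
    h = np.zeros(k)                              # task-specific head, re-initialized per task
    h = h - inner_lr * grads(B, h, X, y)[1]      # inner loop: adapt only the head
    B = B - outer_lr * grads(B, h, X, y)[0]      # outer loop: first-order meta-update of B

X, y = sample_task()
h = -inner_lr * grads(B, np.zeros(k), X, y)[1]   # one-step head adaptation on a fresh task
print("post-adaptation loss:", 0.5 * np.mean((X @ B @ h - y) ** 2))
```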

Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence

no code implementations 16 Jun 2020 Tengyu Xu, Zhe Wang, Yingbin Liang, H. Vincent Poor

In this paper, we focus on such a gradient-free setting, and consider the nonconvex-strongly-concave minimax stochastic optimization problem.

Stochastic Optimization

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms

no code implementations 7 May 2020 Tengyu Xu, Zhe Wang, Yingbin Liang

In the first nested-loop design, one update of the actor's policy is followed by an entire loop of the critic's updates of the value function, and the finite-sample analysis of such AC and NAC algorithms has recently been well established.

Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms

no code implementations NeurIPS 2020 Tengyu Xu, Zhe Wang, Yingbin Liang

We show that the overall sample complexity for a mini-batch AC to attain an $\epsilon$-accurate stationary point improves the best known sample complexity of AC by an order of $\mathcal{O}(\epsilon^{-1}\log(1/\epsilon))$, and the overall sample complexity for a mini-batch NAC to attain an $\epsilon$-accurate globally optimal point improves the existing sample complexity of NAC by an order of $\mathcal{O}(\epsilon^{-1}/\log(1/\epsilon))$.

Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart for Nonconvex Optimization

no code implementations 26 Feb 2020 Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh

Our APG-restart is designed to 1) allow for adopting flexible parameter restart schemes that cover many existing ones; 2) have a global sub-linear convergence rate in nonconvex and nonsmooth optimization; and 3) have guaranteed convergence to a critical point and have various types of asymptotic convergence rates depending on the parameterization of local geometry in nonconvex and nonsmooth optimization.

Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning

2 code implementations 18 Feb 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness.

Meta-Learning

Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack

no code implementations 17 Feb 2020 Ziwei Guan, Kaiyi Ji, Donald J Bucci Jr, Timothy Y Hu, Joseph Palombo, Michael Liston, Yingbin Liang

This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks.

Adversarial Attack

Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling

no code implementations 15 Feb 2020 Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang

Despite the wide applications of Adam in reinforcement learning (RL), the theoretical convergence of Adam-type RL algorithms has not been established.

reinforcement-learning, Reinforcement Learning (RL)

Reanalysis of Variance Reduced Temporal Difference Learning

no code implementations ICLR 2020 Tengyu Xu, Zhe Wang, Yi Zhou, Yingbin Liang

Furthermore, the variance error (for both i.i.d. and Markovian sampling) and the bias error (for Markovian sampling) of VRTD are significantly reduced by the batch size of variance reduction in comparison to those of vanilla TD.

SpiderBoost and Momentum: Faster Variance Reduction Algorithms

no code implementations NeurIPS 2019 Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.

Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization

no code implementations 27 Oct 2019 Kaiyi Ji, Zhe Wang, Yi Zhou, Yingbin Liang

Two types of zeroth-order stochastic algorithms have recently been designed for nonconvex optimization respectively based on the first-order techniques SVRG and SARAH/SPIDER.

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

no code implementations ICML 2020 Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang

In this paper, we propose a novel scheme, which eliminates backtracking line search but still exploits the information along optimization path by adapting the batch size via history stochastic gradients.

Distributed SGD Generalizes Well Under Asynchrony

no code implementations 29 Sep 2019 Jayanth Regatti, Gaurav Tendolkar, Yi Zhou, Abhishek Gupta, Yingbin Liang

The performance of fully synchronized distributed systems has faced a bottleneck due to the big data trend, under which asynchronous distributed systems are gaining popularity due to their powerful scalability.

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples

no code implementations NeurIPS 2019 Tengyu Xu, Shaofeng Zou, Yingbin Liang

Gradient-based temporal difference (GTD) algorithms are widely used in off-policy learning scenarios.

Can AltQ Learn Faster: Experiments and Theory

no code implementations 25 Sep 2019 Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang

Differently from the popular Deep Q-Network (DQN) learning, Alternating Q-learning (AltQ) does not fully fit a target Q-function at each iteration, and is generally known to be unstable and inefficient.

Atari Games, Q-Learning

Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization

no code implementations 7 Feb 2019 Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh

In this paper, we develop novel momentum schemes with flexible coefficient settings to accelerate SPIDER for nonconvex and nonsmooth composite optimization, and show that the resulting algorithms achieve the near-optimal gradient oracle complexity for achieving a generalized first-order stationary condition.

SGD Converges to Global Minimum in Deep Learning via Star-convex Path

no code implementations ICLR 2019 Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh

Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks.

MR-GAN: Manifold Regularized Generative Adversarial Networks

no code implementations 22 Nov 2018 Qunwei Li, Bhavya Kailkhura, Rushil Anirudh, Yi Zhou, Yingbin Liang, Pramod Varshney

Despite the growing interest in generative adversarial networks (GANs), training GANs remains a challenging problem, both from a theoretical and a practical standpoint.

Minimax Estimation of Neural Net Distance

no code implementations NeurIPS 2018 Kaiyi Ji, Yingbin Liang

An important class of distance metrics proposed for training generative adversarial networks (GANs) is the integral probability metric (IPM), in which the neural net distance captures the practical GAN training via two neural networks.

SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms

1 code implementation 25 Oct 2018 Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.
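
The recursive SPIDER-style gradient estimator at the core of SpiderBoost can be sketched in a few lines: take constant-stepsize steps along an estimator that is refreshed with a full gradient every q iterations and otherwise updated with gradient differences. The finite-sum least-squares objective and all hyperparameters below are illustrative, and the momentum coupling from the paper is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 20
A, b = rng.normal(size=(n, d)), rng.normal(size=n)   # finite-sum least-squares problem

def grad(x, idx):
    """Average gradient of 0.5*(a_i^T x - b_i)^2 over the index set idx."""
    r = A[idx] @ x - b[idx]
    return A[idx].T @ r / len(idx)

x = np.zeros(d)
eta, q, batch = 0.05, 20, 10
v = grad(x, np.arange(n))                 # initialize with a full gradient
for k in range(400):
    x_next = x - eta * v                  # SpiderBoost: constant-stepsize move along the estimator
    if (k + 1) % q == 0:
        v = grad(x_next, np.arange(n))    # periodic full-gradient refresh
    else:
        S = rng.choice(n, size=batch, replace=False)
        v = grad(x_next, S) - grad(x, S) + v   # SPIDER-style recursive gradient estimator
    x = x_next

print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```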

Cubic Regularization with Momentum for Nonconvex Optimization

no code implementations 9 Oct 2018 Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

However, such a successful acceleration technique has not yet been proposed for second-order algorithms in nonconvex optimization. In this paper, we apply the momentum scheme to cubic regularized (CR) Newton's method and explore the potential for acceleration.

A Note on Inexact Condition for Cubic Regularized Newton's Method

no code implementations 22 Aug 2018 Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

This note considers the inexact cubic-regularized Newton's method (CR), which has been shown in Cartis et al. (2011) to achieve the same order-level convergence rate to a second-order stationary point as the exact CR (Nesterov and Polyak, 2006).

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

no code implementations NeurIPS 2018 Yi Zhou, Zhe Wang, Yingbin Liang

Cubic-regularized Newton's method (CR) is a popular algorithm that guarantees to produce a second-order stationary solution for solving nonconvex optimization problems.

K-medoids Clustering of Data Sequences with Composite Distributions

no code implementations 31 Jul 2018 Tiexing Wang, Qunwei Li, Donald J. Bucci, Yingbin Liang, Biao Chen, Pramod K. Varshney

In particular, the error exponent is characterized when either the Kolmogorov-Smirnov distance or the maximum mean discrepancy is used as the distance metric.

Clustering
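
A hedged sketch of clustering data sequences with a distribution-based distance, here the two-sample Kolmogorov-Smirnov statistic from SciPy inside a plain k-medoids loop; the sequence model, number of clusters, and initialization are illustrative choices, not the paper's setting:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Toy data: 20 sequences drawn from two different distributions.
seqs = [rng.normal(0.0, 1.0, 300) for _ in range(10)] + [rng.normal(1.5, 1.0, 300) for _ in range(10)]

n = len(seqs)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = ks_2samp(seqs[i], seqs[j]).statistic   # KS distance between sequences

def two_medoids(D, iters=20):
    """Plain k-medoids with k = 2 and a farthest-pair initialization."""
    i0, j0 = np.unravel_index(np.argmax(D), D.shape)
    medoids = [i0, j0]
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)      # assign each sequence to the nearer medoid
        for c in range(2):
            members = np.where(labels == c)[0]
            medoids[c] = members[np.argmin(D[np.ix_(members, members)].sum(axis=1))]
    return labels

print(two_medoids(D))   # the two groups of sequences separate cleanly
```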

When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?

1 code implementation ICLR 2019 Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang

We study the implicit bias of gradient descent methods in solving a binary classification problem over a linearly separable dataset.

Binary Classification

Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization

no code implementations 20 Feb 2018 Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

Cubic regularization (CR) is an optimization method with emerging popularity due to its capability to escape saddle points and converge to second-order stationary solutions for nonconvex optimization.

Generalization Error Bounds with Probabilistic Guarantee for SGD in Nonconvex Optimization

no code implementations 19 Feb 2018 Yi Zhou, Yingbin Liang, Huishuai Zhang

With strongly convex regularizers, we further establish the generalization error bounds for nonconvex loss functions under proximal SGD with high-probability guarantee, i.e., exponential concentration in probability.

Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy

no code implementations ICLR 2019 Haoyu Fu, Yuejie Chi, Yingbin Liang

We prove that with Gaussian inputs, the empirical risk based on cross entropy exhibits strong convexity and smoothness uniformly in a local neighborhood of the ground truth, as soon as the sample complexity is sufficiently large.

Critical Points of Linear Neural Networks: Analytical Forms and Landscape Properties

no code implementations ICLR 2018 Yi Zhou, Yingbin Liang

In this paper, we provide a necessary and sufficient characterization of the analytical forms for the critical points (as well as global minimizers) of the square loss functions for linear neural networks.

Critical Points of Neural Networks: Analytical Forms and Landscape Properties

no code implementations 30 Oct 2017 Yi Zhou, Yingbin Liang

We show that the analytical forms of the critical points characterize the values of the corresponding loss functions as well as the necessary and sufficient conditions to achieve global minimum.

Characterization of Gradient Dominance and Regularity Conditions for Neural Networks

no code implementations 18 Oct 2017 Yi Zhou, Yingbin Liang

The past decade has witnessed a successful application of deep learning to solving many challenging problems in machine learning and artificial intelligence.

Nonconvex Low-Rank Matrix Recovery with Arbitrary Outliers via Median-Truncated Gradient Descent

no code implementations 23 Sep 2017 Yuanxin Li, Yuejie Chi, Huishuai Zhang, Yingbin Liang

Recent work has demonstrated the effectiveness of gradient descent for directly recovering the factors of low-rank matrices from random linear measurements in a globally convergent manner when initialized properly.

Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization

no code implementations ICML 2017 Qunwei Li, Yi Zhou, Yingbin Liang, Pramod K. Varshney

Then, by exploiting the Kurdyka-Łojasiewicz (KŁ) property for a broad class of functions, we establish the linear and sub-linear convergence rates of the function value sequence generated by APGnc.

Reshaped Wirtinger Flow for Solving Quadratic System of Equations

no code implementations NeurIPS 2016 Huishuai Zhang, Yingbin Liang

In contrast to the smooth loss function used in WF, we adopt a nonsmooth but lower-order loss function, and design a gradient-like algorithm (referred to as reshaped-WF).
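
A hedged sketch of the lower-order loss and gradient-like update for the real-valued case $y_i = |a_i^\top x|$; the spectral-type initialization and step size below are simplified stand-ins for the paper's choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 20, 200
x_true = rng.normal(size=d)
A = rng.normal(size=(m, d))
y = np.abs(A @ x_true)                    # magnitude-only measurements

# Simple spectral-type initialization (a stand-in for the paper's initialization step).
Y = (A.T * y ** 2) @ A / m
z = np.linalg.eigh(Y)[1][:, -1] * np.sqrt(np.mean(y ** 2))

def rwf_step(z, mu=0.8):
    """Gradient-like update on the lower-order loss (1/2m) * sum_i (|a_i^T z| - y_i)^2."""
    az = A @ z
    return z - mu * (A.T @ (az - y * np.sign(az))) / m

for _ in range(300):
    z = rwf_step(z)

# Recovery is only possible up to a global sign.
err = min(np.linalg.norm(z - x_true), np.linalg.norm(z + x_true)) / np.linalg.norm(x_true)
print(f"relative recovery error: {err:.2e}")
```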

Reshaped Wirtinger Flow and Incremental Algorithm for Solving Quadratic System of Equations

1 code implementation 25 May 2016 Huishuai Zhang, Yi Zhou, Yingbin Liang, Yuejie Chi

We further develop the incremental (stochastic) reshaped Wirtinger flow (IRWF) and show that IRWF converges linearly to the true signal.

Retrieval

Nonparametric Detection of Geometric Structures over Networks

no code implementations 5 Apr 2016 Shaofeng Zou, Yingbin Liang, H. Vincent Poor

Sufficient conditions on minimum and maximum sizes of candidate anomalous intervals are characterized in order to guarantee the proposed test to be consistent.

Median-Truncated Nonconvex Approach for Phase Retrieval with Outliers

no code implementations 11 Mar 2016 Huishuai Zhang, Yuejie Chi, Yingbin Liang

This paper investigates the phase retrieval problem, which aims to recover a signal from the magnitudes of its linear measurements.

Retrieval

Analysis of Robust PCA via Local Incoherence

no code implementations NeurIPS 2015 Huishuai Zhang, Yi Zhou, Yingbin Liang

We investigate the robust PCA problem of decomposing an observed matrix into the sum of a low-rank matrix and a sparse error matrix via the convex program Principal Component Pursuit (PCP).

Nonparametric Detection of Anomalous Data Streams

no code implementations 25 Apr 2014 Shaofeng Zou, Yingbin Liang, H. Vincent Poor, Xinghua Shi

Each typical sequence contains i.i.d. samples drawn from a distribution p, whereas each anomalous sequence contains m i.i.d. samples drawn from a different distribution q.

Two-sample testing

A Kernel-Based Nonparametric Test for Anomaly Detection over Line Networks

no code implementations 1 Apr 2014 Shaofeng Zou, Yingbin Liang, H. Vincent Poor

If an anomalous interval does not exist, then all nodes receive samples generated by p. It is assumed that the distributions p and q are arbitrary and unknown.

Anomaly Detection

Sharp Threshold for Multivariate Multi-Response Linear Regression via Block Regularized Lasso

no code implementations 30 Jul 2013 Weiguang Wang, Yingbin Liang, Eric P. Xing

The goal is to recover the support union of all regression vectors using $l_1/l_2$-regularized Lasso.

regression
