Search Results for author: Yangchen Pan

Found 24 papers, 10 papers with code

An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models

no code implementations · 23 Apr 2024 · Yangchen Pan, Junfeng Wen, Chenjun Xiao, Philip Torr

In traditional statistical learning, data points are usually assumed to be independently and identically distributed (i.i.d.).

Image Classification, Reinforcement Learning (RL)

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

no code implementations · 17 Mar 2024 · Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart

Reinforcement learning algorithms that use policy gradients (PG) to optimize Conditional Value at Risk (CVaR) suffer from significant sample inefficiency, hindering their practical application.
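
For context, CVaR at level α is the expected return over the worst α-fraction of outcomes, so only tail samples carry gradient signal, which is one source of the inefficiency. A small empirical sketch:

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Empirical CVaR_alpha: mean of the worst alpha-fraction of returns."""
    returns = np.sort(np.asarray(returns))          # ascending: worst first
    k = max(1, int(np.ceil(alpha * len(returns))))  # size of the alpha-tail
    return returns[:k].mean()

rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=2.0, size=10_000)  # hypothetical returns
print(f"mean return: {samples.mean():.3f}")
print(f"CVaR_0.1:    {cvar(samples, alpha=0.1):.3f}")  # far below the mean
```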

Improving Adversarial Transferability via Model Alignment

no code implementations · 30 Nov 2023 · Avery Ma, Amir-Massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu

During the alignment process, the parameters of the source model are fine-tuned to minimize an alignment loss.
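
One plausible, distillation-style form of such a loss is sketched below; the KL objective and the `witness_logits` argument are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def alignment_loss(source_logits, witness_logits, temperature=1.0):
    """Hypothetical alignment loss: KL divergence between the softmax
    outputs of the source model and a second (witness) model."""
    p_witness = F.softmax(witness_logits / temperature, dim=-1)
    log_p_src = F.log_softmax(source_logits / temperature, dim=-1)
    return F.kl_div(log_p_src, p_witness, reduction="batchmean")
```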

Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods

1 code implementation · 13 Aug 2023 · Avery Ma, Yangchen Pan, Amir-Massoud Farahmand

In the context of deep learning, our experiments show that SGD-trained neural networks have smaller Lipschitz constants, which explains their better robustness to input perturbations compared with networks trained by adaptive gradient methods.
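
One illustrative way to probe this locally is the per-example input-gradient norm, a lower bound on the Lipschitz constant; the paper's exact measurement protocol may differ. A PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def input_grad_norms(model, x, y):
    """Per-example norm of the loss gradient w.r.t. the input: a local
    lower bound on the network's Lipschitz constant."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y, reduction="sum")  # sum keeps per-example grads
    (grad,) = torch.autograd.grad(loss, x)
    return grad.flatten(1).norm(dim=1)   # one value per example
```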

The In-Sample Softmax for Offline Reinforcement Learning

4 code implementations · 28 Feb 2023 · Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White

We highlight a simple fact: it is more straightforward to approximate an in-sample \emph{softmax} using only actions in the dataset.
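
A minimal sketch of the idea, ignoring the behavior-policy weighting and the actor-critic machinery of the full method:

```python
import numpy as np

def in_sample_softmax_value(q, actions_in_data, tau=1.0):
    """Log-sum-exp (softmax) value computed only over actions that appear
    in the dataset, rather than over the full action space."""
    q_in = q[list(actions_in_data)]
    return tau * np.log(np.mean(np.exp(q_in / tau)))

q = np.array([1.0, 5.0, 2.0, -1.0])        # Q-values for 4 actions
print(in_sample_softmax_value(q, {0, 2}))  # ignores unseen action 1
print(np.log(np.mean(np.exp(q))))          # full softmax, for contrast
```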

Offline RL, reinforcement-learning, +1

Label Alignment Regularization for Distribution Shift

no code implementations · 27 Nov 2022 · Ehsan Imani, Guojun Zhang, Runjia Li, Jun Luo, Pascal Poupart, Philip H. S. Torr, Yangchen Pan

Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix.
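
A quick way to measure the LAP on a dataset, as a sketch: project the label vector onto the top singular subspace of the data matrix and report the captured fraction of its norm.

```python
import numpy as np

def label_alignment(X, y, k):
    """Fraction of the label vector's norm lying in the span of the
    top-k left singular vectors of the data matrix X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    proj = U[:, :k] @ (U[:, :k].T @ y)
    return np.linalg.norm(proj) / np.linalg.norm(y)

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 50))  # rank-3 signal
X = Z + 0.1 * rng.normal(size=(500, 50))                  # plus noise
y = Z @ rng.normal(size=50)         # labels live in the signal subspace
print(label_alignment(X, y, k=3))   # close to 1: the LAP holds here
```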

Representation Learning, Sentiment Analysis, +1

Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation

1 code implementation · 22 May 2022 · Qingfeng Lan, Yangchen Pan, Jun Luo, A. Rupam Mahmood

The experience replay buffer, a standard component in deep reinforcement learning, is often used to reduce forgetting and improve sample efficiency by storing experiences in a large buffer and using them for training later.
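
The component being replaced is easy to state; a minimal uniform-sampling buffer looks like the sketch below (the paper's method consolidates knowledge into the value network so that a much smaller buffer suffices):

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay: store transitions, sample uniformly."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # old experiences fall out

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)
```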

reinforcement-learning, Reinforcement Learning (RL)

STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence

no code implementations · 24 Jan 2022 · Liangliang Xu, Daoming Lyu, Yangchen Pan, Aiwen Jiang, Bo Liu

This paper proposes Short-Term VOlatility-controlled Policy Search (STOPS), a novel algorithm that solves risk-averse problems by learning from short-term trajectories instead of long-term trajectories.

An Alternate Policy Gradient Estimator for Softmax Policies

1 code implementation · 22 Dec 2021 · Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, A. Rupam Mahmood

Policy gradient (PG) estimators are ineffective in dealing with softmax policies that are sub-optimally saturated, which refers to the situation when the policy concentrates its probability mass on sub-optimal actions.
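
A small numerical demonstration of the saturation problem with the vanilla softmax PG estimator (the paper's alternate estimator is not reproduced here):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# A policy saturated on a sub-optimal action (action 0).
theta = np.array([10.0, 0.0, 0.0])
pi = softmax(theta)

# Vanilla PG direction for the logits when action a is sampled with
# return G: grad = G * (one_hot(a) - pi). With pi(0) ~ 1, almost every
# sample picks action 0 and (one_hot(0) - pi) ~ 0, so updates vanish.
a, G = 0, 1.0
one_hot = np.eye(3)[a]
print(G * (one_hot - pi))   # nearly zero gradient: saturation
```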

Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities

1 code implementation · 28 Sep 2020 · Jincheng Mei, Yangchen Pan, Martha White, Amir-Massoud Farahmand, Hengshuai Yao

The prioritized Experience Replay (ER) method has attracted great attention; however, there is little theoretical understanding of such prioritization strategies and why they help.

Understanding and Mitigating the Limitations of Prioritized Experience Replay

2 code implementations · 19 Jul 2020 · Yangchen Pan, Jincheng Mei, Amir-Massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo

Prioritized Experience Replay (ER) has been empirically shown to improve sample efficiency across many domains and has attracted great attention; however, there is little theoretical understanding of why such prioritized sampling helps and of its limitations.
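
For reference, the standard PER sampling rule that the analysis concerns: transitions are drawn with probability proportional to a power of their absolute TD error.

```python
import numpy as np

def prioritized_sample(td_errors, batch_size, alpha=0.6, rng=None):
    """Sample transition indices with probability proportional to
    |TD error|^alpha, as in standard prioritized experience replay."""
    rng = rng or np.random.default_rng()
    p = np.abs(td_errors) ** alpha + 1e-6   # small constant: nonzero mass
    p = p / p.sum()
    return rng.choice(len(td_errors), size=batch_size, p=p)
```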

Autonomous Driving, Continuous Control, +1

Maxmin Q-learning: Controlling the Estimation Bias of Q-learning

1 code implementation · ICLR 2020 · Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White

Q-learning suffers from overestimation bias, because it approximates the maximum action value using the maximum estimated action value.
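
A stateless numerical illustration of both the bias and the Maxmin fix (take a min over an ensemble of estimates before the max); simplified from the full algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
true_q = np.zeros(10)     # all 10 actions are equally good: max is 0
n_ensemble = 4

# Noisy estimates: the max over them systematically overestimates 0.
estimates = true_q + rng.normal(scale=1.0, size=(n_ensemble, 10))

single_max = estimates[0].max()       # standard Q-learning target
maxmin = estimates.min(axis=0).max()  # Maxmin: min over ensemble, then max
print(f"true max: 0.0, single max: {single_max:.2f}, maxmin: {maxmin:.2f}")
```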

Q-Learning

An implicit function learning approach for parametric modal regression

no code implementations · NeurIPS 2020 · Yangchen Pan, Ehsan Imani, Martha White, Amir-Massoud Farahmand

We empirically demonstrate on several synthetic problems that our method (i) can learn multi-valued functions and produce the conditional modes, (ii) scales well to high-dimensional inputs, and (iii) can even be more effective for certain uni-modal problems, particularly for high-frequency functions.

regression

Frequency-based Search-control in Dyna

no code implementations · ICLR 2020 · Yangchen Pan, Jincheng Mei, Amir-Massoud Farahmand

This suggests a search-control strategy: we should use states from high frequency regions of the value function to query the model to acquire more samples.

Model-based Reinforcement Learning

Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online

1 code implementation · ICLR 2021 · Yangchen Pan, Kirby Banman, Martha White

Recent work has shown that sparse representations -- where only a small percentage of units are active -- can significantly reduce interference.
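
A rough sketch of a fuzzy-binning activation in the spirit of FTA; the parameter names and exact form here are simplifications of the paper's definition.

```python
import numpy as np

def fta(z, low=-2.0, high=2.0, n_bins=8, eta=0.1):
    """Map a scalar to a sparse vector of (fuzzy) bin memberships."""
    delta = (high - low) / n_bins
    c = low + delta * np.arange(n_bins)          # bin left edges
    d = np.maximum(c - z, 0.0) + np.maximum(z - delta - c, 0.0)
    fuzzy = np.where(d <= eta, d / eta, 1.0)     # fuzzy indicator I_eta
    return 1.0 - fuzzy                           # ~1 inside the bin, 0 far away

print(fta(0.3))   # mostly zeros: a sparse representation
```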

Continual Learning, Continuous Control, +2

Hill Climbing on Value Estimates for Search-control in Dyna

no code implementations · 18 Jun 2019 · Yangchen Pan, Hengshuai Yao, Amir-Massoud Farahmand, Martha White

In this work, we propose to generate such states by using the trajectory obtained from Hill Climbing (HC) on the current estimate of the value function.
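
A minimal sketch of the search-control idea, using finite-difference gradient ascent on a hypothetical value estimate (the paper's HC procedure is more elaborate):

```python
import numpy as np

def hill_climb_states(value_fn, s0, n_steps=10, step_size=0.05, eps=1e-4):
    """Generate search-control states by numerical gradient ascent on a
    learned value estimate, starting from state s0."""
    states, s = [], np.array(s0, dtype=float)
    for _ in range(n_steps):
        grad = np.array([
            (value_fn(s + eps * e) - value_fn(s - eps * e)) / (2 * eps)
            for e in np.eye(len(s))
        ])
        s = s + step_size * grad    # move toward higher estimated value
        states.append(s.copy())
    return states

# Hypothetical smooth value estimate over a 2-D state space.
v = lambda s: -np.sum((s - np.array([1.0, 1.0])) ** 2)
print(hill_climb_states(v, s0=[0.0, 0.0])[-1])   # climbs toward (1, 1)
```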

Model-based Reinforcement Learning, Reinforcement Learning (RL)

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement

1 code implementation · 22 Oct 2018 · Samuel Neumann, Sungsu Lim, Ajin Joseph, Yangchen Pan, Adam White, Martha White

We first provide a policy improvement result in an idealized setting, and then prove that our conditional CEM (CCEM) strategy tracks a CEM update per state, even with changing action-values.
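
For context, one standard CEM update over a Gaussian action distribution, which CCEM applies per state; this is a generic sketch, not the paper's exact estimator:

```python
import numpy as np

def cem_step(mu, sigma, score_fn, n_samples=64, elite_frac=0.1, rng=None):
    """One cross-entropy method update: sample actions, keep the elite
    fraction by score, refit the Gaussian's mean and std."""
    rng = rng or np.random.default_rng()
    actions = rng.normal(mu, sigma, size=(n_samples, len(mu)))
    scores = np.array([score_fn(a) for a in actions])
    elite = actions[np.argsort(scores)[-int(elite_frac * n_samples):]]
    return elite.mean(axis=0), elite.std(axis=0) + 1e-3  # keep std positive
```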

Policy Gradient Methods, Q-Learning

Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains

no code implementations · 12 Jun 2018 · Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White

We show that a model, as opposed to a replay buffer, is particularly useful for specifying which states to sample from during planning, such as predecessor states that propagate information in reverse from a state more quickly.

Effective sketching methods for value function approximation

no code implementations · 3 Aug 2017 · Yangchen Pan, Erfan Sadeqi Azer, Martha White

As a remedy, we demonstrate how to use sketching more sparingly, with only a left-sided sketch, which can still enable significant computational gains and the use of matrix-based learning algorithms that are less sensitive to parameters.
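
The flavor of a left-sided sketch, shown on a generic least-squares problem (a sketch-and-solve illustration, not the paper's TD-specific algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 10_000, 50, 400            # samples, features, sketch size
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Left-sided sketch: a random projection S applied to the rows of A and b,
# so the least-squares solve works on an m x d system instead of n x d.
S = rng.normal(size=(m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_sketch - x_full))   # small: close to the full solve
```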

Reinforcement Learning (RL)

Adapting Kernel Representations Online Using Submodular Maximization

no code implementations · ICML 2017 · Matthew Schlegel, Yangchen Pan, Jiecao Chen, Martha White

In this work, we develop an approximately submodular criterion for this setting, and an efficient online greedy submodular maximization algorithm for optimizing the criterion.
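
The greedy principle being referenced, in generic form; the paper's contribution is the approximately submodular criterion and its online variant, and `gain_fn` here is a hypothetical marginal-gain oracle.

```python
import numpy as np

def greedy_select(candidates, gain_fn, budget):
    """Greedy maximization of a (sub)modular set function: repeatedly add
    the element with the largest marginal gain over the chosen set."""
    chosen = []
    remaining = list(candidates)
    for _ in range(budget):
        gains = [gain_fn(chosen, c) for c in remaining]
        chosen.append(remaining.pop(int(np.argmax(gains))))
    return chosen
```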

Continual Learning

Accelerated Gradient Temporal Difference Learning

no code implementations · 28 Nov 2016 · Yangchen Pan, Adam White, Martha White

The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least-squares methods.
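
The two ends of that spectrum, in linear function approximation:

```python
import numpy as np

def td0_update(w, phi, r, phi_next, alpha=0.01, gamma=0.99):
    """Linear TD(0): one cheap O(d) stochastic update."""
    delta = r + gamma * phi_next @ w - phi @ w   # TD error
    return w + alpha * delta * phi

def lstd(transitions, gamma=0.99):
    """LSTD: data-efficient batch solve of A w = b, at O(d^2) per sample.
    Each transition is a (phi, r, phi_next) feature tuple."""
    d = len(transitions[0][0])
    A, b = np.zeros((d, d)), np.zeros(d)
    for phi, r, phi_next in transitions:
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A + 1e-6 * np.eye(d), b)  # tiny ridge for stability
```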

Incremental Truncated LSTD

no code implementations · 26 Nov 2015 · Clement Gehring, Yangchen Pan, Martha White

Balancing between computational efficiency and sample efficiency is an important goal in reinforcement learning.

Computational Efficiency
