Search Results for author: Paul Weng

Found 29 papers, 9 papers with code

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

no code implementations • ICML 2020 • Umer Siddique, Paul Weng, Matthieu Zimmer

During this analysis, we notably derive a new result in the standard RL setting, which is of independent interest: it states a novel bound on the approximation error, with respect to the optimal average reward, of a policy that is optimal for the discounted reward.

Fairness Reinforcement Learning (RL)

Revisiting Data Augmentation in Deep Reinforcement Learning

1 code implementation • 19 Feb 2024 • Jianshu Hu, Yunpeng Jiang, Paul Weng

To tackle this question, we analyze existing methods to better understand them and to uncover how they are connected.

Data Augmentation reinforcement-learning

INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer

1 code implementation • 4 Feb 2024 • Han Fang, Zhihao Song, Paul Weng, Yutong Ban

Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems.

A Survey of Reinforcement Learning from Human Feedback

no code implementations • 22 Dec 2023 • Timo Kaufmann, Paul Weng, Viktor Bengs, Eyke Hüllermeier

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function.

reinforcement-learning Reinforcement Learning (RL)

Learning Rewards to Optimize Global Performance Metrics in Deep Reinforcement Learning

no code implementations • 16 Mar 2023 • Junqi Qian, Paul Weng, Chenmien Tan

LR4GPM alternates between two phases: (1) learning a (possibly vector) reward function used to fit the performance metric, and (2) training a policy to optimize an approximation of this performance metric based on the learned rewards.

Autonomous Driving reinforcement-learning +1

Neuro-Symbolic Hierarchical Rule Induction

no code implementations • 26 Dec 2021 • Claire Glanois, Xuening Feng, Zhaohui Jiang, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu

We propose an efficient interpretable neuro-symbolic model to solve Inductive Logic Programming (ILP) problems.

Inductive logic programming reinforcement-learning +1

A Survey on Interpretable Reinforcement Learning

no code implementations • 24 Dec 2021 • Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.

Autonomous Driving Decision Making +2

Generalization in Deep RL for TSP Problems via Equivariance and Local Search

no code implementations • 7 Oct 2021 • Wenbin Ouyang, Yisen Wang, Paul Weng, Shaochen Han

Since training on large instances is impractical, we design a novel deep RL approach with a focus on generalizability.

reinforcement-learning Reinforcement Learning (RL)

Improving Generalization of Deep Reinforcement Learning-based TSP Solvers

no code implementations • 6 Oct 2021 • Wenbin Ouyang, Yisen Wang, Shaochen Han, Zhejian Jin, Paul Weng

In this work, we propose a novel approach named MAGIC that includes a deep learning architecture and a DRL training method.

reinforcement-learning Reinforcement Learning (RL)

Learning Symbolic Rules for Interpretable Deep Reinforcement Learning

no code implementations • 15 Mar 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao

To address this challenge and improve the transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.

reinforcement-learning Reinforcement Learning (RL)

Safe Distributional Reinforcement Learning

no code implementations • 26 Feb 2021 • Jianyi Zhang, Paul Weng

Safety in reinforcement learning (RL) is a key property in both training and execution in many domains such as autonomous driving or finance.

Autonomous Driving Distributional Reinforcement Learning +2

Differentiable Logic Machines

no code implementations • 23 Feb 2021 • Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu

As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program.

Decision Making Inductive logic programming +1

Analytics and Machine Learning in Vehicle Routing Research

no code implementations • 19 Feb 2021 • Ruibin Bai, Xinan Chen, Zhi-Long Chen, Tianxiang Cui, Shuhui Gong, Wentao He, Xiaoping Jiang, Huan Jin, Jiahuan Jin, Graham Kendall, Jiawei Li, Zheng Lu, Jianfeng Ren, Paul Weng, Ning Xue, Huayan Zhang

The Vehicle Routing Problem (VRP) is one of the most intensively studied combinatorial optimisation problems for which numerous models and algorithms have been proposed.

BIG-bench Machine Learning

Interpretable Reinforcement Learning With Neural Symbolic Logic

no code implementations • 1 Jan 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Dong Li, Kun Shao, Wulong Liu, Hankz Hankui Zhuo, Jianye Hao

Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks.

Hierarchical Reinforcement Learning reinforcement-learning +2

Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

3 code implementations • 17 Dec 2020 • Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng

As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specifically designed for taking into account the two aspects of fairness.

Fairness Multi-agent Reinforcement Learning +2

Hyperparameter Auto-tuning in Self-Supervised Robotic Learning

2 code implementations • 16 Oct 2020 • Jiancong Huang, Juan Rojas, Matthieu Zimmer, Hongmin Wu, Yisheng Guan, Paul Weng

Insufficient learning (due to convergence to local optima) results in under-performing policies whilst redundant learning wastes time and resources.

Multi-Task Learning reinforcement-learning +1

Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards

1 code implementation • 18 Aug 2020 • Umer Siddique, Paul Weng, Matthieu Zimmer

Since learning with discounted rewards is generally easier, this discussion further justifies finding a fair policy for the average reward by learning a fair policy for the discounted reward.

Fairness reinforcement-learning +1

Reinforcement Learning

no code implementations • 29 May 2020 • Olivier Buffet, Olivier Pietquin, Paul Weng

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles.

Autonomous Vehicles Board Games +3

Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

1 code implementation • 19 Oct 2019 • Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng

Deep reinforcement learning (DRL) is a promising approach for adaptive robot control, but its application to robotics is currently hindered by high sample requirements.

Data Augmentation reinforcement-learning +1

Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning

1 code implementation • 24 Sep 2019 • Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Yisheng Guan, Juan Rojas, Paul Weng

Our work demonstrates that invariant transformations on RL trajectories are a promising methodology to speed up learning in deep RL.

Data Augmentation OpenAI Gym +2

Fairness in Reinforcement Learning

no code implementations • 24 Jul 2019 • Paul Weng

Decision support systems (e.g., for ecological conservation) and autonomous systems (e.g., adaptive controllers in smart cities) start to be deployed in real applications.

BIG-bench Machine Learning Fairness +2

Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains

1 code implementation • 10 Jun 2019 • Matthieu Zimmer, Paul Weng

In the context of learning deterministic policies in continuous domains, we revisit an approach, which was first proposed in Continuous Actor Critic Learning Automaton (CACLA) and later extended in Neural Fitted Actor Critic (NFAC).

Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems

1 code implementation • 25 Mar 2019 • Qitian Wu, Hengrui Zhang, Xiaofeng Gao, Peng He, Paul Weng, Han Gao, Guihai Chen

Social recommendation leverages social information to solve data sparsity and cold-start problems in traditional collaborative filtering methods.

Ranked #1 on Recommendation Systems on WeChat

Collaborative Filtering Graph Attention +1

Multi-objective Bandits: Optimizing the Generalized Gini Index

no code implementations • ICML 2017 • Robert Busa-Fekete, Balazs Szorenyi, Paul Weng, Shie Mannor

We study the multi-armed bandit (MAB) problem where the agent receives a vectorial feedback that encodes many possibly competing objectives to be optimized.

Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs

no code implementations • 3 Jan 2017 • Dajian Li, Paul Weng, Orkun Karabasoglu

We also present a case study of our algorithm on the Manhattan, NYC, transportation network.

From Preference-Based to Multiobjective Sequential Decision-Making

no code implementations • 3 Jan 2017 • Paul Weng

In this paper, we present a link between preference-based and multiobjective sequential decision-making.

Decision Making

Optimizing Quantiles in Preference-based Markov Decision Processes

no code implementations • 1 Dec 2016 • Hugo Gilbert, Paul Weng, Yan Xu

In the Markov decision process model, policies are usually evaluated by expected cumulative rewards.

Quantile Reinforcement Learning

no code implementations • 3 Nov 2016 • Hugo Gilbert, Paul Weng

In reinforcement learning, the standard criterion to evaluate policies in a state is the expectation of (discounted) sum of rewards.

reinforcement-learning Reinforcement Learning (RL)

Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes

no code implementations • 26 Sep 2013 • Patrice Perny, Paul Weng, Judy Goldsmith, Josiah Hanna

This paper is devoted to fair optimization in Multiobjective Markov Decision Processes (MOMDPs).
