Search Results

FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy

3 code implementations · 1 Jul 2023

To address this, we propose the Federated Conditional Policy (FedCP) method, which generates a conditional policy for each sample to separate the global information and personalized information in its features and then processes them by a global head and a personalized head, respectively.

Personalized Federated Learning

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization

1 code implementation · 6 May 2020

We test the performance of our algorithm in a variety of environments where tasks may vary within each episode.

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

1 code implementation · NeurIPS 2020

Based on this, we propose Joint Policy Search (JPS) that iteratively improves joint policies of collaborative agents in imperfect information games, without re-evaluating the entire game.

Information Gathering in Decentralized POMDPs by Policy Graph Improvement

1 code implementation · 26 Feb 2019

Decentralized policies for information gathering are required when multiple autonomous agents are deployed to collect data about a phenomenon of interest without the ability to communicate.

Decision Making

Information-Transport-based Policy for Simultaneous Translation

1 code implementation · 22 Oct 2022

Simultaneous translation (ST) outputs translation while receiving the source inputs, and hence requires a policy to determine whether to translate a target token or wait for the next source token.

Machine Translation, Translation

Policy Optimization with Second-Order Advantage Information

1 code implementation · 9 May 2018

Policy optimization on high-dimensional continuous control tasks is difficult because of the large variance of the policy gradient estimators.

Continuous Control

Gradient Informed Proximal Policy Optimization

1 code implementation · NeurIPS 2023

We introduce a novel policy learning method that integrates analytical gradients from differentiable environments with the Proximal Policy Optimization (PPO) algorithm.

Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Translation

1 code implementation · 20 Oct 2022

In this paper, we propose a Wait-info Policy to balance source and target at the information level.

Machine Translation, Translation

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

1 code implementation · 23 Mar 2021

Progress in deep reinforcement learning (RL) research is largely enabled by benchmark task environments.

Continuous Control, OpenAI Gym (+2 more)