OpenAI Gym
163 papers with code • 9 benchmarks • 3 datasets
An open-source toolkit from OpenAI that implements several reinforcement learning benchmark suites, including classic control, Atari, robotics, and MuJoCo tasks.
(Description by Evolutionary learning of interpretable decision trees)
Libraries
Use these libraries to find OpenAI Gym models and implementations.

Latest papers with no code
On Combining Expert Demonstrations in Imitation Learning via Optimal Transport
One of the key approaches to imitation learning (IL) is to define a distance between the agent and the expert and to find an agent policy that minimizes that distance.
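For intuition, such a distance can be as simple as the 1-D Wasserstein (optimal transport) distance between empirical state distributions, which for equal-size samples reduces to the mean absolute difference of the sorted samples. A minimal sketch with hypothetical Gaussian state samples standing in for expert and agent rollouts (the samples and their parameters are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 1-D state samples from expert and agent rollouts.
expert_states = rng.normal(loc=0.0, scale=1.0, size=500)
agent_states = rng.normal(loc=0.5, scale=1.0, size=500)

# Empirical Wasserstein-1 distance for equal-size 1-D samples:
# mean absolute difference of the sorted samples.
d = np.mean(np.abs(np.sort(expert_states) - np.sort(agent_states)))
```

An imitation-learning objective would then adjust the agent's policy to drive `d` toward zero; practical methods generalize this idea to multi-dimensional state-action distributions.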
Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing
Our research demonstrates that to achieve $\epsilon$-optimal policies for all $M$ tasks, a single agent using DistMT-LSVI needs to run a total number of episodes that is at most $\tilde{\mathcal{O}}({d^3H^6(\epsilon^{-2}+c_{\rm sep}^{-2})}\cdot M/N)$, where $c_{\rm sep}>0$ is a constant representing task separability, $H$ is the horizon of each episode, and $d$ is the feature dimension of the dynamics and rewards.
Learning Environment Models with Continuous Stochastic Dynamics
We aim to provide insights into the decisions faced by the agent by learning an automaton model of the environment's behavior under the agent's control.
Correcting discount-factor mismatch in on-policy policy gradient methods
The policy gradient theorem gives a convenient form of the policy gradient in terms of three factors: an action value, a gradient of the action likelihood, and a state distribution involving discounting called the \emph{discounted stationary distribution}.
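The three factors named above appear explicitly in the standard statement of the policy gradient theorem, which for a policy $\pi_\theta$ can be written as:

```latex
\nabla_\theta J(\theta)
  = \sum_{s} d^{\pi}_{\gamma}(s)
    \sum_{a} q_{\pi}(s, a)\,
    \nabla_\theta \pi(a \mid s, \theta)
```

where $q_\pi(s,a)$ is the action value, $\nabla_\theta \pi(a \mid s, \theta)$ is the gradient of the action likelihood, and $d^{\pi}_{\gamma}$ is the discounted stationary distribution; the mismatch the paper addresses arises when on-policy methods sample states from the undiscounted visitation distribution instead of $d^{\pi}_{\gamma}$.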
Deep Reinforcement Learning for ESG financial portfolio management
This paper investigates the application of Deep Reinforcement Learning (DRL) for Environment, Social, and Governance (ESG) financial portfolio management, with a specific focus on the potential benefits of ESG score-based market regulation.
Mimicking Better by Matching the Approximate Action Distribution
In this paper, we introduce MAAD, a novel, sample-efficient on-policy algorithm for Imitation Learning from Observations.
Active Inference in Hebbian Learning Networks
This work studies how brain-inspired neural ensembles equipped with local Hebbian plasticity can perform active inference (AIF) in order to control dynamical agents.
Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning
In this study, we tackle this challenge by introducing an off-policy inverse multi-agent reinforcement learning algorithm (IMARL).
Rethinking Population-assisted Off-policy Reinforcement Learning
In this paper, we first analyze the use of off-policy RL algorithms in combination with population-based algorithms, showing that the use of population data could introduce an overlooked error and harm performance.
Gym-preCICE: Reinforcement Learning Environments for Active Flow Control
Active flow control (AFC) involves manipulating fluid flow over time to achieve a desired performance or efficiency.