Efficient Exploration
145 papers with code • 0 benchmarks • 2 datasets
Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gaining information about poorly understood states and actions.
Source: Randomized Value Functions via Multiplicative Normalizing Flows
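As a rough illustration of the trade-off described above, the following minimal sketch runs the classic UCB1 rule on a toy Bernoulli bandit. It is not the method of the cited paper. The function name `ucb1`, the arm means, and the step count are assumptions chosen for the example. Each arm's score is its estimated mean (exploitation) plus a bonus that shrinks as the arm is pulled more often (information gain).

```python
import math
import random


def ucb1(arm_means, steps, seed=0):
    """Run UCB1 on a Bernoulli bandit and return the pull count per arm.

    arm_means: hypothetical true success probabilities, hidden from the agent.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of pulls per arm
    totals = [0.0] * n_arms    # summed reward per arm

    for t in range(1, steps + 1):
        if t <= n_arms:
            # Pull every arm once so each estimate is defined.
            arm = t - 1
        else:
            # Estimated mean plus an exploration bonus that is large
            # for arms that have been tried only a few times.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


if __name__ == "__main__":
    print(ucb1([0.1, 0.9], steps=2000))
```

In this toy setting the weaker arm still receives some pulls, because its bonus grows slowly while it is ignored, but most pulls go to the arm with the higher estimated mean.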
Latest papers with no code
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization
First, we propose a novel confidence set that is 'semi-adaptive' to the unknown sub-Gaussian parameter $\sigma_*^2$ in the sense that the (normalized) confidence width scales with $\sqrt{d\sigma_*^2 + \sigma_0^2}$ where $d$ is the dimension and $\sigma_0^2$ is the specified sub-Gaussian parameter (known) that can be much larger than $\sigma_*^2$.
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following
Diffusion-ES samples trajectories during evolutionary search from a diffusion model and scores them using a black-box reward function.
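The sample-and-score loop described above can be sketched in a few lines. This is only an assumption-laden toy, not the Diffusion-ES implementation: the diffusion model is replaced by plain Gaussian perturbation of elite candidates, a "trajectory" is just a list of floats, and `evolutionary_search`, `reward_fn`, and all parameter values are hypothetical names and settings chosen for illustration. The point it shows is that the search only ever calls the reward function and never needs its gradient.

```python
import random


def evolutionary_search(reward_fn, dim, population=32, elites=8,
                        iterations=30, noise=0.3, seed=0):
    """Gradient-free search: score candidates with a black-box reward,
    keep the best ones, and resample new candidates around them."""
    rng = random.Random(seed)
    # Initial candidates; a generative model would supply these in practice.
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
           for _ in range(population)]

    for _ in range(iterations):
        # Score every candidate using only reward evaluations.
        ranked = sorted(pop, key=reward_fn, reverse=True)
        best = ranked[:elites]
        # Keep the elites and refill the population with noisy copies.
        pop = list(best)
        while len(pop) < population:
            parent = rng.choice(best)
            pop.append([x + rng.gauss(0.0, noise) for x in parent])

    return max(pop, key=reward_fn)


if __name__ == "__main__":
    # Hypothetical black-box reward: highest when every entry equals 1.
    def reward(candidate):
        return -sum((x - 1.0) ** 2 for x in candidate)

    print(evolutionary_search(reward, dim=4))
```

Because the elites are carried over unchanged, the best score in the population never decreases from one iteration to the next.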
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments
Additionally, TopoNav incorporates intrinsic motivation to guide exploration toward relevant regions and frontier nodes in the topological map, addressing the challenges of sparse extrinsic rewards.
Efficient Exploration for LLMs
We present evidence of substantial benefit from efficient exploration in gathering human feedback to improve large language models.
Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
Therefore, we propose Scheduled Curiosity-Deep Dyna-Q (SC-DDQ), a curiosity-driven curriculum learning framework based on a state-of-the-art model-based reinforcement learning dialog model, Deep Dyna-Q (DDQ).
FIT-SLAM -- Fisher Information and Traversability estimation-based Active SLAM for exploration in 3D environments
Through this work, we propose FIT-SLAM (Fisher Information and Traversability estimation-based Active SLAM), a new exploration method tailored for unmanned ground vehicles (UGVs) to explore 3D environments.
Go-Explore for Residential Energy Management
We use the Go-Explore algorithm to solve the cost-saving task in residential energy management problems and achieve an improvement of up to 19.84% compared to the well-known reinforcement learning algorithms.
Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study
In this study, we employ a teacher-student learning framework to tackle these problems, specifically by offering feedback for LLMs using RL models and providing high-level information for RL models with LLMs in a cooperative multi-agent setting.
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration
In this paper, a joint O-RAN/MEC orchestration using a Bayesian deep reinforcement learning (RL)-based framework is proposed that jointly controls the O-RAN functional splits, the allocated resources and hosting locations of the O-RAN/MEC services across geo-distributed platforms, and the routing for each O-RAN/MEC data flow.
Joint channel estimation and data detection in massive MIMO systems based on diffusion models
We propose a joint channel estimation and data detection algorithm for massive multiple-input multiple-output systems based on diffusion models.