Search Results for author: Fan-Ming Luo

Found 5 papers, 1 paper with code

Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning

no code implementations • 9 Oct 2023 • Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu

MOREC learns a generalizable dynamics reward function from offline data, which is then employed as a transition filter in any offline MBRL method: when generating a transition, the dynamics model proposes a batch of candidates and the one with the highest dynamics reward value is selected (see the sketch below this entry).

D4RL • Model-based Reinforcement Learning • +1
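To make the filtering step concrete, here is a minimal sketch of the batch-generate-then-filter loop described above. The names `dynamics_model` and `dynamics_reward`, and the batch size, are illustrative assumptions, not MOREC's actual API.

```python
import numpy as np

# Minimal sketch of the batch-generate-then-filter step described above.
# `dynamics_model` and `dynamics_reward` are illustrative placeholders,
# not MOREC's actual API.

def filtered_transition(state, action, dynamics_model, dynamics_reward,
                        n_candidates=16):
    """Sample candidate next states and keep the one the learned
    dynamics reward scores highest."""
    # Draw a batch of candidate transitions from the stochastic model.
    candidates = [dynamics_model.sample(state, action)
                  for _ in range(n_candidates)]
    # Score each candidate (s, a, s') with the dynamics reward.
    scores = np.array([dynamics_reward(state, action, s_next)
                       for s_next in candidates])
    # Select the transition with the highest dynamics reward value.
    return candidates[int(np.argmax(scores))]
```

Because the filter only re-ranks sampled candidates, it can wrap any offline MBRL rollout loop without modifying the dynamics model itself, which matches the snippet's claim of plug-in use.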

Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble

no code implementations • 1 Jun 2022 • Fan-Ming Luo, Xingchen Cao, Yang Yu

Empirical comparisons with state-of-the-art AIL methods show that DARL learns a reward more consistent with the true reward, and thus obtains higher environment returns (a hedged sketch of the ensemble reward follows below).

Imitation Learning
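The entry above names the mechanism, a dynamics-agnostic discriminator ensemble, without spelling it out. The sketch below shows one plausible way to average an AIL-style reward over an ensemble of discriminators; the ensemble interface and the GAIL-style reward transform are assumptions, not DARL's exact formulation.

```python
import torch
import torch.nn.functional as F

# Hedged sketch: average a GAIL-style reward over an ensemble of
# discriminators. Each discriminator maps a (state, action) pair to a
# logit; this interface and transform are assumptions, not DARL's design.

def ensemble_reward(discriminators, state, action):
    """Reward = ensemble mean of -log(1 - sigmoid(logit)),
    which simplifies to softplus(logit)."""
    features = torch.cat([state, action], dim=-1)
    rewards = [F.softplus(d(features)) for d in discriminators]
    return torch.stack(rewards).mean(dim=0)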

Offline Model-based Adaptable Policy Learning

1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye

Current offline reinforcement learning methods commonly learn in a policy space constrained to the in-support regions of the offline dataset, in order to ensure the robustness of the resulting policies (a generic illustration of this constraint follows below).

Decision Making • Reinforcement Learning • +1
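As an illustration of the in-support constraint the snippet describes (not of this paper's adaptable-policy method), the sketch below uses a TD3+BC-style actor loss (Fujimoto & Gu, 2021), where a behavior-cloning penalty keeps the policy close to the dataset's actions.

```python
import torch

# Generic illustration of an in-support constraint in the style of
# TD3+BC (Fujimoto & Gu, 2021); this is NOT the method of the paper above.

def constrained_actor_loss(policy, critic, states, dataset_actions, alpha=2.5):
    """Maximize Q while penalizing deviation from dataset actions."""
    actions = policy(states)
    q_values = critic(states, actions)
    # Normalize the Q term so the BC penalty keeps a comparable scale.
    lam = alpha / q_values.abs().mean().detach()
    bc_penalty = ((actions - dataset_actions) ** 2).mean()
    # Gradient ascent on Q, descent on the behavior-cloning penalty.
    return -lam * q_values.mean() + bc_penalty
```

The penalty term restricts the learned policy to actions the dataset supports, which is exactly the robustness-motivated constraint the snippet attributes to current methods.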
