1 code implementation • 30 Apr 2023 • Baiting Zhu, Meihua Dang, Aditya Grover
In this work, we propose a new data-driven setup for offline multi-objective reinforcement learning (MORL), in which the goal is to learn a preference-agnostic policy agent using only a finite dataset of offline demonstrations from other agents, together with those agents' preferences.