no code implementations • 21 Dec 2023 • Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu, Yu Qiao
Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the action distribution based on target returns for each state in a supervised manner.