18 Oct 2023 • Ruixuan Miao, Xu Lu, Cong Tian, Bin Yu, Zhenhua Duan
Unlike the tasks assumed by the standard Reinforcement Learning (RL) model, many real-world tasks are non-Markovian: their rewards depend on the history of states rather than solely on the current state.