10 Feb 2022 • Juliusz Krysztof Ziomek, Jun Wang, Yaodong Yang
We study a novel setting in offline reinforcement learning (RL) in which a number of distributed machines cooperate to solve the problem, but only a single round of communication is allowed and each machine is subject to a budget constraint on the total amount of information (in bits) that it can send out.
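
To make the communication constraint concrete, here is a minimal sketch of a single-round, bit-budgeted protocol. It is not the method studied in the paper; it is a toy illustration under assumptions introduced here: each machine holds a list of offline rewards known to lie in [0, 1], sends one uniformly quantized mean using `bits` bits, and a server averages the decoded messages. The names `quantize`, `dequantize`, and `one_round_estimate` are hypothetical.

```python
def quantize(x, bits, lo=0.0, hi=1.0):
    """Encode x (assumed to lie in [lo, hi]) as an integer code that fits in `bits` bits."""
    levels = 2 ** bits - 1
    x = float(min(max(x, lo), hi))  # clip to the assumed range
    return int(round((x - lo) / (hi - lo) * levels))


def dequantize(code, bits, lo=0.0, hi=1.0):
    """Decode an integer code back to a value in [lo, hi]."""
    levels = 2 ** bits - 1
    return lo + code / levels * (hi - lo)


def one_round_estimate(local_datasets, bits):
    """Single round: every machine sends one `bits`-bit message, the server averages them."""
    # Each machine summarizes its own offline data (here: the mean reward).
    codes = [quantize(sum(d) / len(d), bits) for d in local_datasets]
    # The server only ever sees the quantized codes, never the raw data.
    decoded = [dequantize(c, bits) for c in codes]
    return sum(decoded) / len(decoded)


# Two machines, 8 bits each (hypothetical data).
estimate = one_round_estimate([[0.0, 0.5], [0.5, 1.0]], bits=8)
```

With uniform quantization, each decoded message is off by at most half a quantization step, `0.5 / (2**bits - 1)`, so the averaged estimate carries the same worst-case error; a smaller bit budget trades accuracy for communication.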