no code implementations • 24 Aug 2023 • Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen
This problem is formulated as mutual training of skills using an intrinsic reward and a discriminator trained to predict a skill given its trajectory.