14 Mar 2018 • Jack Harmer, Linus Gisslén, Jorge del Val, Henrik Holst, Joakim Bergdahl, Tom Olsson, Kristoffer Sjöö, Magnus Nordin
This initial training technique kick-starts temporal difference (TD) learning, and the agent quickly learns to surpass the capabilities of the expert.