Trajectory Planning with Deep Reinforcement Learning in High-Level Action Spaces

This paper presents a technique for trajectory planning based on continuously parameterized high-level actions (motion primitives) of variable duration. The technique leverages deep reinforcement learning (Deep RL) to formulate a policy suitable for real-time implementation. There is no separation between motion primitive generation and trajectory planning: each individual short-horizon motion is formed during Deep RL training to achieve the full-horizon objective. The effectiveness of the technique is demonstrated numerically on a well-studied trajectory generation problem and on a planning problem over a known obstacle-rich map. The paper also develops a new loss function term for policy-gradient-based Deep RL, analogous to an anti-windup mechanism in feedback control. We demonstrate that including this new term in the underlying optimization increases the average policy return in our numerical example.
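
The abstract describes the anti-windup loss term only at a high level. The sketch below is one plausible reading, not the authors' exact formulation: it assumes the term penalizes the raw (pre-saturation) policy mean for drifting past the action bounds, mirroring how anti-windup in feedback control keeps an integrator from accumulating beyond actuator saturation. The function name, the squared-penalty form, and the 0.01 weight are all illustrative assumptions.

```python
import torch


def anti_windup_penalty(policy_mean: torch.Tensor,
                        action_low: torch.Tensor,
                        action_high: torch.Tensor) -> torch.Tensor:
    """Penalize the unclipped policy mean for exceeding the actuator limits.

    Hypothetical form of the paper's anti-windup term: without it, the
    policy gradient can keep pushing a saturated mean further out of
    bounds, where it has no effect on the executed action.
    """
    over = torch.relu(policy_mean - action_high)   # amount above the upper bound
    under = torch.relu(action_low - policy_mean)   # amount below the lower bound
    return (over.pow(2) + under.pow(2)).mean()


if __name__ == "__main__":
    # Toy example: a 2-D action space bounded to [-1, 1] per dimension.
    mean = torch.tensor([[0.5, 1.8], [-2.3, 0.1]], requires_grad=True)
    low = torch.tensor([-1.0, -1.0])
    high = torch.tensor([1.0, 1.0])

    pg_loss = torch.tensor(0.0)  # stand-in for the usual policy-gradient loss
    loss = pg_loss + 0.01 * anti_windup_penalty(mean, low, high)
    loss.backward()
    # Gradients are nonzero only for the saturated means, pushing them
    # back toward the action bounds.
    print(mean.grad)
```

In this reading, the penalty leaves in-bounds actions untouched and only activates at saturation, which is why it can raise average return without otherwise biasing the policy; the actual weighting and form would come from the paper itself.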
