Learning to Solve Multi-Robot Task Allocation with a Covariant-Attention based Neural Architecture

1 Jan 2021 · Steve Paul, Payam Ghassemi, Souma Chowdhury

This paper presents a novel graph (reinforcement) learning method to solve an important class of multi-robot task allocation (MRTA) problems that involve tasks with deadlines, and robots with ferry range and payload constraints (thus requiring multiple tours per robot). While drawing motivation from recent graph learning methods that learn to solve combinatorial optimization problems of the mTSP/VRP type, this paper seeks to provide better convergence and tractable scalability (with an increasing number of tasks) in learning to solve MRTA. The proposed neural architecture, called the Covariant Attention-based Model (CAM), includes two main components: 1) an encoder: a covariant node-based embedding model that represents each task as a learnable feature vector, and 2) a decoder: an attention-based model that produces the output sequentially. To learn the feature vectors, a policy-gradient method based on REINFORCE is used. The new learning architecture is applied to a flood response problem (representative of the general class of Multi-task Robots, and Single-robot Tasks or MR-ST problems), where multiple unmanned aerial vehicles (UAVs) supply survival kits to victims spread out over an area. For comparative analysis, the well-known attention-based approach (designed to solve mTSP/VRP problems) is extended and applied to the stated class of MRTA problems. The comparison is performed in terms of a cost function that considers both task completion rate and cumulative traveled distance. The results show that our primary method is not only superior in terms of the cost function (over various training and unseen test scenarios), but also provides significantly faster convergence and yields learnt policies that can be executed within 6 milliseconds per robot, thereby allowing real-time application.
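To make the decoder and training signal described above more concrete, the following is a minimal sketch, not the CAM architecture itself. It assumes the task embeddings are already given (standing in for the encoder output), scores tasks with a plain scaled dot product against a context vector, masks tasks that were already selected, and forms a REINFORCE surrogate loss with a batch-mean baseline. The names `masked_softmax`, `decode_sequence`, and `reinforce_surrogate`, the choice of context vector, and the baseline are all illustrative assumptions, not details taken from the paper.

```python
import math
import random


def masked_softmax(scores, visited):
    """Softmax over tasks not yet visited; visited tasks get probability 0."""
    top = max(s for s, v in zip(scores, visited) if not v)
    exps = [0.0 if v else math.exp(s - top) for s, v in zip(scores, visited)]
    total = sum(exps)
    return [e / total for e in exps]


def decode_sequence(embeddings, rng):
    """Sample a task order step by step and return it with its log-probability."""
    n, dim = len(embeddings), len(embeddings[0])
    visited = [False] * n
    # Assumed initial context: the mean of all task embeddings.
    context = [sum(e[k] for e in embeddings) / n for k in range(dim)]
    order, log_prob = [], 0.0
    for _ in range(n):
        # Scaled dot-product compatibility between each task and the context.
        scores = [
            sum(a * b for a, b in zip(e, context)) / math.sqrt(dim)
            for e in embeddings
        ]
        probs = masked_softmax(scores, visited)
        choice = rng.choices(range(n), weights=probs)[0]
        log_prob += math.log(probs[choice])
        order.append(choice)
        visited[choice] = True
        # Assumed context update: embedding of the task just selected.
        context = embeddings[choice]
    return order, log_prob


def reinforce_surrogate(log_probs, costs):
    """Mean of (cost - baseline) * log-prob, with the batch mean as baseline."""
    baseline = sum(costs) / len(costs)
    return sum((c - baseline) * lp for c, lp in zip(costs, log_probs)) / len(costs)
```

In a real training loop, the gradient of this surrogate with respect to the policy parameters would be computed by an automatic differentiation framework; here the embeddings are fixed, so the sketch only illustrates how the sampled sequence, its log-probability, and the cost signal fit together.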
