Multi-modal Intent Classification for Assistive Robots with Large-scale Naturalistic Datasets

ALTA 2021 · Karun Varghese Mathew, Venkata S Aditya Tarigoppula, Lea Frermann

Recent years have brought tremendous growth in assistive robots and prosthetics for people with partial or complete loss of upper limb control. These technologies aim to help users with various reaching and grasping tasks in their daily lives, such as picking up an object and transporting it to a desired location, and their utility critically depends on the ease and effectiveness of communication between the user and the robot. One natural way of communicating with assistive technologies is through verbal instructions. The meaning of a natural language command depends on the current configuration of the surrounding environment and needs to be interpreted in this multi-modal context, as accurate interpretation of the command is essential for successful execution of the user's intent by an assistive device. The research presented in this paper demonstrates how large-scale situated natural language datasets can support the development of robust assistive technologies. We leveraged a navigational dataset comprising >25k human-provided natural language commands covering diverse situations. We demonstrated how to extend the dataset in a task-informed way and use it to develop multi-modal intent classifiers for pick-and-place tasks. Our best classifier reached >98% accuracy in a 16-way multi-modal intent classification task, suggesting high robustness and flexibility.
