Learning without Forgetting: Task Aware Multitask Learning for Multi-Modality Tasks
Existing joint learning strategies such as multi-task learning or meta-learning focus mostly on shared learning and leave little to no room for task-specific learning. This creates the need for a distinct shared pretraining phase followed by a task-specific fine-tuning phase. The fine-tuning phase produces a separate system for each task, where improving the performance on a particular task requires forgetting some of the knowledge gained on the other tasks. Humans, on the other hand, perform task-specific learning in synergy with general domain-based learning. Inspired by these learning patterns in humans, we propose a simple yet generic task-aware framework that can be incorporated into existing joint learning strategies. The proposed framework computes task-specific representations that modulate the model parameters during joint learning. It therefore performs both shared and task-specific learning in a single phase, yielding a single model for all the tasks. This single model achieves significant performance gains over the existing joint learning strategies. For example, we train one model on Speech Translation (ST), Automatic Speech Recognition (ASR), and Machine Translation (MT) using the proposed task-aware joint learning approach. This single model achieves a BLEU score of 28.64 on the MuST-C English-German ST task, a WER of 11.61 on the TEDLium v3 ASR task, and a BLEU score of 23.35 on the WMT14 English-German MT task, setting a new state-of-the-art (SOTA) on ST while remaining close to SOTA on ASR and competitive on MT.
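To make the core idea concrete, below is a minimal sketch of one way task-specific representations can modulate shared model parameters, using FiLM-style feature-wise scaling and shifting conditioned on a learned task embedding. The class name, the projection layers, and the choice of FiLM-style modulation are illustrative assumptions, not the paper's actual mechanism.

```python
import torch
import torch.nn as nn


class TaskAwareModulation(nn.Module):
    """Illustrative sketch: a learned task embedding produces per-feature
    scale and shift vectors (FiLM-style) that modulate a shared layer's
    activations. The paper's exact modulation scheme may differ."""

    def __init__(self, num_tasks: int, hidden_dim: int):
        super().__init__()
        self.task_embedding = nn.Embedding(num_tasks, hidden_dim)
        # Hypothetical projections from the task representation to
        # modulation parameters (scale and shift).
        self.to_scale = nn.Linear(hidden_dim, hidden_dim)
        self.to_shift = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, shared_features: torch.Tensor, task_id: torch.Tensor):
        # shared_features: (batch, seq_len, hidden_dim) from a shared encoder
        # task_id: (batch,) integer task ids, e.g. 0=ST, 1=ASR, 2=MT
        task_repr = self.task_embedding(task_id)          # (batch, hidden_dim)
        gamma = self.to_scale(task_repr).unsqueeze(1)     # (batch, 1, hidden_dim)
        beta = self.to_shift(task_repr).unsqueeze(1)      # (batch, 1, hidden_dim)
        # Task-aware modulation of the shared representation.
        return gamma * shared_features + beta


# Usage: one shared model serving three tasks in a single training phase,
# with mixed-task batches instead of per-task fine-tuned copies.
modulator = TaskAwareModulation(num_tasks=3, hidden_dim=512)
features = torch.randn(4, 100, 512)        # shared encoder output
task_ids = torch.tensor([0, 1, 2, 0])      # mixed-task batch
modulated = modulator(features, task_ids)  # (4, 100, 512)
```

Because the task conditioning is a small additive component on top of fully shared parameters, all tasks continue to train the same backbone in one phase, which is what avoids the separate fine-tuning stage and the associated forgetting described above.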