Do Long-Range Language Models Actually Use Long-Range Context?
no code implementations • EMNLP 2021 • Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer
Language models are generally trained on short, truncated input sequences, which limits their ability to use discourse-level information present in long-range context to improve their predictions.
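The truncation this abstract refers to is the standard practice of splitting documents into short, fixed-length training segments. Below is a minimal sketch of that preprocessing step, not the paper's code; the function name `chunk_into_segments` and the 512-token limit are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): standard LM training slices
# long documents into short, fixed-length segments, so the model never
# conditions on discourse-level context beyond `max_len` tokens.

def chunk_into_segments(token_ids, max_len=512):
    """Split a document into independent fixed-length segments.

    Each segment is modeled in isolation: tokens in one segment
    cannot attend to context from any earlier segment.
    """
    return [token_ids[i:i + max_len]
            for i in range(0, len(token_ids), max_len)]

# A 2,000-token document becomes four independent segments; any
# long-range context spanning segment boundaries is discarded.
segments = chunk_into_segments(list(range(2000)), max_len=512)
print([len(s) for s in segments])  # [512, 512, 512, 464]
```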
Exploring and Predicting Transferability across NLP Tasks
1 code implementation • EMNLP 2020 • Tu Vu, Tong Wang, Tsendsuren Munkhdalai, Alessandro Sordoni, Adam Trischler, Andrew Mattarella-Micke, Subhransu Maji, Mohit Iyyer
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task, and we validate their effectiveness in experiments controlled for source and target data size.
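To make the prediction step concrete: given a vector embedding per task, source tasks can be ranked by the similarity of their embeddings to the target task's embedding. The sketch below assumes precomputed embedding vectors; the function name `rank_source_tasks`, the cosine-similarity choice, and the example task embeddings are illustrative assumptions, not the paper's released implementation.

```python
import numpy as np

def rank_source_tasks(target_emb, source_embs):
    """Rank candidate source tasks by cosine similarity between their
    task embeddings and the target task's embedding (higher similarity
    is taken as a proxy for better transferability)."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cosine(target_emb, emb)
              for name, emb in source_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical embeddings for illustration only.
rng = np.random.default_rng(0)
sources = {"MNLI": rng.normal(size=128), "SQuAD": rng.normal(size=128)}
target = rng.normal(size=128)
print(rank_source_tasks(target, sources))  # best-matching source first
```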