Asian Chapter of the Association for Computational Linguistics 2020 • Siddhant Garg, Rohit Kumar Sharma, Yingyu Liang
In this paper, we show that concatenating the embeddings from the pre-trained model with those from a simple sentence embedding model trained only on the target data can improve over the performance of fine-tuning (FT) on few-sample tasks.
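
The concatenation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `concat_embeddings`, the toy vectors, and their dimensions are assumptions made for illustration, and the sketch does not cover how either embedding model is trained or what classifier consumes the result.

```python
from typing import List, Sequence


def concat_embeddings(
    pretrained: Sequence[Sequence[float]],
    target_only: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Join, per sentence, the two embedding vectors into one feature vector.

    Hypothetical helper: `pretrained` stands in for embeddings from a
    pre-trained model, `target_only` for embeddings from a simple model
    trained only on the target data.
    """
    if len(pretrained) != len(target_only):
        raise ValueError("both models must embed the same number of sentences")
    return [list(p) + list(t) for p, t in zip(pretrained, target_only)]


# Toy, made-up vectors for two sentences (not data from the paper).
pretrained_vecs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
target_vecs = [[1.0, 0.0], [0.0, 1.0]]

features = concat_embeddings(pretrained_vecs, target_vecs)
print(len(features), len(features[0]))  # 2 5
```

Each resulting row has as many entries as the two source vectors combined (3 + 2 = 5 here), so a downstream classifier would see both representations side by side.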