5 Feb 2020 • Nuo Wang Pierse, Jingwen Lu
We found that, with objective alignment, our 768-by-3 and 512-by-3 transformer language models can reach accuracies of 83.9%/82.5% on concept-of-interest tagging and 73.8%/70.2% on acronym detection using only 200 finetuning examples per task, outperforming the 768-by-3 model pretrained without objective alignment by +4.8%/+3.4% and +9.9%/+6.3%.