Revisiting Few-sample BERT Fine-tuning

10 Jun 2020 · Tianyi Zhang, Felix Wu, Arzoo Katiyar, Kilian Q. Weinberger, Yoav Artzi

We study the problem of few-sample fine-tuning of BERT contextual representations, and identify three sub-optimal choices in current, broadly adopted practices. First, we observe that the omission of the gradient bias correction in the BERTAdam optimizer results in fine-tuning instability...
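To make the bias-correction point concrete, here is a minimal sketch of a single scalar Adam update with the correction switched on or off. It is an illustration only, not the paper's code or the BERTAdam implementation: the function name `adam_step` and the hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are assumptions chosen to match commonly used Adam settings.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, bias_correction=True):
    """One Adam update on a scalar parameter (hypothetical helper).

    t is the 1-based step count. With bias_correction=False the raw moment
    estimates are used directly, which is the omission the abstract refers to.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    if bias_correction:
        # Standard Adam: undo the bias toward zero caused by zero initialization.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
    else:
        # Uncorrected variant: use the biased estimates as they are.
        m_hat, v_hat = m, v

    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Compare the size of the very first update for a unit gradient.
corrected, _, _ = adam_step(0.0, 1.0, 0.0, 0.0, t=1, bias_correction=True)
uncorrected, _, _ = adam_step(0.0, 1.0, 0.0, 0.0, t=1, bias_correction=False)
print(abs(corrected), abs(uncorrected))
```

Under these assumed defaults, the first uncorrected step is roughly 3.16 times (about the square root of 10) larger than the corrected one, because the second-moment estimate starts much closer to zero than the first-moment estimate. With only a few training samples, and therefore few update steps, this early inflation covers a larger share of training.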

Methods used in the Paper