Linear Warmup With Cosine Annealing is a learning rate schedule that increases the learning rate linearly for the first $n$ updates and then anneals it according to a cosine schedule for the remainder of training.
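A minimal sketch of the schedule, assuming a warmup from zero to a base learning rate over `warmup_steps` updates, then a cosine decay down to `min_lr` over the remaining updates (function and parameter names here are illustrative, not from a specific library):

```python
import math

def lr_at_step(step, warmup_steps, total_steps, base_lr, min_lr=0.0):
    """Linear warmup for `warmup_steps` updates, then cosine annealing."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * (step + 1) / warmup_steps
    # Cosine anneal from base_lr down to min_lr over the remaining updates.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, with `warmup_steps=10` and `total_steps=100`, the learning rate reaches `base_lr` at step 9, is halfway between `base_lr` and `min_lr` at the midpoint of the annealing phase, and reaches `min_lr` at the final step.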
Task | Papers | Share |
---|---|---|
Language Modelling | 84 | 10.92% |
Large Language Model | 50 | 6.50% |
Question Answering | 48 | 6.24% |
Retrieval | 27 | 3.51% |
Text Generation | 26 | 3.38% |
In-Context Learning | 25 | 3.25% |
Sentence | 24 | 3.12% |
Prompt Engineering | 22 | 2.86% |
Code Generation | 18 | 2.34% |