Linear Warmup With Cosine Annealing is a learning rate schedule where we increase the learning rate linearly for $n$ updates and then anneal according to a cosine schedule afterwards.
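A minimal sketch of this schedule as a standalone function (the names `warmup_steps`, `total_steps`, and `min_lr` are illustrative, not from the original description):

```python
import math

def lr_at_step(step, warmup_steps, total_steps, base_lr, min_lr=0.0):
    """Learning rate at update `step`: linear warmup to base_lr over
    warmup_steps updates, then cosine annealing down to min_lr."""
    if step < warmup_steps:
        # Linear warmup: scale base_lr by the fraction of warmup completed.
        return base_lr * (step + 1) / warmup_steps
    # Cosine annealing over the remaining updates (progress goes 0 -> 1).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, with `warmup_steps=10` and `total_steps=100`, the rate rises linearly to `base_lr` over the first 10 updates, equals `base_lr` at the start of annealing (cosine of 0), and decays smoothly to `min_lr` by the final update (cosine of pi).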
Task | Papers | Share |
---|---|---|
Language Modelling | 79 | 10.66% |
Large Language Model | 50 | 6.75% |
Question Answering | 40 | 5.40% |
Retrieval | 29 | 3.91% |
In-Context Learning | 24 | 3.24% |
Text Generation | 22 | 2.97% |
Code Generation | 21 | 2.83% |
Prompt Engineering | 21 | 2.83% |
Sentence | 20 | 2.70% |