no code implementations • 17 Oct 2023 • Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano
Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but it still faces acute storage challenges when scaling to even larger models or when deploying many per-user or per-task adapted models.
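To make the parameter savings concrete, here is a minimal sketch of LoRA's core idea: the frozen pretrained weight `W` is augmented with a trainable low-rank product `B @ A`, so only `r * (d + k)` parameters are trained instead of `d * k`. All dimensions (`d`, `k`, `r`) below are illustrative, not taken from the paper.

```python
import numpy as np

d, k, r = 1024, 1024, 8  # hypothetical layer dimensions and LoRA rank

W = np.random.randn(d, k)          # frozen pretrained weight (not trained)
A = np.random.randn(r, k) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))               # trainable factor, zero-init so the update starts at 0

def adapted_forward(x):
    # Effective weight is W + B @ A; only A and B receive gradients.
    return x @ (W + B @ A).T

full_params = W.size               # 1024 * 1024 = 1,048,576
lora_params = A.size + B.size      # 8 * (1024 + 1024) = 16,384 (~64x fewer)
```

Per-task storage then shrinks from the full weight matrix to just the small `A` and `B` factors, which is exactly where the remaining storage cost lies when many adapted models must be kept.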