1 code implementation • 28 Feb 2024 • Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, Sanjeev Arora
Public LLMs such as Llama 2-Chat have driven a surge of activity in LLM research.
1 code implementation • 22 Oct 2023 • Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang
In distributed deep learning with data parallelism, synchronizing gradients at every training step can incur substantial communication overhead, especially when many nodes collaborate to train large models.
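The per-step synchronization described above can be illustrated with a minimal, purely pedagogical simulation (the quadratic objective, worker count, and noise model are all hypothetical, not from the paper): each worker computes a gradient on its own mini-batch, and an all-reduce-style average happens at every single step.

```python
import random

def data_parallel_sgd(num_workers=4, steps=50, lr=0.1):
    """Toy data-parallel SGD on f(x) = x^2 (hypothetical objective).

    Each worker computes a local gradient, then the gradients are
    averaged at *every* step -- this per-step synchronization is the
    communication cost highlighted in the abstract.
    """
    random.seed(0)
    x = random.uniform(-1.0, 1.0)  # model parameter, replicated on all workers
    for _ in range(steps):
        # Each worker's gradient of x^2 on its own (noisy) mini-batch.
        grads = [2 * x + random.gauss(0, 0.01) for _ in range(num_workers)]
        # Communication round: average gradients across workers, every step.
        g = sum(grads) / num_workers
        x -= lr * g
    return x
```

With `steps` synchronization rounds for `steps` updates, the communication volume grows linearly with the number of steps, which motivates the communication-efficient schemes in the papers below.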
1 code implementation • 2 Mar 2023 • Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
Local SGD is a communication-efficient variant of SGD for large-scale training, where multiple GPUs perform SGD independently and average the model parameters periodically.
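A minimal sketch of the Local SGD scheme just described, again on a hypothetical quadratic objective with made-up hyperparameters: workers run SGD independently and only communicate every `sync_every` steps, when their parameters are averaged.

```python
import random

def local_sgd(num_workers=4, steps=100, sync_every=10, lr=0.1):
    """Toy Local SGD on f(x) = x^2 (hypothetical objective for illustration).

    Each worker runs SGD on its own parameter copy; every `sync_every`
    steps a single communication round averages the parameters, instead
    of synchronizing gradients at every step.
    """
    random.seed(0)
    params = [random.uniform(-1.0, 1.0) for _ in range(num_workers)]
    for step in range(1, steps + 1):
        # Independent local SGD steps with noisy gradients of x^2.
        params = [x - lr * (2 * x + random.gauss(0, 0.01)) for x in params]
        if step % sync_every == 0:
            # Communication round: average parameters across all workers.
            avg = sum(params) / num_workers
            params = [avg] * num_workers
    return sum(params) / num_workers
```

Here 100 updates cost only 10 communication rounds, versus 100 for per-step gradient synchronization, at the price of the worker copies drifting apart between averaging rounds.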
1 code implementation • NeurIPS 2021 • Xinran Gu, Kaixuan Huang, Jingzhao Zhang, Longbo Huang
In such settings, the convergence of popular FL algorithms such as FedAvg is severely affected by straggling devices.