Search Results for author: Howard Yen

Found 3 papers, 3 papers with code

Long-Context Language Modeling with Parallel Context Encoding

1 code implementation · 26 Feb 2024 · Howard Yen, Tianyu Gao, Danqi Chen

We further introduce a CEPE variant that can extend the context window of instruction-tuned models with only unlabeled data, and showcase its effectiveness on LLAMA-2-CHAT, leading to a strong instruction-following model that can leverage very long context on downstream tasks.

