Search Results for author: Honglu Fan

Found 4 papers, 1 paper with code

Grokking Group Multiplication with Cosets

no code implementations • 11 Dec 2023 • Dashiell Stander, Qinan Yu, Honglu Fan, Stella Biderman

We use the group Fourier transform over the symmetric group $S_n$ to reverse engineer a 1-layer feedforward network that has "grokked" the multiplication of $S_5$ and $S_6$.

YaRN: Efficient Context Window Extension of Large Language Models

5 code implementations • 31 Aug 2023 • Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole

Rotary Position Embeddings (RoPE) have been shown to effectively encode positional information in transformer-based language models.

Position

An ML approach to resolution of singularities

no code implementations • 1 Jul 2023 • Gergely Bérczi, Honglu Fan, Mingcong Zeng

The solution set of a system of polynomial equations typically contains ill-behaved, singular points.

Stay on topic with Classifier-Free Guidance

no code implementations • 30 Jun 2023 • Guillaume Sanchez, Honglu Fan, Alexander Spangher, Elad Levi, Pawan Sasanka Ammanamanchi, Stella Biderman

Classifier-Free Guidance (CFG) has recently emerged in text-to-image generation as a lightweight technique to encourage prompt-adherence in generations.

Code Generation, Common Sense Reasoning, +7
