Search Results for author: USVSN Sai Prashanth

Found 3 papers, 3 papers with code

Emergent and Predictable Memorization in Large Language Models

2 code implementations • NeurIPS 2023 • Stella Biderman, USVSN Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin Anthony, Shivanshu Purohit, Edward Raff

Memorization, or the tendency of large language models (LLMs) to output entire sequences from their training data verbatim, is a key concern for safely deploying language models.

Memorization

6,617

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

4 code implementations • 3 Apr 2023 • Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal

How do large language models (LLMs) develop and evolve over the course of training?

Ranked #4 on Language Modelling on LAMBADA (Perplexity metric)

Common Sense Reasoning • Coreference Resolution • +3

6,896

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

5 code implementations • BigScience (ACL) 2022 • Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach

We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license.

Ranked #86 on Multi-task Language Understanding on MMLU

Language Modelling • Multi-task Language Understanding

48,805
