Search Results for author: Sumit Sanghai

Found 13 papers, 4 papers with code

Functional Interpolation for Relative Positions Improves Long Context Transformers

no code implementations • 6 Oct 2023 • Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models.

Language Modelling Position


MEMORY-VQ: Compression for Tractable Internet-Scale Memory

no code implementations • 28 Aug 2023 • Yury Zemlyanskiy, Michiel de Jong, Luke Vilnis, Santiago Ontañón, William W. Cohen, Sumit Sanghai, Joshua Ainslie

Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world.

Quantization Retrieval


GLIMMER: generalized late-interaction memory reranker

no code implementations • 17 Jun 2023 • Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie

Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text.

Retrieval


GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

3 code implementations • 22 May 2023 • Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai

Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference.

Language Modelling


CoLT5: Faster Long-Range Transformers with Conditional Computation

no code implementations • 17 Mar 2023 • Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token.

Ranked #1 on Long-range modeling on SCROLLS

Long-range modeling


Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

no code implementations • 25 Jan 2023 • Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Joshua Ainslie, Sumit Sanghai, Fei Sha, William Cohen

Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks.

Question Answering Retrieval


ImPaKT: A Dataset for Open-Schema Knowledge Base Construction

no code implementations • 21 Dec 2022 • Luke Vilnis, Zach Fisher, Bhargav Kanagal, Patrick Murray, Sumit Sanghai

Large language models have ushered in a golden age of semantic parsing.

Attribute Language Modelling +3


FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference

no code implementations • 15 Dec 2022 • Michiel de Jong, Yury Zemlyanskiy, Joshua Ainslie, Nicholas FitzGerald, Sumit Sanghai, Fei Sha, William Cohen

Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state of the art on many knowledge-intensive NLP tasks.

Ranked #3 on Question Answering on WebQuestions

Language Modelling Retrieval


Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

1 code implementation • 18 Oct 2022 • Luke Vilnis, Yury Zemlyanskiy, Patrick Murray, Alexandre Passos, Sumit Sanghai

Decoding methods for large language models often trade off diversity of outputs against parallelism of computation.

Language Modelling Large Language Model +1


Generate-and-Retrieve: use your predictions to improve retrieval for semantic parsing

no code implementations • COLING 2022 • Yury Zemlyanskiy, Michiel de Jong, Joshua Ainslie, Panupong Pasupat, Peter Shaw, Linlu Qiu, Sumit Sanghai, Fei Sha

Then, it retrieves exemplars with outputs similar to the preliminary prediction which are used to generate a final prediction.

Retrieval Semantic Parsing


MAVE: A Product Dataset for Multi-source Attribute Value Extraction

1 code implementation • 16 Dec 2021 • Li Yang, Qifan Wang, Zac Yu, Anand Kulkarni, Sumit Sanghai, Bin Shu, Jon Elsas, Bhargav Kanagal

Attribute value extraction refers to the task of identifying values of an attribute of interest from product information.

Attribute Attribute Extraction +2


ShopTalk: A System for Conversational Faceted Search

no code implementations • 2 Sep 2021 • Gurmeet Manku, James Lee-Thorp, Bhargav Kanagal, Joshua Ainslie, Jingchen Feng, Zach Pearson, Ebenezer Anjorin, Sudeep Gandhe, Ilya Eckstein, Jim Rosswog, Sumit Sanghai, Michael Pohl, Larry Adams, D. Sivakumar

The dialog understanding system consists of a deep-learned Contextual Language Understanding module, which interprets user utterances, and a primarily rules-based Dialog-State Tracker (DST), which updates the dialog state and formulates search requests intended for the fulfillment engine.

Management slot-filling +1


ETC: Encoding Long and Structured Inputs in Transformers

2 code implementations • EMNLP 2020 • Joshua Ainslie, Santiago Ontanon, Chris Alberti, Vaclav Cvicek, Zachary Fisher, Philip Pham, Anirudh Ravula, Sumit Sanghai, Qifan Wang, Li Yang

Transformer models have advanced the state of the art in many Natural Language Processing (NLP) tasks.

Ranked #3 on Question Answering on ConditionalQA

Position Question Answering

