Search Results for author: Sumit Sanghai

Found 13 papers, 4 papers with code

Functional Interpolation for Relative Positions Improves Long Context Transformers

no code implementations | 6 Oct 2023 | Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models.

Language Modelling, Position

MEMORY-VQ: Compression for Tractable Internet-Scale Memory

no code implementations | 28 Aug 2023 | Yury Zemlyanskiy, Michiel de Jong, Luke Vilnis, Santiago Ontañón, William W. Cohen, Sumit Sanghai, Joshua Ainslie

Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world.

Quantization, Retrieval

GLIMMER: generalized late-interaction memory reranker

no code implementations | 17 Jun 2023 | Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie

Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text.

Retrieval

CoLT5: Faster Long-Range Transformers with Conditional Computation

no code implementations | 17 Mar 2023 | Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token.

Long-range modeling

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

1 code implementation | 18 Oct 2022 | Luke Vilnis, Yury Zemlyanskiy, Patrick Murray, Alexandre Passos, Sumit Sanghai

Decoding methods for large language models often trade off between diversity of outputs and parallelism of computation.

Language Modelling, Large Language Model, +1

MAVE: A Product Dataset for Multi-source Attribute Value Extraction

1 code implementation | 16 Dec 2021 | Li Yang, Qifan Wang, Zac Yu, Anand Kulkarni, Sumit Sanghai, Bin Shu, Jon Elsas, Bhargav Kanagal

Attribute value extraction refers to the task of identifying values of an attribute of interest from product information.

Attribute, Attribute Extraction, +2

ShopTalk: A System for Conversational Faceted Search

no code implementations | 2 Sep 2021 | Gurmeet Manku, James Lee-Thorp, Bhargav Kanagal, Joshua Ainslie, Jingchen Feng, Zach Pearson, Ebenezer Anjorin, Sudeep Gandhe, Ilya Eckstein, Jim Rosswog, Sumit Sanghai, Michael Pohl, Larry Adams, D. Sivakumar

The dialog understanding system consists of a deep-learned Contextual Language Understanding module, which interprets user utterances, and a primarily rules-based Dialog-State Tracker (DST), which updates the dialog state and formulates search requests intended for the fulfillment engine.

Management, slot-filling, +1
