Search Results for author: Jonah Yi

Found 2 papers, 1 paper with code

KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization

no code implementations • 7 May 2024 • Tianyi Zhang, Jonah Yi, Zhaozhuo Xu, Anshumali Shrivastava

We observe that distinct channels of a key/value activation embedding are highly inter-dependent, and the joint entropy of multiple channels grows at a slower rate than the sum of their marginal entropies.

Language Modelling, Large Language Model +1

CAPS: A Practical Partition Index for Filtered Similarity Search

1 code implementation • 29 Aug 2023 • Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava

With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest.

Representation Learning
