no code implementations • ICML 2020 • Benjamin Coleman, Anshumali Shrivastava, Richard Baraniuk
We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset.
no code implementations • 7 May 2024 • Tianyi Zhang, Jonah Yi, Zhaozhuo Xu, Anshumali Shrivastava
We observe that distinct channels of a key/value activation embedding are highly inter-dependent, and the joint entropy of multiple channels grows at a slower rate than the sum of their marginal entropies.
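A toy numpy sketch of that observation (the channel construction and bin counts below are invented for illustration; this is not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=100_000)
edges = np.linspace(-2, 2, 15)
# Two strongly inter-dependent "channels" of a key/value embedding.
ch1 = np.digitize(base, edges)
ch2 = np.digitize(0.9 * base + 0.1 * rng.normal(size=base.size), edges)

def entropy_bits(counts):
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

h1 = entropy_bits(np.bincount(ch1))
h2 = entropy_bits(np.bincount(ch2))
h12 = entropy_bits(np.bincount(ch1 * 16 + ch2))  # joint histogram
print(f"H1 + H2 = {h1 + h2:.2f} bits, joint H12 = {h12:.2f} bits")
# The joint entropy sits well below the sum of marginals, which is what
# makes coupled multi-channel quantization of the KV cache attractive.
```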
1 code implementation • 2 Mar 2024 • Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava
Large language model inference on Central Processing Units (CPUs) is challenging due to the vast quantities of expensive Multiply-Add (MAD) matrix operations in the attention computations.
no code implementations • 21 Feb 2024 • Zichang Liu, Qingyun Liu, Yuening Li, Liang Liu, Anshumali Shrivastava, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao
Further, to accommodate the dissimilarity among the teachers in the committee, we introduce DiverseDistill, which allows the student to understand the expertise of each teacher and extract task knowledge.
no code implementations • 28 Dec 2023 • Tianyi Zhang, Haoteng Yin, Rongzhe Wei, Pan Li, Anshumali Shrivastava
We further show that any type of neighborhood overlap-based heuristic can be estimated by a neural network that takes Bloom signatures as input.
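A toy sketch of such an input representation (signature size, hash count, and the feature concatenation are all illustrative assumptions, not the paper's construction):

```python
import numpy as np

M, K = 256, 4  # signature bits and hash seeds (illustrative values)

def bloom_signature(neighbors):
    """Hash a node's neighbor set into a fixed-size bit signature."""
    sig = np.zeros(M, dtype=np.uint8)
    for v in neighbors:
        for seed in range(K):
            sig[hash((seed, v)) % M] = 1
    return sig

a = bloom_signature({1, 2, 3, 4, 5})
b = bloom_signature({4, 5, 6, 7})
# The AND of two signatures reflects common neighbors; a network fed
# [a, b, a & b] can learn overlap heuristics such as Adamic-Adar.
features = np.concatenate([a, b, a & b])
print("shared bits:", int((a & b).sum()), "feature dim:", features.size)
```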
no code implementations • 13 Dec 2023 • Bingcong Li, Shuai Zheng, Parameswaran Raman, Anshumali Shrivastava, Georgios B. Giannakis
On-device memory concerns in distributed deep learning have become severe due to (i) the growth of model size in multi-GPU training, and (ii) the wide adoption of deep neural networks for federated learning on IoT devices which have limited storage.
no code implementations • 22 Nov 2023 • Shabnam Daghaghi, Benjamin Coleman, Benito Geordie, Anshumali Shrivastava
To address this problem, we propose a novel sampling distribution based on nonparametric kernel regression that learns an effective importance score as the neural network trains.
1 code implementation • 3 Nov 2023 • Aditya Desai, Benjamin Meisburger, Zichang Liu, Anshumali Shrivastava
To include data from all devices in federated learning, we must enable collective training of embedding tables on devices with heterogeneous memory capacities.
1 code implementation • 26 Oct 2023 • Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen
We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.
no code implementations • 17 Oct 2023 • Aditya Desai, Anshumali Shrivastava
In this paper, we comprehensively assess the trade-off between memory and accuracy across RPS, pruning techniques, and building smaller models.
no code implementations • 23 Sep 2023 • Zhuang Wang, Zhaozhuo Xu, Anshumali Shrivastava, T. S. Eugene Ng
We then systematically explore the design space of communication schemes for sparse tensors and find the optimal one.
1 code implementation • 29 Aug 2023 • Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava
With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest.
no code implementations • 26 May 2023 • Benjamin Coleman, David Torres Ramos, Vihan Lakshman, Chen Luo, Anshumali Shrivastava
Lookup tables are a fundamental structure in many data processing and systems applications.
no code implementations • 17 May 2023 • Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava
Thus, optimizing this accuracy-efficiency trade-off is crucial for LLM deployment on commodity hardware.
2 code implementations • 30 Mar 2023 • Nicholas Meisburger, Vihan Lakshman, Benito Geordie, Joshua Engels, David Torres Ramos, Pratik Pranav, Benjamin Coleman, Benjamin Meisburger, Shubh Gupta, Yashwanth Adunukota, Tharun Medini, Anshumali Shrivastava
Efficient large-scale neural network training and inference on commodity CPU hardware is of immense practical significance in democratizing deep learning (DL) capabilities.
no code implementations • 10 Mar 2023 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
The current theoretical literature focuses on greedy search on the exact near-neighbor graph, while practitioners use the approximate near-neighbor graph (ANN-Graph) to reduce preprocessing time.
1 code implementation • 29 Dec 2022 • Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson
The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems.
no code implementations • 21 Jul 2022 • Aditya Desai, Anshumali Shrivastava
It contains 100 GB of embedding memory (25+ billion parameters).
1 code implementation • 21 Jul 2022 • Aditya Desai, Keren Zhou, Anshumali Shrivastava
Advancements in deep learning are often associated with increasing model sizes.
no code implementations • 29 Jan 2022 • Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava
We show that, with the communication reduced by sparsity, we can train a model with close to a billion parameters on simple 4-16-core CPU nodes connected by a basic low-bandwidth interconnect.
no code implementations • NeurIPS 2021 • Zhaozhuo Xu, Beidi Chen, Chaojian Li, Weiyang Liu, Le Song, Yingyan Lin, Anshumali Shrivastava
However, as one of the most influential and practical MT paradigms, iterative machine teaching (IMT) is prohibitively expensive on IoT devices due to its inefficient and unscalable algorithms.
no code implementations • NeurIPS 2021 • Aditya Desai, Zhaozhuo Xu, Menal Gupta, Anu Chandran, Antoine Vial-Aussavy, Anshumali Shrivastava
This paradigm breaks the SI into local inversion tasks, each of which predicts a small chunk of subsurface properties using the surrounding seismic data.
no code implementations • NeurIPS 2021 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
In this work, we focus on improving the per iteration cost of CGM.
no code implementations • 23 Oct 2021 • Zhenwei Dai, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Anshumali Shrivastava
Federated learning enables many local devices to train a deep learning model jointly without sharing the local data.
no code implementations • 29 Sep 2021 • Aditya Desai, Shashank Sonkar, Anshumali Shrivastava, Richard Baraniuk
Grounded in this framework, we show that many algorithms ranging across different domains are, in fact, searching for continuous stable coloring solutions of an underlying graph corresponding to the domain.
no code implementations • 29 Sep 2021 • Gaurav Gupta, Benjamin Coleman, John Chen, Anshumali Shrivastava
To this end, we propose STORM, an online sketching-based method for empirical risk minimization.
no code implementations • 4 Aug 2021 • Aditya Desai, Li Chou, Anshumali Shrivastava
In this paper, we present the Random Offset Block Embedding Array (ROBE) as a low-memory alternative to embedding tables, providing orders-of-magnitude reduction in memory usage while maintaining accuracy and boosting execution speed.
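A minimal sketch of the general idea of hashing lookups into one shared parameter array (array size, block layout, and the hash are illustrative stand-ins, not ROBE's exact construction):

```python
import numpy as np

ARRAY_SIZE, DIM = 10_000, 16  # tiny shared pool vs. a full V x DIM table
memory = np.random.default_rng(0).normal(
    scale=0.01, size=ARRAY_SIZE).astype(np.float32)

def hashed_embedding(token_id: int) -> np.ndarray:
    # Read a DIM-sized block starting at a hashed offset into the one
    # shared array, so memory is O(ARRAY_SIZE) for any vocabulary size.
    start = hash((token_id, 0)) % ARRAY_SIZE
    return memory[(start + np.arange(DIM)) % ARRAY_SIZE]

print(hashed_embedding(123_456_789).shape)  # (16,) for arbitrarily large IDs
```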
no code implementations • 21 Jun 2021 • Zichang Liu, Benjamin Coleman, Anshumali Shrivastava
Large machine learning models achieve unprecedented performance on various tasks and have become the go-to technique.
no code implementations • 15 Jun 2021 • Zhaozhuo Xu, Minghao Yan, Junyan Zhang, Anshumali Shrivastava
The dot-product self-attention in Transformers allows us to model interactions between words.
no code implementations • 18 May 2021 • Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
no code implementations • 17 Mar 2021 • Gaurav Gupta, Tharun Medini, Anshumali Shrivastava, Alexander J Smola
Neural models have transformed the fundamental information retrieval problem of mapping a query to a giant set of items.
2 code implementations • 6 Mar 2021 • Shabnam Daghaghi, Nicholas Meisburger, Mengnan Zhao, Yong Wu, Sameh Gobriel, Charlie Tai, Anshumali Shrivastava
Our work highlights several novel perspectives and opportunities for implementing randomized algorithms for deep learning on modern CPUs.
no code implementations • 26 Feb 2021 • Zhaozhuo Xu, Aditya Desai, Menal Gupta, Anu Chandran, Antoine Vial-Aussavy, Anshumali Shrivastava
We propose a fundamental shift away from convolutions and introduce SESDI, a Set Embedding-based SDI approach.
1 code implementation • 24 Feb 2021 • Aditya Desai, Yanzhou Pan, Kuangyuan Sun, Li Chou, Anshumali Shrivastava
In particular, our LMA embeddings achieve the same performance compared to standard embeddings with a 16$\times$ reduction in memory footprint.
no code implementations • 24 Feb 2021 • Aditya Desai, Benjamin Coleman, Anshumali Shrivastava
We introduce Density sketches (DS): a succinct online summary of the data distribution.
no code implementations • ICLR 2021 • Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re
Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.
no code implementations • 1 Jan 2021 • Shabnam Daghaghi, Tharun Medini, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava
Softmax classifiers with a very large number of classes naturally occur in many applications such as natural language processing and information retrieval.
no code implementations • 31 Dec 2020 • Shabnam Daghaghi, Tharun Medini, Nicholas Meisburger, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava
Unfortunately, due to the dynamically updated parameters and data samples, there is no sampling scheme that is provably adaptive and samples the negative classes efficiently.
no code implementations • NeurIPS 2020 • Zhenwei Dai, Anshumali Shrivastava
Recent work suggests improving the performance of the Bloom filter by incorporating a machine learning model as a binary classifier.
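A toy sketch of the learned-Bloom-filter construction being referenced (the classifier, threshold, and the plain set standing in for the backup Bloom filter are all illustrative):

```python
members = {"apple", "banana", "cherry"}
TAU = 0.5

def score(x):
    # Stand-in for a learned binary membership classifier.
    return 1.0 if x[0] in "ab" else 0.0

# A backup structure catches the model's false negatives ("cherry"); a
# real system uses a small Bloom filter here instead of a Python set.
backup = {x for x in members if score(x) < TAU}

def query(x):
    return score(x) >= TAU or x in backup  # never misses a true member

print(query("cherry"), query("durian"))  # True False
```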
no code implementations • 23 Nov 2020 • Zichang Liu, Li Chou, Anshumali Shrivastava
In this paper, we argue that state-of-the-art systems are significantly worse in terms of accuracy because they are incapable of utilizing this essential structural information.
1 code implementation • 29 Oct 2020 • Constantinos Chamzas, Zachary Kingston, Carlos Quintero-Peña, Anshumali Shrivastava, Lydia E. Kavraki
Earlier work has shown that reusing experience from prior motion planning problems can improve the efficiency of similar, future motion planning queries.
no code implementations • ICLR 2021 • Tharun Medini, Beidi Chen, Anshumali Shrivastava
The label vectors are random, sparse, and near-orthogonal by design, while the query vectors are learned and sparse.
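A quick numpy sketch of why random sparse sign vectors are near-orthogonal by design (dimension and sparsity are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
D, NNZ, N = 4096, 32, 1000  # dimension, nonzeros, number of labels

def random_label_vector():
    v = np.zeros(D)
    idx = rng.choice(D, size=NNZ, replace=False)
    v[idx] = rng.choice([-1.0, 1.0], size=NNZ)
    return v / np.sqrt(NNZ)  # unit norm

labels = np.stack([random_label_vector() for _ in range(N)])
gram = labels @ labels.T
off_diag = np.abs(gram[~np.eye(N, dtype=bool)])
print("max |cosine| between distinct labels:", off_diag.max())
# Overlaps are tiny, so a learned sparse query only needs to point
# toward its own label's few coordinates to be decodable.
```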
no code implementations • 5 Aug 2020 • Nicholas Meisburger, Anshumali Shrivastava
Finally, we show how our system scales to the tera-scale Criteo dataset with more than 4 billion samples.
no code implementations • 21 Jul 2020 • Louis Abraham, Gary Becigneul, Benjamin Coleman, Bernhard Scholkopf, Anshumali Shrivastava, Alexander Smola
Group testing is a well-studied problem with several appealing solutions, but recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
no code implementations • 2 Jul 2020 • Zichang Liu, Zhaozhuo Xu, Alan Ji, Jonathan Li, Beidi Chen, Anshumali Shrivastava
Efficient inference for wide output layers (WOLs) is an essential yet challenging task in large scale machine learning.
no code implementations • 25 Jun 2020 • Benjamin Coleman, Gaurav Gupta, John Chen, Anshumali Shrivastava
To this end, we propose STORM, an online sketch for empirical risk minimization.
no code implementations • 16 Jun 2020 • Benjamin Coleman, Anshumali Shrivastava
Existing methods for DP kernel density estimation scale poorly, often exponentially slower with an increase in dimensions.
no code implementations • 8 Jun 2020 • Sicong Liu, Junzhao Du, Anshumali Shrivastava, Lin Zhong
This work departs from prior works in methodology: we leverage adversarial learning to strike a better balance between privacy and utility.
no code implementations • 4 Dec 2019 • Benjamin Coleman, Anshumali Shrivastava
We evaluate our method on real-world high-dimensional datasets and show that our sketch achieves 10x better compression compared to competing methods.
no code implementations • ICML 2020 • Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar
We also find that AVH has a statistically significant correlation with human visual hardness.
2 code implementations • 2 Dec 2019 • Anastasios Kyrillidis, Anshumali Shrivastava, Moshe Y. Vardi, Zhiwei Zhang
By such a reduction to continuous optimization, we propose an algebraic framework for solving systems consisting of different types of constraints.
1 code implementation • NeurIPS 2019 • Beidi Chen, Yingchen Xu, Anshumali Shrivastava
In this paper, we break this barrier by providing the first demonstration of a scheme, Locality Sensitive Hashing (LSH)-sampled Stochastic Gradient Descent (LGD), which leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling.
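The requirement any such non-uniform sampler must satisfy is unbiasedness of the gradient estimate. The generic importance-sampling template is below (LSH supplies adaptive, similarity-aware $p_i$; the identity itself is standard, not the paper's derivation):

```latex
\[
\hat{g} \;=\; \frac{1}{|S|} \sum_{i \in S} \frac{\nabla f_i(\theta)}{N p_i},
\qquad
\mathbb{E}\,[\hat{g}] \;=\; \frac{1}{N} \sum_{i=1}^{N} \nabla f_i(\theta),
\]
```

where each index enters the sample $S$ with probability $p_i$.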
no code implementations • 30 Oct 2019 • Beidi Chen, Yingchen Xu, Anshumali Shrivastava
In this paper, we break this barrier by providing the first demonstration of a scheme, Locality Sensitive Hashing (LSH)-sampled Stochastic Gradient Descent (LGD), which leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling.
1 code implementation • NeurIPS 2019 • Tharun Medini, Qixuan Huang, Yiqiu Wang, Vijai Mohan, Anshumali Shrivastava
Our largest model has 6.4 billion parameters and trains in less than 35 hours on a single p3.16x machine.
1 code implementation • 10 Oct 2019 • Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava
Interestingly, it is a count-min-sketch-style arrangement of a membership-testing utility (a Bloom filter, in our case).
1 code implementation • 7 Oct 2019 • Gaurav Gupta, Benjamin Coleman, Tharun Medini, Vijai Mohan, Anshumali Shrivastava
A simple array of Bloom Filters can achieve that.
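A toy sketch of such an array, with one small Bloom filter per document (sizes, hash counts, and the k-mer example are illustrative):

```python
import numpy as np

M, K = 1024, 3  # bits per filter and hash seeds (illustrative)

class Bloom:
    def __init__(self):
        self.bits = np.zeros(M, dtype=bool)
    def add(self, item):
        for seed in range(K):
            self.bits[hash((seed, item)) % M] = True
    def query(self, item):
        return all(self.bits[hash((seed, item)) % M] for seed in range(K))

docs = {"doc0": ["ACGT", "CGTA"], "doc1": ["CGTA", "GTAC"]}
filters = {name: Bloom() for name in docs}
for name, kmers in docs.items():
    for kmer in kmers:
        filters[name].add(kmer)

# Membership search over the collection = probe every filter in the array.
print([name for name, f in filters.items() if f.query("CGTA")])
```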
no code implementations • 10 Sep 2019 • Shabnam Daghaghi, Tharun Medini, Anshumali Shrivastava
Zero-Shot Learning (ZSL) is a classification task where we do not have even a single labeled training example from the set of unseen classes.
1 code implementation • 23 Aug 2019 • John Chen, Ben Coleman, Anshumali Shrivastava
We show, both theoretically and empirically, that our proposed solution is significantly superior for load balancing and is optimal in many senses.
no code implementations • 20 Mar 2019 • Constantinos Chamzas, Anshumali Shrivastava, Lydia E. Kavraki
In this work, we decompose the workspace into local primitives, memorizing local experiences by these primitives in the form of local samplers, and store them in a database.
3 code implementations • 7 Mar 2019 • Beidi Chen, Tharun Medini, James Farwell, Sameh Gobriel, Charlie Tai, Anshumali Shrivastava
On the same CPU hardware, SLIDE is over 10x faster than TF.
no code implementations • 18 Feb 2019 • Benjamin Coleman, Richard G. Baraniuk, Anshumali Shrivastava
We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset.
1 code implementation • 1 Feb 2019 • Ryan Spring, Anastasios Kyrillidis, Vijai Mohan, Anshumali Shrivastava
The problem is becoming more severe as deep learning models continue to grow larger in order to learn from complex, large-scale datasets.
no code implementations • ICLR 2019 • Sicong Liu, Anshumali Shrivastava, Junzhao Du, Lin Zhong
This work represents a methodical departure from prior works: we balance between a measure of privacy and another of utility by leveraging adversarial learning to find a sweeter tradeoff.
1 code implementation • NeurIPS 2018 • Ankush Mandal, He Jiang, Anshumali Shrivastava, Vivek Sarkar
In particular, for identifying top-K frequent items, the Count-Min Sketch (CMS) has a fantastic update time but lacks the important property of reducibility, which is needed to exploit the available massive data parallelism.
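A minimal sketch of the reducibility property in question: two Count-Min Sketches built on different shards merge by element-wise addition (dimensions and hash are illustrative):

```python
import numpy as np

ROWS, COLS = 4, 272  # illustrative sketch dimensions

def update(table, item, count=1):
    for r in range(ROWS):
        table[r, hash((r, item)) % COLS] += count

def query(table, item):
    return min(table[r, hash((r, item)) % COLS] for r in range(ROWS))

shard_a = np.zeros((ROWS, COLS), dtype=np.int64)
shard_b = np.zeros((ROWS, COLS), dtype=np.int64)
for w in ["x", "x", "y"]:
    update(shard_a, w)
for w in ["x", "z"]:
    update(shard_b, w)

merged = shard_a + shard_b    # the entire reduction step
print(query(merged, "x"))     # 3 (an overestimate only under collisions)
```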
no code implementations • 11 Oct 2018 • Rebecca C. Steorts, Anshumali Shrivastava
Entity resolution seeks to merge databases so as to remove duplicate entries where unique identifiers are typically unknown.
no code implementations • 9 Oct 2018 • Qixuan Huang, Yiqiu Wang, Tharun Medini, Anshumali Shrivastava
With MACH, we can train on the ODP dataset with 100,000 classes and 400,000 features on a single Titan X GPU, reaching a classification accuracy of 19.28%, the best reported accuracy on this dataset.
no code implementations • 27 Sep 2018 • Tharun Medini, Anshumali Shrivastava
Imitation Learning is the task of mimicking the behavior of an expert player in a Reinforcement Learning (RL) environment to enhance the training of a fresh agent (called a novice) that begins from scratch.
1 code implementation • ICML 2018 • Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk
We demonstrate that MISSION accurately and efficiently performs feature selection on real-world, large-scale datasets with billions of dimensions.
1 code implementation • 12 Jun 2018 • Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk
We demonstrate that MISSION accurately and efficiently performs feature selection on real-world, large-scale datasets with billions of dimensions.
no code implementations • 21 Feb 2018 • Chen Luo, Anshumali Shrivastava
It is well known that state-of-the-art methods for split-merge MCMC do not scale well.
no code implementations • ICLR 2018 • Qixuan Huang, Anshumali Shrivastava, Yiqiu Wang
MACH is the first generic $K$-classification algorithm, with provably theoretical guarantees, which requires $O(\log{K})$ memory without any assumption on the relationship between classes.
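As a rough illustration of what an $O(\log K)$ memory footprint buys (the sizes below are hypothetical, not from the paper):

```latex
% Hypothetical sizes: K = 10^5 classes, d = 10^3 features,
% B = 10^3 hash buckets, R = 20 = O(log K) repetitions.
\[
\underbrace{K d}_{\text{one-vs-all softmax}} = 10^5 \cdot 10^3 = 10^8
\quad\text{vs.}\quad
\underbrace{B R d}_{\text{hashed meta-classifiers}}
  = 10^3 \cdot 20 \cdot 10^3 = 2 \times 10^7 .
\]
```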
no code implementations • ICLR 2018 • Beidi Chen, Yingchen Xu, Anshumali Shrivastava
In this paper, we break this barrier by providing the first demonstration of a sampling scheme that leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling.
no code implementations • 25 Aug 2017 • Chen Luo, Zhengzhang Chen, Lu-An Tang, Anshumali Shrivastava, Zhichun Li
Given a well-trained dependency graph from a source domain and an immature dependency graph from a target domain, how can we extract the entity and dependency knowledge from the source to enhance the target?
no code implementations • 20 Jun 2017 • Chen Luo, Anshumali Shrivastava
In the big-data world, existing methods fail to address the new set of memory and latency constraints.
1 code implementation • 15 Mar 2017 • Ryan Spring, Anshumali Shrivastava
We propose a new sampling scheme and an unbiased estimator that estimates the partition function accurately in sub-linear time.
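The unbiasedness template behind such estimators (the paper's sampler is hashing-based; this shows only the generic importance-sampling identity):

```latex
\[
Z \;=\; \sum_{x} f(x)
\;=\; \mathbb{E}_{x \sim q}\!\left[ \frac{f(x)}{q(x)} \right]
\quad\Longrightarrow\quad
\hat{Z} \;=\; \frac{1}{n} \sum_{i=1}^{n} \frac{f(x_i)}{q(x_i)},
\qquad \mathbb{E}\,[\hat{Z}] = Z .
\]
```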
1 code implementation • ICML 2017 • Anshumali Shrivastava
Minwise hashing is fundamental and one of the most successful hashing algorithms in the literature.
no code implementations • 6 Dec 2016 • M. Sadegh Riazi, Beidi Chen, Anshumali Shrivastava, Dan Wallach, Farinaz Koushanfar
In Near-Neighbor Search (NNS), a new client queries a database (held by a server) for the most similar data (near-neighbors) given a certain similarity metric.
no code implementations • 6 Dec 2016 • Beidi Chen, Anshumali Shrivastava
WTA (Winner Take All) hashing has been successfully applied in many large scale vision applications.
no code implementations • NeurIPS 2016 • Anshumali Shrivastava
Weighted minwise hashing (WMH) is one of the fundamental subroutines required by many celebrated approximation algorithms, and it is commonly adopted in industrial practice for large-scale search and learning.
1 code implementation • 24 Oct 2016 • Chen Luo, Anshumali Shrivastava
However, branch-and-bound-based pruning is only useful for very short queries (low-dimensional time series), and the bounds are quite weak for longer queries.
no code implementations • 26 Feb 2016 • Ryan Spring, Anshumali Shrivastava
A unique property of the proposed hashing based back-propagation is that the updates are always sparse.
no code implementations • 21 Feb 2016 • Ping Li, Michael Mitzenmacher, Anshumali Shrivastava
In this paper, we focus on a simple 2-bit coding scheme.
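A hedged sketch of what a 2-bit coding of random projections can look like (the threshold placement is an illustrative choice, not necessarily the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, w = 128, 64, 0.75  # input dim, projections, region width

R = rng.normal(size=(d, m))  # random projection matrix

def code_2bit(x):
    z = x @ R
    # Four regions (-inf,-w), [-w,0), [0,w), [w,inf) -> codes 0..3,
    # i.e. exactly 2 bits per projection.
    return np.digitize(z, bins=[-w, 0.0, w]).astype(np.uint8)

x = rng.normal(size=d)
y = x + 0.1 * rng.normal(size=d)  # a near neighbor of x
print("fraction of matching codes:", (code_2bit(x) == code_2bit(y)).mean())
```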
1 code implementation • 14 Nov 2014 • Anshumali Shrivastava, Ping Li
Minwise hashing (Minhash) is a widely popular indexing scheme in practice.
no code implementations • 20 Oct 2014 • Anshumali Shrivastava, Ping Li
In the prior work, the authors use asymmetric transformations which convert the problem of approximate MIPS into the problem of approximate near neighbor search which can be efficiently solved using hashing.
no code implementations • 16 Jul 2014 • Anshumali Shrivastava, Ping Li
To provide a common basis for comparison, we evaluate retrieval results in terms of $\mathcal{S}$ for both MinHash and SimHash.
1 code implementation • 18 Jun 2014 • Anshumali Shrivastava, Ping Li
The existing work on densification of one permutation hashing reduces the query processing cost of the $(K, L)$-parameterized Locality Sensitive Hashing (LSH) algorithm with minwise hashing, from $O(dKL)$ to merely $O(d + KL)$, where $d$ is the number of nonzeros of the data vector, $K$ is the number of hashes in each hash table, and $L$ is the number of hash tables.
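A worked instance of that cost reduction (the values are illustrative):

```latex
\[
d = 10^3,\; K = 10,\; L = 10^2:\qquad
O(dKL) = 10^3 \cdot 10 \cdot 10^2 = 10^6
\;\longrightarrow\;
O(d + KL) = 10^3 + 10^3 = 2 \times 10^3 .
\]
```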
no code implementations • NeurIPS 2014 • Anshumali Shrivastava, Ping Li
Our proposal is based on an interesting mathematical phenomenon in which inner products, after independent asymmetric transformations, can be converted into the problem of approximate near neighbor search.
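A sketch of one well-known asymmetric transformation of this kind (a simplified variant for illustration, not necessarily the paper's exact construction):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))  # database
q = rng.normal(size=32)          # query

sq = (X ** 2).sum(axis=1)
pad = np.sqrt(np.maximum(sq.max() - sq, 0.0))  # clip tiny negatives
P = np.hstack([X, pad[:, None]])  # asymmetric data transform
Q = np.append(q, 0.0)             # asymmetric query transform

# ||P(x) - Q(q)||^2 = max_norm^2 + ||q||^2 - 2 <x, q>, so the nearest
# neighbor of Q(q) among P(X) is exactly the maximum-inner-product point.
assert np.argmax(X @ q) == np.argmin(((P - Q) ** 2).sum(axis=1))
print("MIPS answer:", int(np.argmax(X @ q)))
```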
no code implementations • 21 Apr 2014 • Anshumali Shrivastava, Ping Li
We propose a representation of a graph as a functional object derived from the power iteration of the underlying adjacency matrix.
no code implementations • 17 Apr 2014 • Anshumali Shrivastava, Ping Li
We show that the proposed matrix representation encodes the spectrum of the underlying adjacency matrix and it also contains information about the counts of small sub-structures present in the graph such as triangles and small paths.
no code implementations • 31 Mar 2014 • Ping Li, Michael Mitzenmacher, Anshumali Shrivastava
This technical note compares two coding (quantization) schemes for random projections in the context of sub-linear time approximate near neighbor search.
no code implementations • NeurIPS 2013 • Anshumali Shrivastava, Ping Li
We go beyond the notion of pairwise similarity and look into search problems with $k$-way similarity functions.
no code implementations • 9 Aug 2013 • Ping Li, Michael Mitzenmacher, Anshumali Shrivastava
The method of random projections has become very popular for large-scale applications in statistical learning, information retrieval, bio-informatics and other applications.
no code implementations • NeurIPS 2011 • Ping Li, Anshumali Shrivastava, Joshua L. Moore, Arnd C. König
Minwise hashing is a standard technique in the context of search for efficiently computing set similarities.
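A compact sketch of the technique (hash count illustrative): the fraction of matching minwise hashes is an unbiased estimator of the Jaccard similarity of two sets.

```python
import numpy as np

N_HASHES = 200  # illustrative sketch length

def minhash(s):
    return np.array([min(hash((seed, x)) for x in s)
                     for seed in range(N_HASHES)])

A = set(range(0, 80))
B = set(range(20, 100))
estimate = (minhash(A) == minhash(B)).mean()
exact = len(A & B) / len(A | B)
print(f"estimated Jaccard {estimate:.2f} vs exact {exact:.2f}")
```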