1 code implementation • 4 Jul 2023 • Tal Wagner, Yonatan Naamad, Nina Mishra
We study efficient mechanisms for differentially private kernel density estimation (DP-KDE).
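The mechanism studied in the paper is more involved, but the basic object is easy to illustrate. Below is a minimal sketch (not the paper's algorithm) of answering a single KDE query privately: with a kernel bounded in [0, 1], one data point changes the normalized KDE value by at most 1/n, so adding Laplace noise of scale 1/(n·ε) makes one query ε-DP. The function names `kde` and `dp_kde_query` are illustrative, not from the paper.

```python
import math
import random

def kde(data, q, bandwidth=1.0):
    """Normalized Gaussian KDE value at query q; each term lies in [0, 1]."""
    n = len(data)
    return sum(math.exp(-((x - q) ** 2) / (2 * bandwidth ** 2)) for x in data) / n

def dp_kde_query(data, q, eps, bandwidth=1.0, rng=random):
    """One eps-DP KDE query via the Laplace mechanism (sensitivity 1/n)."""
    scale = 1.0 / (len(data) * eps)
    # A Laplace(scale) sample is the difference of two Exp(scale) samples.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return kde(data, q, bandwidth) + noise

data = [random.gauss(0, 1) for _ in range(1000)]
answer = dp_kde_query(data, 0.0, eps=0.5)
```

Note that this naive approach pays the privacy cost per query; the point of efficient DP-KDE mechanisms is to build one private data structure that answers many queries.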
no code implementations • 15 Apr 2023 • Nicholas Schiefer, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal, Tal Wagner
An $\varepsilon$-approximate quantile sketch over a stream of $n$ inputs approximates the rank of any query point $q$ - that is, the number of input points less than $q$ - up to an additive error of $\varepsilon n$, typically with probability at least $1 - 1/\mathrm{poly}(n)$, while consuming $o(n)$ space.
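The guarantee above is easy to state concretely. Here is a toy illustration (not the paper's sketch) of the rank function and the additive-error condition an $\varepsilon$-approximate sketch must satisfy; the `rank` helper is hypothetical:

```python
import random

def rank(points, q):
    """Exact rank of q: the number of input points strictly less than q."""
    return sum(1 for x in points if x < q)

eps = 0.05
n = 1000
stream = [random.random() for _ in range(n)]

q = 0.5
true_rank = rank(stream, q)
# Any sketch answer within eps*n of the true rank is acceptable.
pretend_sketch_answer = true_rank + 17
assert abs(pretend_sketch_answer - true_rank) <= eps * n
```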
no code implementations • 6 Nov 2022 • Anders Aamand, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Nicholas Schiefer, Sandeep Silwal, Tal Wagner
However, those simulations involve neural networks for the 'combine' function of size polynomial or even exponential in the number of graph nodes $n$, as well as feature vectors of length linear in $n$.
no code implementations • 24 Oct 2022 • David Alvarez-Melis, Nicolò Fusi, Lester Mackey, Tal Wagner
Optimal Transport (OT) is a fundamental tool for comparing probability distributions, but its exact computation remains prohibitive for large datasets.
no code implementations • 16 Jun 2022 • Peter Bartlett, Piotr Indyk, Tal Wagner
Our techniques are general, and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods.
1 code implementation • 9 Jun 2022 • Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner
We study how the trained models eventually succeed at the task, and in particular, we manage to understand some of the attention heads as well as how the information flows in the network.
no code implementations • ICLR 2022 • Justin Y. Chen, Talya Eden, Piotr Indyk, Honghao Lin, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner, David P. Woodruff, Michael Zhang
We propose data-driven one-pass streaming algorithms for estimating the number of triangles and four cycles, two fundamental problems in graph analytics that are widely studied in the graph data stream literature.
no code implementations • NeurIPS 2021 • Piotr Indyk, Tal Wagner, David Woodruff
Recently, data-driven and learning-based algorithms for low rank matrix approximation were shown to outperform classical data-oblivious algorithms by wide margins in terms of accuracy.
no code implementations • ICLR 2021 • Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner
We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements.
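As a point of reference for the problem setup (this is a naive baseline, not the paper's learning-based estimator): the number of distinct values appearing in a uniform sample is always a lower bound on the true support size. The helper name `distinct_in_sample` is illustrative.

```python
import random

def distinct_in_sample(data, m, seed=0):
    """Count distinct values in m uniform draws from data (a support-size lower bound)."""
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in range(m)]
    return len(set(sample))

data = [i % 100 for i in range(10_000)]   # true support size = 100
estimate = distinct_in_sample(data, 500)
assert estimate <= 100
```

The gap between this lower bound and the truth, especially for rare elements never hit by the sample, is exactly what makes the estimation problem hard.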
no code implementations • 16 Feb 2021 • Arturs Backurs, Piotr Indyk, Cameron Musco, Tal Wagner
In particular, we consider estimating the sum of kernel matrix entries, along with its top eigenvalue and eigenvector.
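The quantities in question can be computed exactly at small scale (the paper's contribution is estimating them in sublinear time). A sketch with a Gaussian kernel, using power iteration for the top eigenpair; all function names here are illustrative:

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def kernel_matrix(points, sigma=1.0):
    return [[gaussian_kernel(a, b, sigma) for b in points] for a in points]

def power_iteration(K, iters=200):
    """Top eigenvalue and eigenvector of a symmetric PSD matrix K."""
    n = len(K)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))   # ||Kv|| -> top eigenvalue
        v = [x / lam for x in w]
    return lam, v

pts = [0.0, 0.5, 1.0, 5.0]
K = kernel_matrix(pts)
entry_sum = sum(sum(row) for row in K)
top_eigenvalue, top_eigenvector = power_iteration(K)
```

Both computations above touch all $n^2$ entries; sublinear-time estimation avoids materializing the kernel matrix at all.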
1 code implementation • NeurIPS 2019 • Arturs Backurs, Piotr Indyk, Tal Wagner
We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property.
1 code implementation • ICML 2020 • Arturs Backurs, Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner
Our extensive experiments, on real-world text and image datasets, show that Flowtree improves over various baselines and existing methods in either running time or accuracy.
no code implementations • 2 Jun 2019 • Piotr Indyk, Ali Vakilian, Tal Wagner, David Woodruff
Recent work by Bakshi and Woodruff (NeurIPS 2018) showed it is possible to compute a rank-$k$ approximation of a distance matrix in time $O((n+m)^{1+\gamma}) \cdot \mathrm{poly}(k, 1/\epsilon)$, where $\epsilon>0$ is an error parameter and $\gamma>0$ is an arbitrarily small constant.
1 code implementation • 10 Feb 2019 • Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, Tal Wagner
In the fair variant of $k$-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color.
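The two parts of the fair objective, the k-median cost and the color-balance constraint, can be checked directly. A minimal illustration (hypothetical helpers on 1-D points, not the paper's algorithm), where "approximately equal" is modeled as per-cluster color counts differing by at most a slack of 1:

```python
from collections import Counter

def kmedian_cost(points, centers, assign):
    """Sum of distances from each point to its assigned center (1-D for simplicity)."""
    return sum(abs(p - centers[assign[i]]) for i, p in enumerate(points))

def is_balanced(colors, assign, k, slack=1):
    """True if each cluster's color counts differ by at most `slack`."""
    for c in range(k):
        counts = Counter(col for col, a in zip(colors, assign) if a == c)
        if counts and max(counts.values()) - min(counts.values()) > slack:
            return False
    return True

points  = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
colors  = ['r', 'b', 'r', 'b', 'r', 'b']
assign  = [0, 0, 0, 1, 1, 1]
centers = [0.1, 5.1]

cost = kmedian_cost(points, centers, assign)
assert is_balanced(colors, assign, k=2)
```

The difficulty in the fair variant is that the balance constraint can force points into far-away clusters, so the two objectives cannot be optimized independently.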
1 code implementation • ICLR 2020 • Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner
Space partitions of $\mathbb{R}^d$ underlie a vast and important class of fast nearest neighbor search (NNS) algorithms.
no code implementations • ICML 2018 • Tal Wagner, Sudipto Guha, Shiva Kasiviswanathan, Nina Mishra
We consider the problem of labeling points on a fast-moving data stream when only a small number of labeled examples are available.
no code implementations • NeurIPS 2017 • Noga Alon, Daniel Reichman, Igor Shinkar, Tal Wagner, Sebastian Musslick, Jonathan D. Cohen, Tom Griffiths, Biswadip Dey, Kayhan Ozcimder
A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations.
no code implementations • NeurIPS 2017 • Piotr Indyk, Ilya Razenshteyn, Tal Wagner
We introduce a new distance-preserving compact representation of multi-dimensional point-sets.
no code implementations • NeurIPS 2012 • Koby Crammer, Tal Wagner
We introduce a large-volume box classification for binary prediction, which maintains a subset of weight vectors, and specifically axis-aligned boxes.