Search Results for author: Vatsal Sharan

Found 25 papers, 4 papers with code

Simplicity Bias of Transformers to Learn Low Sensitivity Functions

no code implementations 11 Mar 2024 Bhavya Vasudeva, Deqing Fu, Tianyi Zhou, Elliott Kau, Youqi Huang, Vatsal Sharan

Transformers achieve state-of-the-art accuracy and robustness across many tasks, but an understanding of their inductive biases, and of how those biases differ from those of other neural network architectures, remains elusive.

Learnability is a Compact Property

no code implementations 15 Feb 2024 Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng

Furthermore, the learnability of such problems can fail to be a property of finite character: informally, it cannot be detected by examining finite projections of the problem.

Learning Theory

Stability and Multigroup Fairness in Ranking with Uncertain Predictions

no code implementations 14 Feb 2024 Siddartha Devic, Aleksandra Korolova, David Kempe, Vatsal Sharan

However, when predictors trained for classification tasks have intrinsic uncertainty, it is not obvious how this uncertainty should be represented in the derived rankings.

Fairness

Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models

no code implementations 26 Oct 2023 Deqing Fu, Tian-Qi Chen, Robin Jia, Vatsal Sharan

In this paper, we instead demonstrate that Transformers learn to implement higher-order optimization methods to perform ICL.
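
The study focuses on linear models, where a canonical higher-order method is iterative Newton (Newton-Schulz) for inverting $X^\top X$. The sketch below is only a reference implementation of that classical iteration, not the paper's Transformer construction; the function name and iteration count are illustrative.

    import numpy as np

    def newton_least_squares(X, y, iters=10):
        # Approximate w = (X^T X)^{-1} X^T y with Newton-Schulz iterations
        # M_{t+1} = M_t (2I - A M_t), which converge quadratically to A^{-1}.
        A = X.T @ X
        d = A.shape[0]
        # Standard safe initialization: M_0 = A^T / (||A||_1 ||A||_inf).
        M = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
        I = np.eye(d)
        for _ in range(iters):
            M = M @ (2 * I - A @ M)
        return M @ (X.T @ y)

    # Sanity check against the closed-form least-squares solution.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5)
    assert np.allclose(newton_least_squares(X, y),
                       np.linalg.lstsq(X, y, rcond=None)[0], atol=1e-6)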

In-Context Learning

Mitigating Simplicity Bias in Deep Learning for Improved OOD Generalization and Robustness

1 code implementation 9 Oct 2023 Bhavya Vasudeva, Kameron Shahabi, Vatsal Sharan

Neural networks (NNs) are known to exhibit simplicity bias where they tend to prefer learning 'simple' features over more 'complex' ones, even when the latter may be more informative.

Fairness

Regularization and Optimal Multiclass Learning

no code implementations 24 Sep 2023 Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng

We demonstrate that an agnostic version of the Hall complexity again characterizes error rates exactly, and exhibit an optimal learner using maximum entropy programs.

Transductive Learning

Fairness in Matching under Uncertainty

no code implementations 8 Feb 2023 Siddartha Devic, David Kempe, Vatsal Sharan, Aleksandra Korolova

The prevalence and importance of algorithmic two-sided marketplaces have drawn attention to the issue of fairness in such settings.

Fairness

Efficient Convex Optimization Requires Superlinear Memory

no code implementations 29 Mar 2022 Annie Marsden, Vatsal Sharan, Aaron Sidford, Gregory Valiant

We show that any memory-constrained, first-order algorithm which minimizes $d$-dimensional, $1$-Lipschitz convex functions over the unit ball to $1/\mathrm{poly}(d)$ accuracy using at most $d^{1.25 - \delta}$ bits of memory must make at least $\tilde{\Omega}(d^{1 + (4/3)\delta})$ first-order queries (for any constant $\delta \in [0, 1/4]$).

KL Divergence Estimation with Multi-group Attribution

1 code implementation 28 Feb 2022 Parikshit Gopalan, Nina Narodytska, Omer Reingold, Vatsal Sharan, Udi Wieder

Estimating the Kullback-Leibler (KL) divergence between two distributions given samples from them is well-studied in machine learning and information theory.
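
For context, the naive baseline for this problem on a finite alphabet is the plug-in estimator sketched below; it is not the multi-group attribution estimator developed in the paper, and the smoothing constant is an illustrative choice to keep the log-ratios finite.

    import numpy as np
    from collections import Counter

    def plugin_kl(samples_p, samples_q, alphabet, smoothing=0.5):
        # Naive estimate of KL(P || Q) from smoothed empirical frequencies.
        def empirical(samples):
            counts = Counter(samples)
            total = len(samples) + smoothing * len(alphabet)
            return {x: (counts[x] + smoothing) / total for x in alphabet}
        p, q = empirical(samples_p), empirical(samples_q)
        return sum(p[x] * np.log(p[x] / q[x]) for x in alphabet)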

Fairness

On the Statistical Complexity of Sample Amplification

no code implementations 12 Jan 2022 Brian Axelrod, Shivam Garg, Yanjun Han, Vatsal Sharan, Gregory Valiant

In this work, we place the sample amplification problem on a firm statistical foundation by deriving generally applicable amplification procedures, lower bound techniques and connections to existing statistical notions.

Big-Step-Little-Step: Efficient Gradient Methods for Objectives with Multiple Scales

no code implementations 4 Nov 2021 Jonathan Kelner, Annie Marsden, Vatsal Sharan, Aaron Sidford, Gregory Valiant, Honglin Yuan

We consider the problem of minimizing a function $f : \mathbb{R}^d \rightarrow \mathbb{R}$ which is implicitly decomposable as the sum of $m$ unknown non-interacting smooth, strongly convex functions, and we provide a method which solves this problem with a number of gradient evaluations that scales (up to logarithmic factors) as the product of the square roots of the condition numbers of the components.

Omnipredictors

no code implementations 11 Sep 2021 Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, Udi Wieder

We suggest a rigorous new paradigm for loss minimization in machine learning where the loss function can be ignored at the time of learning and only be taken into account when deciding an action.

Fairness

Multicalibrated Partitions for Importance Weights

no code implementations 10 Mar 2021 Parikshit Gopalan, Omer Reingold, Vatsal Sharan, Udi Wieder

We significantly strengthen previous work based on the MaxEntropy approach, which defines the importance weights via a distribution $Q$ that is closest to $P$ and looks the same as $R$ on every set $C \in \mathcal{C}$, where $\mathcal{C}$ may be a huge collection of sets.

Anomaly Detection · Domain Adaptation

Sample Amplification: Increasing Dataset Size even when Learning is Impossible

no code implementations ICML 2020 Brian Axelrod, Shivam Garg, Vatsal Sharan, Gregory Valiant

In the Gaussian case, we show that an $\left(n, n+\Theta\left(\frac{n}{\sqrt{d}}\right)\right)$ amplifier exists, even though learning the distribution to small constant total variation distance requires $\Theta(d)$ samples.
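
For intuition, the most natural amplifier in this setting is "learn then generate": append fresh draws from a Gaussian centered at the empirical mean. The sketch below (which assumes identity covariance for simplicity) only illustrates the idea; the paper's procedures and the $\Theta(n/\sqrt{d})$ guarantee are more delicate than this heuristic.

    import numpy as np

    def naive_gaussian_amplifier(samples, n_extra, seed=0):
        # "Learn then generate": fit the mean, then append fresh draws from
        # N(mean, I). Illustrative only; not the paper's optimal amplifier.
        mean = samples.mean(axis=0)
        d = samples.shape[1]
        extra = mean + np.random.default_rng(seed).normal(size=(n_extra, d))
        return np.vstack([samples, extra])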


Memory-Sample Tradeoffs for Linear Regression with Small Error

no code implementations 18 Apr 2019 Vatsal Sharan, Aaron Sidford, Gregory Valiant

We consider the problem of performing linear regression over a stream of $d$-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints.

Regression

A Spectral View of Adversarially Robust Features

no code implementations NeurIPS 2018 Shivam Garg, Vatsal Sharan, Brian Hu Zhang, Gregory Valiant

This connection can be leveraged to provide both robust features, and a lower bound on the robustness of any function that has significant variance across the dataset.
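
As a rough illustration of the spectral construction, the sketch below builds a distance-threshold graph on the dataset and uses low-order eigenvectors of its Laplacian as features. The graph choice and threshold are assumptions made for illustration; the paper gives the precise construction and its robustness guarantees.

    import numpy as np

    def spectral_features(X, threshold, n_features=2):
        # Connect points within `threshold` of each other, form the unnormalized
        # graph Laplacian, and return its lowest non-trivial eigenvectors.
        dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        W = (dists <= threshold).astype(float)
        np.fill_diagonal(W, 0.0)
        L = np.diag(W.sum(axis=1)) - W
        _, eigvecs = np.linalg.eigh(L)          # eigenvalues in ascending order
        return eigvecs[:, 1:1 + n_features]     # skip the constant eigenvector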

Efficient Anomaly Detection via Matrix Sketching

no code implementations NeurIPS 2018 Vatsal Sharan, Parikshit Gopalan, Udi Wieder

We consider the problem of finding anomalies in high-dimensional data using popular PCA-based anomaly scores.
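
Two popular PCA-based scores here are the rank-$k$ leverage score (mass inside the top-$k$ principal subspace, rescaled by the singular values) and the rank-$k$ projection distance (residual outside that subspace). The sketch below computes both exactly via a full SVD; the point of the paper is that they can be approximated from small matrix sketches, which this snippet does not do.

    import numpy as np

    def pca_anomaly_scores(A, k):
        # Exact rank-k leverage scores and projection distances for the rows of A.
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        proj = A @ Vt[:k].T                      # coordinates in the top-k subspace
        leverage = np.sum((proj / S[:k]) ** 2, axis=1)
        residual = np.sum(A ** 2, axis=1) - np.sum(proj ** 2, axis=1)
        return leverage, residual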

Anomaly Detection

Sketching Linear Classifiers over Data Streams

1 code implementation 7 Nov 2017 Kai Sheng Tai, Vatsal Sharan, Peter Bailis, Gregory Valiant

We introduce a new sub-linear space sketch---the Weight-Median Sketch---for learning compressed linear classifiers over data streams while supporting the efficient recovery of large-magnitude weights in the model.
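
A rough, hedged sketch of the idea (not the paper's exact construction, scaling, or guarantees): learn a linear classifier on a count-sketch projection of the feature space, and estimate any original weight by taking a median of its signed buckets across the hash rows. All class and parameter names below are illustrative.

    import numpy as np

    class WeightMedianLikeSketch:
        def __init__(self, dim, rows=5, width=256, lr=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.bucket = rng.integers(0, width, size=(rows, dim))  # hash h_r(j)
            self.sign = rng.choice([-1.0, 1.0], size=(rows, dim))   # sign s_r(j)
            self.z = np.zeros((rows, width))                        # sketched weights
            self.rows, self.lr = rows, lr

        def _project(self, x):
            # Count-sketch projection of a dense feature vector x.
            xs = np.zeros_like(self.z)
            for r in range(self.rows):
                np.add.at(xs[r], self.bucket[r], self.sign[r] * x)
            return xs / np.sqrt(self.rows)

        def update(self, x, y):
            # One online logistic-regression step in the sketched space, y in {-1, +1}.
            xs = self._project(x)
            margin = y * np.sum(self.z * xs)
            self.z += self.lr * y * xs / (1.0 + np.exp(margin))

        def weight(self, j):
            # Median-across-rows estimate of the j-th weight of the linear model.
            vals = [np.sqrt(self.rows) * self.sign[r, j] * self.z[r, self.bucket[r, j]]
                    for r in range(self.rows)]
            return float(np.median(vals))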

Feature Selection

Learning Overcomplete HMMs

no code implementations NeurIPS 2017 Vatsal Sharan, Sham Kakade, Percy Liang, Gregory Valiant

On the other hand, we show that learning is impossible given only a polynomial number of samples for HMMs with a small output alphabet and whose transition matrices are random regular graphs with large degree.

Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use

no code implementations ICML 2017 Vatsal Sharan, Gregory Valiant

The popular Alternating Least Squares (ALS) algorithm for tensor decomposition is efficient and easy to implement, but often converges to poor local optima---particularly when the weights of the factors are non-uniform.
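
For reference, plain CP-ALS for a third-order tensor is sketched below (random initialization, fixed iteration count). The Orth-ALS variant from the paper additionally orthogonalizes the factor estimates between iterations, which this baseline deliberately omits.

    import numpy as np

    def khatri_rao(U, V):
        # Column-wise Kronecker product: (U ⊙ V)[i*J + j, r] = U[i, r] * V[j, r].
        (I, R), (J, _) = U.shape, V.shape
        return np.einsum('ir,jr->ijr', U, V).reshape(I * J, R)

    def cp_als(T, rank, iters=50, seed=0):
        # Plain CP decomposition of a 3-way tensor by alternating least squares.
        I, J, K = T.shape
        rng = np.random.default_rng(seed)
        A = rng.normal(size=(I, rank))
        B = rng.normal(size=(J, rank))
        C = rng.normal(size=(K, rank))
        # Mode-n unfoldings, with the last remaining mode varying slowest.
        T1 = T.transpose(0, 2, 1).reshape(I, K * J)
        T2 = T.transpose(1, 2, 0).reshape(J, K * I)
        T3 = T.transpose(2, 1, 0).reshape(K, J * I)
        for _ in range(iters):
            A = T1 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
            B = T2 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
            C = T3 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
        return A, B, C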

Tensor Decomposition · Word Embeddings

Prediction with a Short Memory

no code implementations 8 Dec 2016 Vatsal Sharan, Sham Kakade, Percy Liang, Gregory Valiant

For a Hidden Markov Model with $n$ hidden states, $I$ is bounded by $\log n$, a quantity that does not depend on the mixing time, and we show that the trivial prediction algorithm based on the empirical frequencies of length $O(\log n/\epsilon)$ windows of observations achieves this error, provided the length of the sequence is $d^{\Omega(\log n/\epsilon)}$, where $d$ is the size of the observation alphabet.
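
The "trivial prediction algorithm" mentioned here is essentially an order-$\ell$ Markov (n-gram-style) predictor built from empirical window frequencies; a minimal sketch, with the window length left as a free parameter, is:

    from collections import Counter, defaultdict

    def window_frequency_predictor(sequence, window):
        # Estimate P(next symbol | previous `window` symbols) from empirical
        # counts of length-(window + 1) windows in the training sequence.
        counts = defaultdict(Counter)
        for t in range(window, len(sequence)):
            counts[tuple(sequence[t - window:t])][sequence[t]] += 1

        def predict(context):
            dist = counts.get(tuple(context[-window:]), Counter())
            total = sum(dist.values())
            return {s: c / total for s, c in dist.items()} if total else {}

        return predict

    # Example: learn from a periodic sequence and predict the next symbol.
    predict = window_frequency_predictor("abcabcabc", window=2)
    assert predict("ab") == {"c": 1.0}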
