Search Results for author: Anirbit Mukherjee

Found 17 papers, 4 papers with code

Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks

1 code implementation • 12 Apr 2024 • Matteo Tucat, Anirbit Mukherjee

In this work, we instantiate a regularized form of the gradient clipping algorithm and prove that it can converge to the global minima of deep neural network loss functions provided that the net is of sufficient width.
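The listing gives no algorithmic details; purely as an illustrative sketch, one way to regularize gradient clipping is to floor the usual clipping factor at some δ > 0 so the effective step size can never collapse to zero. The update rule and the hyperparameter names (`eta`, `gamma`, `delta`) below are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def delta_gclip_step(w, grad, eta=0.1, gamma=1.0, delta=0.01):
    """One regularized gradient clipping step (illustrative sketch).

    Standard clipping scales the step by min(1, gamma / ||grad||); the
    regularized variant assumed here floors that factor at delta > 0 so
    the effective step size never vanishes, even for huge gradients.
    """
    g_norm = np.linalg.norm(grad)
    scale = max(min(1.0, gamma / max(g_norm, 1e-12)), delta)
    return w - eta * scale * grad

# toy objective: 0.5 * ||w||^2, whose gradient at w is simply w
w = np.array([10.0, -10.0])
for _ in range(200):
    w = delta_gclip_step(w, w)
```

On this toy quadratic the iterates first shrink at a fixed clipped rate and then, once the gradient norm drops below `gamma`, contract geometrically toward the minimum.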

Investigating the Ability of PINNs To Solve Burgers' PDE Near Finite-Time BlowUp

no code implementations • 8 Oct 2023 • Dibyakanti Kumar, Anirbit Mukherjee

Physics-Informed Neural Networks (PINNs) have achieved ever greater feats of numerically solving complicated PDEs while offering an attractive trade-off between accuracy and speed of inference.

Generalization Bounds

LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class

no code implementations • 7 Oct 2023 • Hongbo Zhu, Angelo Cangelosi, Procheta Sen, Anirbit Mukherjee

This data-efficiency manifests in LIPEx computing its explanation matrix about 53% faster than all-class LIME, in classification experiments with text data.

Feature Importance

Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets

no code implementations • 17 Sep 2023 • Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee

In this note, we demonstrate a first-of-its-kind provable convergence of SGD to the global minima of an appropriately regularized logistic empirical risk of depth-$2$ nets -- for arbitrary data and any number of gates, with adequately smooth and bounded activations such as sigmoid and tanh.
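The note itself is theoretical; as a toy illustration only (the architecture sizes, step size, and regularization strength below are made-up choices, not the paper's), SGD on a regularized logistic risk of a depth-2 sigmoid net can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# depth-2 net f(x) = a @ sigmoid(W @ x), with smooth bounded gates
d, k, n = 3, 6, 40
W = rng.normal(size=(k, d)) * 0.5
a = rng.normal(size=k) * 0.5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])     # arbitrary +/-1 labels

def reg_logistic_risk(W, a, lam=1e-3):
    margins = y * (sigmoid(X @ W.T) @ a)
    return np.mean(np.log1p(np.exp(-margins))) + \
        0.5 * lam * (np.sum(W ** 2) + np.sum(a ** 2))

lam, eta = 1e-3, 0.2
init_risk = reg_logistic_risk(W, a)
for _ in range(3000):
    i = rng.integers(n)                  # single-sample SGD
    h = sigmoid(W @ X[i])
    m = y[i] * (a @ h)
    coef = -y[i] * sigmoid(-m)           # derivative of log(1 + e^{-m}) w.r.t. m
    grad_a = coef * h + lam * a
    grad_W = coef * np.outer(a * h * (1.0 - h), X[i]) + lam * W
    a -= eta * grad_a
    W -= eta * grad_W
```

The $\lambda$-weighted Frobenius penalty plays the role of the regularization the result requires; on this toy run the regularized risk decreases from its initial value.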

Size Lowerbounds for Deep Operator Networks

no code implementations • 11 Aug 2023 • Anirbit Mukherjee, Amartya Roy

Deep Operator Networks are an increasingly popular paradigm for solving regression problems in infinite dimensions, and hence for solving families of PDEs in one shot.
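As background (not from this paper), a Deep Operator Network's output is typically the inner product of a "branch" net, fed sensor values of the input function, and a "trunk" net, fed the query location. A minimal forward-pass sketch, with all sizes and weights chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

p, m = 4, 10                              # latent width, number of sensors

def mlp(params, z):
    """Tiny one-hidden-layer tanh network."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ z + b1) + b2

def init_mlp(d_in, d_hidden, d_out):
    return (rng.normal(size=(d_hidden, d_in)) * 0.5, np.zeros(d_hidden),
            rng.normal(size=(d_out, d_hidden)) * 0.5, np.zeros(d_out))

branch = init_mlp(m, 16, p)               # sees the input function u at m sensors
trunk = init_mlp(1, 16, p)                # sees the query location x

def deeponet(u_sensors, x):
    # G(u)(x) ~ inner product of branch and trunk embeddings
    return mlp(branch, u_sensors) @ mlp(trunk, np.array([x]))

u = np.sin(np.linspace(0, 1, m))          # an input function sampled at the sensors
out = deeponet(u, 0.5)                    # evaluate the learned operator at x = 0.5
```

The size lower bounds in the paper concern how large such branch/trunk nets must be; this sketch only shows the inner-product structure being bounded.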

Global Convergence of SGD On Two Layer Neural Nets

no code implementations • 20 Oct 2022 • Pulkit Gopalani, Anirbit Mukherjee

In this note we demonstrate provable convergence of SGD to the global minima of an appropriately regularized $\ell_2$-empirical risk of depth-$2$ nets -- for arbitrary data and with any number of gates, provided they use adequately smooth and bounded activations like sigmoid and tanh.

Vocal Bursts Valence Prediction
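Again as a toy illustration only (the net widths, data, step size, and regularization weight below are invented for the sketch, not taken from the note), the regularized $\ell_2$ setting with tanh gates looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

# depth-2 net f(x) = a @ tanh(W @ x), with smooth bounded gates (tanh)
d, k, n = 3, 8, 32
W = rng.normal(size=(k, d)) * 0.5
a = rng.normal(size=k) * 0.5
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])                      # arbitrary real-valued targets

def reg_l2_risk(W, a, lam=1e-3):
    pred = np.tanh(X @ W.T) @ a
    return 0.5 * np.mean((pred - y) ** 2) + \
        0.5 * lam * (np.sum(W ** 2) + np.sum(a ** 2))

lam, eta = 1e-3, 0.1
init_risk = reg_l2_risk(W, a)
for _ in range(2000):
    i = rng.integers(n)                  # single-sample SGD
    h = np.tanh(W @ X[i])
    err = a @ h - y[i]
    grad_a = err * h + lam * a
    grad_W = err * np.outer(a * (1.0 - h ** 2), X[i]) + lam * W
    a -= eta * grad_a
    W -= eta * grad_W
```

The point of the result is that, with the regularizer present, such runs provably reach the global minimum of the regularized risk; the sketch only demonstrates the loss decreasing on arbitrary data.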

Towards Size-Independent Generalization Bounds for Deep Operator Nets

no code implementations • 23 May 2022 • Pulkit Gopalani, Sayar Karmakar, Dibyakanti Kumar, Anirbit Mukherjee

In recent times, machine learning methods have made significant advances in becoming a useful tool for analyzing physical systems.

BIG-bench Machine Learning • Generalization Bounds +1

An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate

no code implementations • 26 Apr 2022 • Sayar Karmakar, Anirbit Mukherjee

In this work, we empirically study the occurrence of heavy tails while training a ReLU gate (in the realizable and the binary classification setups) and for a variant of SGD.

Binary Classification

Dynamics of Local Elasticity During Training of Neural Nets

1 code implementation • 1 Nov 2021 • Soham Dan, Anirbit Mukherjee, Avirup Das, Phanideep Gampa

For various state-of-the-art neural networks trained on SVHN, CIFAR-10, and CIFAR-100, we demonstrate how our new proposal $S_{\rm rel}$, as opposed to the original definition, much more sharply detects that the weight updates prefer to make prediction changes within the same class as the sampled data.

regression

Investigating the Role of Overparameterization While Solving the Pendulum with DeepONets

no code implementations • NeurIPS Workshop DLDE 2021 • Pulkit Gopalani, Anirbit Mukherjee

DeepONets [1] are one of the most prominent ideas in this theme, entailing an optimization over a space of inner products of neural nets.

A Study of the Mathematics of Deep Learning

1 code implementation • 28 Apr 2021 • Anirbit Mukherjee

In chapter 2 we show new circuit complexity theorems for deep neural functions and prove classification theorems about these function spaces which in turn lead to exact algorithms for empirical risk minimization for depth 2 ReLU nets.

Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm

no code implementations • 8 May 2020 • Sayar Karmakar, Anirbit Mukherjee

In this work, we demonstrate provable guarantees on the training of a single ReLU gate in hitherto unexplored regimes.

Data Poisoning

Depth-2 Neural Networks Under a Data-Poisoning Attack

1 code implementation • 4 May 2020 • Sayar Karmakar, Anirbit Mukherjee, Theodore Papamarkou

In this class of networks, we attempt to learn the network weights in the presence of a malicious oracle doing stochastic, bounded and additive adversarial distortions on the true output during training.

Adversarial Attack • Data Poisoning
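To make the threat model concrete (the single-ReLU setup, the noise bound `theta`, and the plain training loop below are illustrative choices, not the paper's actual networks or guarantees), an oracle that adds stochastic, bounded, additive distortions to the true output can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)

w_true = np.array([1.0, -2.0])           # ground-truth ReLU gate, for illustration
theta = 0.3                               # bound on the additive distortion

def poisoned_oracle(x):
    """True output plus a stochastic, bounded, additive adversarial distortion."""
    return max(w_true @ x, 0.0) + rng.uniform(-theta, theta)

# fixed clean evaluation set, to measure how well training survives poisoning
X_eval = rng.normal(size=(200, 2))

def clean_mse(w):
    preds = np.maximum(X_eval @ w, 0.0)
    truth = np.maximum(X_eval @ w_true, 0.0)
    return np.mean((preds - truth) ** 2)

w = rng.normal(size=2) * 0.1              # small random init
mse_at_init = clean_mse(w)
for _ in range(5000):
    x = rng.normal(size=2)
    y = poisoned_oracle(x)                # labels seen only through the oracle
    if w @ x > 0:                         # subgradient of the ReLU output
        w -= 0.01 * (max(w @ x, 0.0) - y) * x
```

Because the distortion is zero-mean and bounded, plain SGD in this toy run still drives the clean-data error well below its initial value, which is the kind of robustness the paper studies in depth-2 networks.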

Lower bounds over Boolean inputs for deep neural networks with ReLU gates

no code implementations • 8 Nov 2017 • Anirbit Mukherjee, Amitabh Basu

We use the method of sign-rank to show lower bounds, exponential in the dimension, for ReLU circuits ending in an LTF gate and of depth up to $O(n^{\xi})$ with $\xi < \frac{1}{8}$, under some restrictions on the weights in the bottommost layer.

Understanding Deep Neural Networks with Rectified Linear Units

no code implementations • ICLR 2018 • Raman Arora, Amitabh Basu, Poorya Mianjy, Anirbit Mukherjee

In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU).
