Search Results for author: Navin Goyal

Found 30 papers, 11 papers with code

Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically

no code implementations • 25 Apr 2024 • Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov

Transformers trained on natural language data have been shown to learn its hierarchical structure and to generalize to sentences with unseen syntactic structures without any explicitly encoded structural bias.

Inductive Bias • Language Modelling

In-Context Learning through the Bayesian Prism

1 code implementation • 8 Jun 2023 • Madhur Panwar, Kabir Ahuja, Navin Goyal

One of the main discoveries in this line of research has been that for several function classes, such as linear regression, transformers successfully generalize to new functions in the class.

Bayesian Inference • In-Context Learning +4
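
The Bayesian view above suggests a concrete reference point: for linear regression with a Gaussian prior, the Bayes-optimal in-context predictor is the ridge/posterior-mean estimator. A minimal sketch (my construction for illustration, not the paper's code; dimensions and noise scale are arbitrary):

```python
import numpy as np

# Prior w ~ N(0, I); labels y = <w, x> + N(0, s^2) noise.
# The Bayes-optimal prediction from k in-context examples is the
# posterior mean, i.e. ridge regression with regularizer s^2.
rng = np.random.default_rng(0)
d, k, s = 8, 16, 0.1
w = rng.normal(size=d)                    # task drawn from the prior
X = rng.normal(size=(k, d))               # in-context inputs
y = X @ w + s * rng.normal(size=k)        # in-context labels

x_query = rng.normal(size=d)
w_post = np.linalg.solve(X.T @ X + s**2 * np.eye(d), X.T @ y)
y_bayes = x_query @ w_post                # compare a transformer's in-context
print(y_bayes)                            # prediction against this baseline
```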

A Theory of Emergent In-Context Learning as Implicit Structure Induction

no code implementations • 14 Mar 2023 • Michael Hahn, Navin Goyal

Scaling large language models (LLMs) leads to an emergent capacity to learn in-context from example demonstrations.

In-Context Learning

Towards a Mathematics Formalisation Assistant using Large Language Models

no code implementations • 14 Nov 2022 • Ayush Agrawal, Siddhartha Gadgil, Navin Goyal, Ashvni Narayanan, Anand Tadipatri

Mathematics formalisation is the task of translating mathematics (i.e., definitions, theorem statements, proofs) written in natural language, as found in books and papers, into a formal language that can then be checked for correctness by a program.

Language Modelling • Large Language Model
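
As a toy illustration of the task's input/output gap (the example is mine; the paper targets the Lean proof assistant and its Mathlib library), the informal statement "the sum of two even natural numbers is even" formalises as:

```lean
import Mathlib.Algebra.Group.Even

-- Informal: "The sum of two even natural numbers is even."
theorem even_add_even (m n : ℕ) (hm : Even m) (hn : Even n) :
    Even (m + n) := by
  obtain ⟨a, ha⟩ := hm     -- m = a + a
  obtain ⟨b, hb⟩ := hn     -- n = b + b
  exact ⟨a + b, by omega⟩  -- m + n = (a + b) + (a + b)
```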

When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks

1 code implementation • 23 Oct 2022 • Ankur Sikarwar, Arkil Patel, Navin Goyal

On analyzing the task, we find that identifying the target location in the grid world is the main challenge for the models.

Revisiting the Compositional Generalization Abilities of Neural Sequence Models

1 code implementation • ACL 2022 • Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal

Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences.

Learning and Generalization in Overparameterized Normalizing Flows

1 code implementation • 19 Jun 2021 • Kulin Shah, Amit Deshpande, Navin Goyal

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using stochastic gradient descent with a sufficiently small learning rate and suitable initialization.

Density Estimation
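
For reference, a univariate normalizing flow fits an invertible, increasing map f and scores data with the change-of-variables identity (standard textbook formula; notation mine):

```latex
\log p_X(x) \;=\; \log p_Z\big(f(x)\big) \;+\; \log\bigl|f'(x)\bigr|,
\qquad Z = f(X) \sim \mathcal{N}(0,1).
```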

Learning and Generalization in RNNs

no code implementations • NeurIPS 2021 • Abhishek Panigrahi, Navin Goyal

In contrast to previous work, which could only handle functions of sequences that are sums of functions of the individual tokens, we allow general functions of the sequence.
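
In symbols (notation mine): earlier analyses covered only additive targets of the form below, whereas this paper's guarantees allow general functions of the whole sequence.

```latex
f(x_1, \ldots, x_T) \;=\; \sum_{t=1}^{T} g(x_t)
```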

Analyzing the Nuances of Transformers' Polynomial Simplification Abilities

no code implementations • 29 Apr 2021 • Vishesh Agarwal, Somak Aditya, Navin Goyal

To understand Transformers' abilities in such tasks in a fine-grained manner, we deviate from traditional end-to-end settings, and explore a step-wise polynomial simplification task.
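
For instance (an example of my own, in the spirit of the task), a step-wise simplification first multiplies out factors and then collects like terms:

```latex
3x \cdot (2x) + 4x^{2} + x \;\longrightarrow\; 6x^{2} + 4x^{2} + x \;\longrightarrow\; 10x^{2} + x.
```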

Are NLP Models really able to Solve Simple Math Word Problems?

3 code implementations • NAACL 2021 • Arkil Patel, Satwik Bhattamishra, Navin Goyal

Since existing solvers achieve high performance on benchmark datasets for elementary-level math word problems (MWPs) containing one-unknown arithmetic word problems, such problems are often considered "solved", with the bulk of research attention moving to more complex MWPs.

Math • Math Word Problem Solving +1
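
For illustration, a one-unknown arithmetic MWP of the kind discussed above looks like the following (the example is mine, not drawn from the paper's SVAMP benchmark):

```text
Problem:  Jack had 8 pencils. He gave 3 pencils to Rose.
          How many pencils does Jack have now?
Equation: x = 8 - 3        Answer: 5
```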

Do Transformers Understand Polynomial Simplification?

no code implementations • 1 Jan 2021 • Vishesh Agarwal, Somak Aditya, Navin Goyal

For a polynomial that is not necessarily in this normal form, a sequence of simplification steps is applied to reach the fully simplified (i.e., normal-form) polynomial.

Learning and Generalization in Univariate Overparameterized Normalizing Flows

no code implementations • 1 Jan 2021 • Kulin Shah, Amit Deshpande, Navin Goyal

In supervised learning, it is known that overparameterized neural networks with one hidden layer provably and efficiently learn and generalize, when trained using Stochastic Gradient Descent (SGD).

Density Estimation

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages

1 code implementation • COLING 2020 • Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

We find that while recurrent models generalize nearly perfectly if the lengths of the training and test strings are from the same range, they perform poorly if the test strings are longer.
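
A minimal probe in this spirit (my sketch; the length ranges and bracket vocabulary are illustrative, not the paper's exact setup) samples well-nested Dyck strings, with test strings drawn from a longer, unseen length range:

```python
import random

def dyck(n_pairs, brackets=("()", "[]")):
    """Sample a well-nested string over two bracket types with n_pairs pairs."""
    out, stack, opened = [], [], 0
    while opened < n_pairs or stack:
        if opened < n_pairs and (not stack or random.random() < 0.5):
            b = random.choice(brackets)
            out.append(b[0]); stack.append(b[1]); opened += 1
        else:
            out.append(stack.pop())   # close the most recent open bracket
    return "".join(out)

train = [dyck(random.randint(2, 25)) for _ in range(10_000)]
test = [dyck(random.randint(26, 50)) for _ in range(1_000)]  # longer than training
```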

On the Ability and Limitations of Transformers to Recognize Formal Languages

1 code implementation • EMNLP 2020 • Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Our analysis also provides insights into the role of the self-attention mechanism in modeling certain behaviors, and into the influence of positional encoding schemes on the model's learning and generalization abilities.

Robust Identifiability in Linear Structural Equation Models of Causal Inference

no code implementations • 14 Jul 2020 • Karthik Abinav Sankararaman, Anand Louis, Navin Goyal

First, for a large and well-studied class of LSEMs, namely "bow-free" models, we provide a sufficient condition on the model parameters under which robust identifiability holds, thereby removing the restriction on paths required by prior work.

Causal Inference
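
For reference, the standard LSEM parameterisation (notation mine): with edge-weight matrix Λ and error covariance Ω, the observed covariance Σ is the rational map below; identifiability asks when (Λ, Ω) can be recovered from Σ, and "bow-free" means no pair of variables carries both a directed edge (Λᵢⱼ ≠ 0) and a correlated error (Ωᵢⱼ ≠ 0).

```latex
X = \Lambda^{\top} X + \varepsilon, \quad \operatorname{Cov}(\varepsilon) = \Omega
\;\;\Longrightarrow\;\;
\Sigma = (I - \Lambda)^{-\top}\, \Omega\, (I - \Lambda)^{-1}.
```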

Non-Gaussianity of Stochastic Gradient Noise

no code implementations • 21 Oct 2019 • Abhishek Panigrahi, Raghav Somani, Navin Goyal, Praneeth Netrapalli

What enables Stochastic Gradient Descent (SGD) to achieve better generalization than Gradient Descent (GD) in Neural Network training?
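
One way to probe this question empirically (an illustrative sketch of my own, not the paper's protocol) is to collect minibatch gradient noise for one coordinate and run a normality test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X, y = rng.normal(size=(4096, 10)), rng.normal(size=4096)
w = np.zeros(10)

def grad(Xb, yb, w):
    """Gradient of the least-squares loss 0.5 * ||Xb @ w - yb||^2 / len(yb)."""
    return Xb.T @ (Xb @ w - yb) / len(yb)

g_full = grad(X, y, w)
noise = []
for _ in range(500):
    idx = rng.choice(len(X), size=32, replace=False)   # one minibatch
    noise.append(grad(X[idx], y[idx], w)[0] - g_full[0])
print(stats.shapiro(noise))  # a tiny p-value suggests non-Gaussian noise
```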

Effect of Activation Functions on the Training of Overparametrized Neural Nets

no code implementations • ICLR 2020 • Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

In the present paper, we provide theoretical results on the effect of the activation function on the training of highly overparametrized 2-layer neural networks.

Small Data Image Classification

Universality Patterns in the Training of Neural Networks

no code implementations • 17 May 2019 • Raghav Somani, Navin Goyal, Prateek Jain, Praneeth Netrapalli

This paper proposes and demonstrates a surprising pattern in the training of neural networks: there is a one-to-one relation between the values of any pair of losses (such as cross-entropy, mean squared error, and 0/1 error).

Stability of Linear Structural Equation Models of Causal Inference

no code implementations • 16 May 2019 • Karthik Abinav Sankararaman, Anand Louis, Navin Goyal

First, we prove that, under a sufficient condition, parameter recovery is stable for a certain sub-class of LSEMs that are bow-free (Brito and Pearl, 2002).

Causal Inference • Sociology

Non-Gaussian Component Analysis using Entropy Methods

no code implementations • 13 Jul 2018 • Navin Goyal, Abhishek Shetty

Non-Gaussian component analysis (NGCA) is also related to dimensionality reduction and to other data analysis problems such as ICA.

Dimensionality Reduction

Depth separation and weight-width trade-offs for sigmoidal neural networks

no code implementations • ICLR 2018 • Amit Deshpande, Navin Goyal, Sushrut Karmalkar

We show a similar separation between the expressive power of depth-2 and depth-3 sigmoidal neural networks over a large class of input distributions, as long as the weights are polynomially bounded.

Learnability of Learned Neural Networks

no code implementations • ICLR 2018 • Rahul Anand Sharma, Navin Goyal, Monojit Choudhury, Praneeth Netrapalli

This paper explores the simplicity of learned neural networks under various settings: learned on real vs. random data, with varying size/architecture, and with large vs. small minibatch sizes.

Heavy-Tailed Analogues of the Covariance Matrix for ICA

no code implementations • 22 Feb 2017 • Joseph Anderson, Navin Goyal, Anupama Nandi, Luis Rademacher

Like the current state of the art, the new algorithm is based on the centroid body (a first-moment analogue of the covariance matrix).
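
Concretely (standard definitions, notation mine): for a zero-mean random vector X, the centroid body is the convex body whose support function is a first absolute moment, in place of the second moment that defines the covariance quadratic form; this is what makes it usable under heavy tails, where second moments may not exist.

```latex
h_{\Gamma X}(\theta) \;=\; \mathbb{E}\,\bigl|\langle X, \theta\rangle\bigr|
\qquad\text{vs.}\qquad
\theta^{\top} \Sigma\, \theta \;=\; \mathbb{E}\,\langle X, \theta\rangle^{2}.
```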

Heavy-tailed Independent Component Analysis

no code implementations • 2 Sep 2015 • Joseph Anderson, Navin Goyal, Anupama Nandi, Luis Rademacher

Independent component analysis (ICA) is the problem of efficiently recovering a matrix $A \in \mathbb{R}^{n\times n}$ from i.i.d. observations of $X = AS$, where $S$ is a random vector with independent coordinates.

The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures

no code implementations • 12 Nov 2013 • Joseph Anderson, Mikhail Belkin, Navin Goyal, Luis Rademacher, James Voss

The problem of learning this map can be efficiently solved using some recent results on tensor decompositions and Independent Component Analysis (ICA), thus giving an algorithm for recovering the mixture.

Fourier PCA and Robust Tensor Decomposition

1 code implementation • 25 Jun 2013 • Navin Goyal, Santosh Vempala, Ying Xiao

Fourier PCA is Principal Component Analysis of a matrix obtained from higher order derivatives of the logarithm of the Fourier transform of a distribution. We make this method algorithmic by developing a tensor decomposition method for a pair of tensors sharing the same vectors in rank-$1$ decompositions.

Tensor Decomposition
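
A minimal numerical sketch of the object being diagonalized (my construction; variable names and the choice of evaluation point u are illustrative): estimate the Hessian of log E[exp(i⟨u, x⟩)] from samples and take its eigendecomposition.

```python
import numpy as np

def fourier_log_hessian(X, u):
    """Empirical Hessian of log E[exp(i <u, x>)] from samples X (n x d)."""
    w = np.exp(1j * (X @ u))                      # e^{i<u,x>} per sample
    phi = w.mean()                                # characteristic function at u
    g = (1j * X * w[:, None]).mean(axis=0)        # gradient of phi
    H = (-(X[:, :, None] * X[:, None, :]) * w[:, None, None]).mean(axis=0)
    return H / phi - np.outer(g, g) / phi**2      # Hessian of log phi

rng = np.random.default_rng(0)
S = rng.laplace(size=(20_000, 3))                 # independent non-Gaussian sources
A = rng.normal(size=(3, 3))                       # unknown mixing matrix
D = fourier_log_hessian(S @ A.T, 0.1 * rng.normal(size=3))
eigvals, eigvecs = np.linalg.eig(D)               # the "PCA" step
```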

Efficient learning of simplices

no code implementations • 9 Nov 2012 • Joseph Anderson, Navin Goyal, Luis Rademacher

We also show a direct connection between the problem of learning a simplex and ICA: a simple randomized reduction from learning a simplex to ICA, sketched below.
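
A sketch of that reduction, as I understand it (scaling constants illustrative): if P is uniform on the standard simplex and R ~ Gamma(n, 1) is independent of P, then R·P has i.i.d. Exp(1) coordinates, so rescaling simplex samples by an independent Gamma radius yields an ICA instance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))                  # columns: unknown simplex vertices

E = rng.exponential(size=(10_000, n))
P = E / E.sum(axis=1, keepdims=True)         # uniform samples on the simplex
X = P @ A.T                                  # observed samples from the simplex

r = rng.gamma(shape=n, scale=1.0, size=(len(X), 1))
Y = r * X                                    # Y = A s with s i.i.d. Exp(1):
                                             # standard ICA now recovers A
```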

Thompson Sampling for Contextual Bandits with Linear Payoffs

1 code implementation • 15 Sep 2012 • Shipra Agrawal, Navin Goyal

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems.

Multi-Armed Bandits • Thompson Sampling
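
A minimal sketch of Thompson Sampling in the linear-payoff setting (following the standard Gaussian-posterior formulation; the exploration scale v here is an arbitrary constant, not a tuned value from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d, v = 5, 0.5
B = np.eye(d)                                 # posterior precision
f = np.zeros(d)                               # sum of reward-weighted contexts

def choose_arm(contexts):
    """contexts: (n_arms, d). Sample a parameter, then act greedily on it."""
    mu_hat = np.linalg.solve(B, f)            # posterior mean
    mu_tilde = rng.multivariate_normal(mu_hat, v**2 * np.linalg.inv(B))
    return int(np.argmax(contexts @ mu_tilde))

def update(x, reward):
    """x: context of the arm actually played."""
    B[:] = B + np.outer(x, x)
    f[:] = f + reward * x
```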
