no code implementations • 24 Jun 2023 • Brando Miranda, Patrick Yu, Saumya Goyal, Yu-Xiong Wang, Sanmi Koyejo
Using this analysis, we demonstrate the following: (1) when the formal diversity of a dataset is low, pre-training (PT) beats Model-Agnostic Meta-Learning (MAML) on average, and (2) when the formal diversity is high, MAML beats PT on average.
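The formal diversity referenced here is, in this line of work, a diversity coefficient: an expected pairwise distance between embeddings of tasks sampled from the benchmark. A minimal sketch of that computation, where `embed_task` is a hypothetical stand-in for a Task2Vec-style task embedding and the paper's exact pipeline may differ:

```python
# Hedged sketch: diversity coefficient as the average pairwise distance
# between embeddings of sampled tasks. `embed_task` is a hypothetical
# stand-in for a Task2Vec-style embedding of a few-shot task.
import itertools
import numpy as np

def cosine_distance(u, v):
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def diversity_coefficient(tasks, embed_task):
    """Average pairwise cosine distance between task embeddings."""
    embeddings = [embed_task(t) for t in tasks]
    pairs = itertools.combinations(embeddings, 2)
    return float(np.mean([cosine_distance(u, v) for u, v in pairs]))
```

On the reading above, a low value of this coefficient is the regime where PT is expected to win on average, and a high value the regime where MAML is expected to win.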
no code implementations • 24 Jun 2023 • Alycia Lee, Brando Miranda, Sudharsan Sundar, Sanmi Koyejo
Current trends in pre-training capable Large Language Models (LLMs) mostly focus on scaling model and dataset size.
no code implementations • NeurIPS 2023 • Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
Recent work claims that large language models display emergent abilities: abilities that are not present in smaller-scale models but are present in larger-scale models.
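The snippet above states only the claim under scrutiny. As a hedged illustration of one mechanism examined in this area (whether the apparent discontinuity comes from the evaluation metric rather than from the model), here is a toy example with made-up numbers, not results from the paper:

```python
# Toy illustration (made-up numbers, not the paper's results): if per-token
# accuracy improves smoothly with scale, an all-or-nothing metric such as
# exact match over a long answer can still jump sharply, because
# exact_match ~= per_token_accuracy ** answer_length.
import numpy as np

per_token_acc = np.linspace(0.5, 0.95, 10)    # smooth, gradual improvement
answer_length = 20
exact_match = per_token_acc ** answer_length  # flat near zero, then rises sharply

for smooth, sharp in zip(per_token_acc, exact_match):
    print(f"per-token {smooth:.2f} -> exact-match {sharp:.6f}")
```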
no code implementations • 15 Mar 2023 • Brando Miranda, Avi Shinnar, Vasily Pestun, Barry Trager
Despite a growing body of work at the intersection of deep learning and formal languages, there has been relatively little systematic exploration of transformer models for reasoning about typed lambda calculi.
no code implementations • 2 Aug 2022 • Brando Miranda, Patrick Yu, Yu-Xiong Wang, Sanmi Koyejo
This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions: under a fair comparison, such claims hold in the regime of low diversity.
no code implementations • 24 Dec 2021 • Brando Miranda, Yu-Xiong Wang, Sanmi Koyejo
We hypothesize that the diversity coefficient of the few-shot learning benchmark is predictive of whether meta-learning solutions will succeed or not.
1 code implementation • 24 Dec 2021 • Brando Miranda, Yu-Xiong Wang, Sanmi Koyejo
Recent work has suggested that a good embedding is all we need to solve many few-shot learning benchmarks.
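The claim being examined here is that a fixed, pre-trained embedding plus a simple classifier fit per episode already solves many few-shot benchmarks. A minimal sketch of that style of baseline, with illustrative names (`embed` and the episode arrays are assumptions, not the paper's code):

```python
# Hedged sketch of a "good embedding is all you need" style baseline:
# a frozen feature extractor plus a simple classifier fit on the support
# set of each few-shot episode. Names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def solve_episode(embed, support_x, support_y, query_x):
    """Fit a linear classifier on embedded support examples, predict queries."""
    z_support = np.stack([embed(x) for x in support_x])
    z_query = np.stack([embed(x) for x in query_x])
    clf = LogisticRegression(max_iter=1000).fit(z_support, support_y)
    return clf.predict(z_query)
```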
no code implementations • 12 Mar 2019 • Andrzej Banburski, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, Tomaso Poggio
In particular, gradient descent induces dynamics on the normalized weights that converge, for $t \to \infty$, to an equilibrium corresponding to a minimum-norm (or maximum-margin) solution.
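A minimal sketch of the statement's shape, with notation chosen here rather than taken from the paper: writing $\tilde{w} = w/\|w\|$ for the normalized weights, gradient flow on $w$ induces dynamics on $\tilde{w}$ whose limit is a maximum-margin direction.

```latex
% Hedged sketch (notation mine): gradient flow on w and the induced
% dynamics of the normalized weights \tilde{w} = w / \|w\|.
\dot{w} = -\nabla L(w),
\qquad
\dot{\tilde{w}} = \frac{1}{\|w\|}\bigl(I - \tilde{w}\tilde{w}^{\top}\bigr)\dot{w},
\qquad
\tilde{w}(t) \xrightarrow{\,t \to \infty\,} \tilde{w}^{*}
\in \arg\max_{\|u\|=1}\,\min_{i}\, y_i\, f(x_i; u).
```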
3 code implementations • 25 Jul 2018 • Qianli Liao, Brando Miranda, Andrzej Banburski, Jack Hidary, Tomaso Poggio
Given two networks with the same training loss on a dataset, when would they have drastically different test losses and errors?
no code implementations • 29 Jun 2018 • Tomaso Poggio, Qianli Liao, Brando Miranda, Andrzej Banburski, Xavier Boix, Jack Hidary
Here we prove a similar result for nonlinear multilayer DNNs near zero minima of the empirical loss.
no code implementations • 7 Jan 2018 • Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio
In Theory IIb, we use a mix of theory and experiments to characterize the optimization of deep convolutional networks by Stochastic Gradient Descent.
no code implementations • 30 Dec 2017 • Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar
In this note, we show that the dynamics associated with gradient descent minimization of nonlinear networks are topologically equivalent, near the asymptotically stable minima of the empirical error, to a linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or cross-entropy loss) Hessian.
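The equivalence asserted here can be read through the standard second-order picture near an asymptotically stable minimum $w^{*}$ of the empirical loss (a textbook linearization sketch, not the paper's full argument):

```latex
% Near a minimum w* of the empirical loss L, with Hessian H:
L(w) \approx L(w^{*}) + \tfrac{1}{2}(w - w^{*})^{\top} H (w - w^{*}),
\qquad H = \nabla^{2} L(w^{*}) \succeq 0,
\qquad
\dot{w} = -\nabla L(w) \approx -H\,(w - w^{*}).
```

Here $H$ is degenerate (has zero eigenvalues) for the square loss at zero-error minima and almost degenerate for logistic or cross-entropy losses, matching the statement above.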
no code implementations • 2 Nov 2016 • Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao
The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning.
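A hedged, representative instance of the function classes in question (an illustration of hierarchical compositionality, not the paper's precise theorem):

```latex
% A hierarchically compositional target on d = 8 inputs, built from
% constituent functions of only two variables each:
f(x_1, \dots, x_8) =
  h_{3}\Bigl(
    h_{21}\bigl(h_{11}(x_1, x_2),\, h_{12}(x_3, x_4)\bigr),\,
    h_{22}\bigl(h_{13}(x_5, x_6),\, h_{14}(x_7, x_8)\bigr)
  \Bigr).
```

For targets of this shape, the flavor of result is that a deep network matching the hierarchy needs a number of units growing polynomially in $1/\varepsilon$ and linearly in $d$, whereas a generic shallow network can require a number exponential in $d$ (constants and smoothness assumptions omitted); this is the sense in which deep learning can be exponentially better than shallow learning.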