1 code implementation • 9 Jan 2024 • Amro Abbas, Evgenia Rusak, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos
Using a simple and intuitive complexity measure, we are able to reduce training cost to a quarter of that of regular training.
no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani
Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.
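The snippet does not spell out the metrics themselves, so the following is only a minimal sketch of one plausible embedding-space filter: score each example by its distance to its k-means cluster centroid and drop the farthest fraction. The function name, cluster count, and drop fraction are all illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.cluster import KMeans

def prune_by_centroid_distance(embeddings: np.ndarray,
                               n_clusters: int = 100,
                               drop_fraction: float = 0.2) -> np.ndarray:
    """Return indices of examples to keep, dropping those farthest from
    their k-means centroid in embedding space (illustrative heuristic)."""
    # Normalize so Euclidean distance tracks cosine distance.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(emb)
    dists = np.linalg.norm(emb - km.cluster_centers_[km.labels_], axis=1)
    cutoff = np.quantile(dists, 1.0 - drop_fraction)
    return np.flatnonzero(dists <= cutoff)

# e.g. with stand-in data:
keep = prune_by_centroid_distance(np.random.randn(5000, 128))
print(f"kept {keep.size} of 5000 examples")
```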
1 code implementation • NeurIPS 2023 • Florian Bordes, Shashank Shekhar, Mark Ibrahim, Diane Bouchacourt, Pascal Vincent, Ari S. Morcos
Synthetic image datasets offer unmatched advantages for designing and evaluating deep neural networks: they make it possible to (i) render as many data samples as needed, (ii) precisely control each scene and yield granular ground truth labels (and captions), and (iii) precisely control distribution shifts between training and testing to isolate variables of interest for sound experimentation.
1 code implementation • 16 Mar 2023 • Amro Abbas, Kushal Tirumala, Dániel Simig, Surya Ganguli, Ari S. Morcos
Analyzing a subset of LAION, we show that SemDeDup can remove 50% of the data with minimal performance loss, effectively halving training time.
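As described, SemDeDup clusters embeddings and removes near-duplicates within each cluster. A minimal sketch of that idea follows; the released implementation differs in details (for example, which duplicate in a group is kept), so treat the greedy keep-first rule and thresholds below as assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def semdedup(embeddings: np.ndarray, n_clusters: int = 50,
             threshold: float = 0.95) -> np.ndarray:
    """Return keep-indices after removing semantic duplicates: within
    each k-means cluster, greedily drop any example whose cosine
    similarity to an already-kept example exceeds `threshold`."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)
    keep = []
    for c in range(n_clusters):
        kept_in_cluster = []
        for i in np.flatnonzero(labels == c):
            sims = emb[kept_in_cluster] @ emb[i]  # cosine sims to kept examples
            if sims.size == 0 or sims.max() < threshold:
                kept_in_cluster.append(i)
        keep.extend(kept_in_cluster)
    return np.sort(np.array(keep))
```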
no code implementations • 30 Jan 2023 • Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra
A positive answer to this question would (a) explain the surprising phenomenon in recent literature of ostensibly map-free neural networks achieving strong performance, and (b) strengthen the evidence of mapping as a fundamental mechanism for navigation by intelligent embodied agents, whether they be biological or artificial.
no code implementations • 19 Oct 2022 • Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael Rabbat, Ari S. Morcos
When fine-tuning DeiT-base and DeiT-large on ImageNet, this procedure matches accuracy in-distribution and improves accuracy under distribution shift compared to the baseline, which observes the same amount of data but communicates gradients at each step.
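A minimal sketch of the end-of-training step this enables: replicas fine-tune independently on their own data shards with no gradient communication, and their weights are averaged once at the end. Names here are illustrative, not the paper's code.

```python
import copy
import torch

def average_state_dicts(models):
    """Merge independently fine-tuned replicas by averaging weights.
    Each replica trains on its own shard with no gradient communication;
    this single averaging step replaces per-step synchronization."""
    avg = copy.deepcopy(models[0].state_dict())
    with torch.no_grad():
        for key in avg:
            stacked = torch.stack([m.state_dict()[key].float() for m in models])
            avg[key] = stacked.mean(dim=0).to(avg[key].dtype)
    return avg

# merged.load_state_dict(average_state_dicts([replica_0, replica_1, replica_2]))
```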
3 code implementations • 29 Jun 2022 • Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos
Widely observed neural scaling laws, in which error falls off as a power of the training set size, model size, or both, have driven substantial performance improvements in deep learning.
5 code implementations • 10 Mar 2022 • Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Kornblith, Ludwig Schmidt
The conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held-out validation set, discarding the remainder.
Ranked #1 on Image Classification on ImageNet V2 (using extra training data)
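The paper's "greedy soup" variant makes the alternative concrete: visit checkpoints from best to worst validation score and keep each one in a running weight average only if held-out accuracy does not drop. A sketch, assuming a user-supplied `evaluate(state_dict) -> float`:

```python
import copy
import torch

def _average(dicts):
    out = copy.deepcopy(dicts[0])
    for k in out:
        out[k] = torch.stack([d[k].float() for d in dicts]).mean(0).to(out[k].dtype)
    return out

def greedy_soup(state_dicts, val_scores, evaluate):
    """Greedily build a weight-averaged 'soup' of fine-tuned models."""
    order = sorted(range(len(state_dicts)), key=val_scores.__getitem__, reverse=True)
    soup = [state_dicts[order[0]]]
    best = evaluate(_average(soup))
    for i in order[1:]:
        candidate = _average(soup + [state_dicts[i]])
        if (score := evaluate(candidate)) >= best:
            soup.append(state_dicts[i])
            best = score
    return _average(soup)
```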
no code implementations • NeurIPS Workshop ImageNet_PPF 2021 • Chaitanya Ryali, David J. Schwab, Ari S. Morcos
Through a systematic, comprehensive investigation, we show that background augmentations lead to improved generalization with substantial improvements (~1-2% on ImageNet) in performance across a spectrum of state-of-the-art self-supervised methods (MoCo-v2, BYOL, SwAV) on a variety of tasks, even enabling performance on par with the supervised baseline.
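The core operation is simple: keep the foreground of an image and composite it onto a different background before the usual self-supervised augmentations. A minimal sketch, assuming a foreground mask is already available (e.g., from a saliency or segmentation model):

```python
import torch

def background_swap(image: torch.Tensor, fg_mask: torch.Tensor,
                    backgrounds: torch.Tensor) -> torch.Tensor:
    """Paste the foreground onto a randomly drawn background.
    `image`: (C, H, W); `fg_mask`: (1, H, W) with values in [0, 1];
    `backgrounds`: (N, C, H, W) pool of background images."""
    bg = backgrounds[torch.randint(len(backgrounds), (1,)).item()]
    return fg_mask * image + (1.0 - fg_mask) * bg
```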
1 code implementation • NeurIPS 2021 • Diane Bouchacourt, Mark Ibrahim, Ari S. Morcos
While prior work has focused on synthetic data, we attempt here to characterize the factors of variation in a real dataset, ImageNet, and study the invariance of both standard residual networks and the recently proposed vision transformer with respect to changes in these factors.
no code implementations • 24 Apr 2021 • Ting-Wu Chin, Diana Marculescu, Ari S. Morcos
In this work, we propose width transfer, a technique that harnesses the assumption that optimized widths (or channel counts) are regular across network sizes and depths.
no code implementations • 23 Mar 2021 • Chaitanya K. Ryali, David J. Schwab, Ari S. Morcos
Recent progress in self-supervised learning has demonstrated promising results in multiple visual tasks.
Ranked #83 on Image Classification on ObjectNet (using extra training data)
no code implementations • 1 Jan 2021 • Janice Lan, Rudy Chin, Alexei Baevski, Ari S. Morcos
However, prior work has implicitly assumed that the best training configuration for model performance was also the best configuration for mask discovery.
no code implementations • ACL 2021 • Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela
We demonstrate that transformers obtain impressive performance even when some of the layers are randomly initialized and never updated.
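A minimal sketch of that setup: pick a random subset of layers, leave them at their random initialization, and exclude their parameters from the optimizer. The encoder and layer counts below are illustrative, not the paper's configuration.

```python
import random
import torch.nn as nn

def freeze_random_layers(layers: nn.ModuleList, n_frozen: int) -> None:
    """Leave a random subset of layers at their random initialization
    and exclude them from training."""
    for layer in random.sample(list(layers), n_frozen):
        for p in layer.parameters():
            p.requires_grad_(False)

# Usage (illustrative): freeze 2 of 6 transformer encoder layers.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True),
    num_layers=6,
)
freeze_random_layers(encoder.layers, n_frozen=2)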
no code implementations • 13 Oct 2020 • Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, Ari S. Morcos
Using methodology from MoCo v2 (Chen et al., 2020), we divided negatives by their difficulty for a given query and studied which difficulty ranges were most important for learning useful representations.
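A sketch of the difficulty measure implied here: rank each queued negative by its cosine similarity to the query, so the most similar negatives count as the "hardest". Shapes and the 5% band below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def rank_negatives(query: torch.Tensor, queue: torch.Tensor) -> torch.Tensor:
    """Sort MoCo-style queued negatives from hardest to easiest, where
    difficulty is cosine similarity to the query.
    `query`: (D,); `queue`: (K, D)."""
    q = F.normalize(query, dim=0)
    negs = F.normalize(queue, dim=1)
    order = torch.argsort(negs @ q, descending=True)
    return negs[order]

# e.g. restrict training to a difficulty band, such as the hardest 5%:
# hard_negs = rank_negatives(q, queue)[: int(0.05 * len(queue))]
```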
no code implementations • 28 Sep 2020 • Matthew L Leavitt, Ari S. Morcos
We also found that the input-unit gradient was more variable across samples and units in high-selectivity networks compared to low-selectivity networks.
no code implementations • 28 Sep 2020 • Rudy Chin, Ari S. Morcos, Diana Marculescu
Slimmable neural networks provide a flexible trade-off front between prediction error and computational cost (such as the number of floating-point operations or FLOPs) with the same storage cost as a single model.
2 code implementations • 23 Jul 2020 • Ting-Wu Chin, Ari S. Morcos, Diana Marculescu
In this work, we propose a general framework to enable joint optimization for both width configurations and weights of slimmable networks.
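For context, here is a minimal sketch of the slimmable mechanism being optimized: a layer whose active width is chosen at run time by slicing a single shared weight tensor, trained at several widths per step. This illustrates weight sharing across widths only, not the paper's joint width-optimization procedure.

```python
import torch
import torch.nn as nn

class SlimmableLinear(nn.Linear):
    """Linear layer whose active width is switched at run time by
    slicing the full weight matrix (weight sharing across widths)."""
    def forward(self, x, in_width=None, out_width=None):
        in_w = in_width or self.in_features
        out_w = out_width or self.out_features
        bias = self.bias[:out_w] if self.bias is not None else None
        return nn.functional.linear(x[..., :in_w], self.weight[:out_w, :in_w], bias)

# One training step can sum losses over several widths so a single set
# of weights supports multiple error/FLOPs trade-offs (toy loss here):
layer = SlimmableLinear(128, 64)
x = torch.randn(8, 128)
loss = sum(layer(x, out_width=w).pow(2).mean() for w in (16, 32, 64))
loss.backward()
```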
no code implementations • 8 Jul 2020 • Matthew L. Leavitt, Ari S. Morcos
While the relative trade-offs between sparse and distributed representations in deep neural networks (DNNs) are well-studied, less is known about how these trade-offs apply to representations of semantically-meaningful information.
1 code implementation • 7 May 2020 • Ge Yang, Amy Zhang, Ari S. Morcos, Joelle Pineau, Pieter Abbeel, Roberto Calandra
In this paper we introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning.
4 code implementations • ICLR 2021 • Jonathan Frankle, David J. Schwab, Ari S. Morcos
A wide variety of deep learning techniques from style transfer to multitask learning rely on training affine transformations of features.
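The extreme case this line of work studies is training only the per-feature affine parameters of BatchNorm while everything else stays frozen at random initialization. A sketch of that setup; the torchvision ResNet is an illustrative model choice.

```python
import torch.nn as nn
import torchvision

def batchnorm_only(model: nn.Module):
    """Freeze all parameters except the affine (gamma/beta) parameters
    of BatchNorm layers; return the trainable ones for the optimizer."""
    for p in model.parameters():
        p.requires_grad_(False)
    affine = []
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.affine:
            m.weight.requires_grad_(True)
            m.bias.requires_grad_(True)
            affine += [m.weight, m.bias]
    return affine

model = torchvision.models.resnet18()  # random init; model choice is illustrative
trainable = batchnorm_only(model)      # pass `trainable` to the optimizer
```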
1 code implementation • ICLR 2020 • Jonathan Frankle, David J. Schwab, Ari S. Morcos
We perform extensive measurements of the network state during these early iterations of training and leverage the framework of Frankle et al. (2019) to quantitatively probe the weight distribution and its reliance on various aspects of the dataset.
no code implementations • NeurIPS 2020 • Brian R. Bartoldson, Ari S. Morcos, Adrian Barbu, Gordon Erlebacher
Pruning neural network parameters is often viewed as a means to compress models, but pruning has also been motivated by the desire to prevent overfitting.
no code implementations • ICLR 2020 • Haonan Yu, Sergey Edunov, Yuandong Tian, Ari S. Morcos
The lottery ticket hypothesis proposes that over-parameterization of deep neural networks (DNNs) aids training by increasing the probability of a "lucky" sub-network initialization being present rather than by helping the optimization process (Frankle & Carbin, 2019).
2 code implementations • NeurIPS 2019 • Ari S. Morcos, Haonan Yu, Michela Paganini, Yuandong Tian
Perhaps surprisingly, we found that, within the natural images domain, winning ticket initializations generalized across a variety of datasets, including Fashion MNIST, SVHN, CIFAR-10/100, ImageNet, and Places365, often achieving performance close to that of winning tickets generated on the same dataset.
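A minimal sketch of how a winning ticket is produced and transferred: prune the smallest-magnitude weights globally, then reuse the resulting mask together with the original initialization when training on a new dataset. One-shot pruning is used below for brevity; the papers prune iteratively.

```python
import torch

def magnitude_mask(state_dict, sparsity: float) -> dict:
    """One-shot global magnitude pruning: mask the smallest-magnitude
    entries across all weight matrices/filters (dim > 1)."""
    weights = {k: v for k, v in state_dict.items() if v.dim() > 1}
    mags = torch.cat([v.abs().flatten() for v in weights.values()])
    threshold = mags.kthvalue(max(1, int(sparsity * mags.numel()))).values
    return {k: (v.abs() > threshold).float() for k, v in weights.items()}

def apply_ticket(model, init_state, mask):
    """Rewind surviving weights to the original initialization; reusing
    (mask, init) on a new dataset is the cross-dataset transfer studied."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in mask:
                p.copy_(init_state[name] * mask[name])
```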
2 code implementations • ICLR 2019 • Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap
Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data.
no code implementations • 31 Oct 2018 • David G. T. Barrett, Ari S. Morcos, Jakob H. Macke
We explore opportunities for synergy between the two fields, such as the use of DNNs as in-silico model systems for neuroscience, and how this synergy can lead to new hypotheses about the operating principles of biological neural networks.
2 code implementations • ICML 2018 • David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap
To succeed at this challenge, models must cope with various generalisation 'regimes' in which the training and test data differ in clearly-defined ways.
no code implementations • 3 Jul 2018 • Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games.
2 code implementations • NeurIPS 2018 • Ari S. Morcos, Maithra Raghu, Samy Bengio
Comparing representations in neural networks is fundamentally difficult as the structure of representations varies greatly, even across groups of networks trained on identical tasks, and over the course of training.
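The comparison tool in this line of work is CCA-based. Below is a minimal sketch of plain mean-CCA similarity between two layers' activation matrices; the paper's refinement additionally weights each CCA direction by how much of the representation it accounts for (projection weighting), which is omitted here.

```python
import numpy as np

def mean_cca(X: np.ndarray, Y: np.ndarray, eps: float = 1e-6) -> float:
    """Mean canonical correlation between two activation matrices of
    shape (examples, neurons). Canonical correlations are the singular
    values of the product of the two orthonormal column bases."""
    def orthonormal_basis(A):
        A = A - A.mean(axis=0)
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > eps * s.max()]  # drop near-null directions
    rho = np.linalg.svd(orthonormal_basis(X).T @ orthonormal_basis(Y),
                        compute_uv=False)
    return float(rho.mean())

# sim = mean_cca(layer_acts_net1, layer_acts_net2)  # same inputs to both nets
```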
no code implementations • ICLR 2019 • Avraham Ruderman, Neil C. Rabinowitz, Ari S. Morcos, Daniel Zoran
In this work, we rigorously test these questions, and find that deformation stability in convolutional networks is more nuanced than it first appears: (1) deformation invariance is not a binary property; rather, different tasks require different degrees of deformation stability at different layers.
1 code implementation • ICLR 2018 • Ari S. Morcos, David G. T. Barrett, Neil C. Rabinowitz, Matthew Botvinick
Finally, we find that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.
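The class selectivity index used here has a simple closed form: (mu_max - mu_rest) / (mu_max + mu_rest), where mu_max is a unit's mean activity over its most-activating class and mu_rest its mean activity over all remaining classes. A sketch, assuming non-negative (post-ReLU) activations:

```python
import numpy as np

def class_selectivity(activations: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Per-unit class selectivity index for `activations` of shape
    (examples, units) and integer class `labels` of shape (examples,)."""
    classes = np.unique(labels)
    class_means = np.stack([activations[labels == c].mean(axis=0) for c in classes])
    mu_max = class_means.max(axis=0)                               # best class per unit
    mu_rest = (class_means.sum(axis=0) - mu_max) / (len(classes) - 1)
    return (mu_max - mu_rest) / (mu_max + mu_rest + 1e-12)
```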