Search Results for author: Pedro Savarese

Found 12 papers, 7 papers with code

SySMOL: A Hardware-software Co-design Framework for Ultra-Low and Fine-Grained Mixed-Precision Neural Networks

no code implementations 23 Nov 2023 Cyrus Zhou, Vaughn Richard, Pedro Savarese, Zachary Hassman, Michael Maire, Michael DiBrino, Yanjing Li

The mixed-precision network design that achieves the best trade-offs corresponds to an architecture supporting 1-, 2-, and 4-bit fixed-point operations with four configurable precision patterns. When coupled with system-aware training and inference optimization, networks trained for this design achieve accuracies that closely match their full-precision counterparts, while drastically compressing the networks and improving run-time efficiency by 10-20x relative to full-precision networks.

Inference Optimization · Quantization
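
As a rough illustration of the fixed-point arithmetic involved (a minimal sketch, not the SySMOL co-design pipeline; the helper name and the per-tensor scaling rule are assumptions), weights can be fake-quantized onto 1-, 2-, and 4-bit grids like this:

```python
import numpy as np

def fake_quantize(x, bits):
    """Uniform symmetric fake-quantization onto a signed fixed-point grid."""
    if bits == 1:
        # Binary case: keep only the signs, scaled by the mean magnitude.
        scale = max(float(np.mean(np.abs(x))), 1e-12)
        return np.sign(x) * scale
    qmax = 2 ** (bits - 1) - 1                        # 1 for 2-bit, 7 for 4-bit
    scale = max(float(np.max(np.abs(x))), 1e-12) / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

# Mixed precision: different layers (or channels) can use different grids.
w = np.random.randn(4, 8)
w_1bit, w_2bit, w_4bit = (fake_quantize(w, b) for b in (1, 2, 4))
```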

Meta-Learning via Learning with Distributed Memory

no code implementations NeurIPS 2021 Sudarshan Babu, Pedro Savarese, Michael Maire

We demonstrate that efficient meta-learning can be achieved via end-to-end training of deep neural networks with memory distributed across layers.

Few-Shot Semantic Segmentation · Meta-Learning +1

Information-Theoretic Segmentation by Inpainting Error Maximization

1 code implementation CVPR 2021 Pedro Savarese, Sunnie S. Y. Kim, Michael Maire, Greg Shakhnarovich, David McAllester

We study image segmentation from an information-theoretic perspective, proposing a novel adversarial method that performs unsupervised segmentation by partitioning images into maximally independent sets.

Image Segmentation · Segmentation +2
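
Schematically, the adversarial criterion can be phrased as an inpainting error, sketched below. This is only a conceptual reading of the objective; `inpainting_error` and the hypothetical `inpainter` callable are illustrative names, not the authors' code.

```python
import numpy as np

def inpainting_error(image, mask, inpainter):
    """Squared error of reconstructing the masked region from its complement.

    `inpainter(visible, mask)` is a hypothetical callable that predicts the
    pixels inside `mask` given only the pixels outside it.
    """
    pred = inpainter(image * (1 - mask), mask)
    return float(np.sum(((image - pred) * mask) ** 2))

# An IEM-style segmentation objective would pick the binary mask that makes each
# side of the partition hardest to inpaint from the other, i.e. maximize
#   inpainting_error(image, mask, inpainter) + inpainting_error(image, 1 - mask, inpainter)
```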

Growing Efficient Deep Networks by Structured Continuous Sparsification

no code implementations ICLR 2021 Xin Yuan, Pedro Savarese, Michael Maire

We develop an approach to growing deep network architectures over the course of training, driven by a principled combination of accuracy and sparsity objectives.

Image Classification · Language Modelling +1

Kernel and Rich Regimes in Overparametrized Models

1 code implementation 20 Feb 2020 Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro

We provide a complete and detailed analysis for a family of simple depth-$D$ models that already exhibit an interesting and meaningful transition between the kernel and rich regimes, and we also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
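
One member of this family (written here in my own notation, so the exact parameterization and initialization are assumptions rather than the paper's) is a depth-$D$ "diagonal" model whose initialization scale $\alpha$ controls the regime:

$$f_{w}(x) \;=\; \big\langle\, w_{+}^{\circ D} - w_{-}^{\circ D},\; x \,\big\rangle, \qquad w_{+} = w_{-} = \alpha\,\mathbf{1} \ \text{at initialization},$$

where $\circ D$ denotes the entrywise $D$-th power; large $\alpha$ pushes gradient descent toward the kernel regime, while small $\alpha$ yields the rich (feature-learning) regime.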

Winning the Lottery with Continuous Sparsification

2 code implementations NeurIPS 2020 Pedro Savarese, Hugo Silva, Michael Maire

The recent Lottery Ticket Hypothesis conjectures that, for a typically-sized neural network, it is possible to find small sub-networks which, when trained from scratch on a comparable budget, match the performance of the original dense counterpart.

Network Pruning · Ticket Search +1
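
A minimal sketch of the continuous-sparsification idea behind this ticket search, as I read it: each weight receives a soft gate whose temperature is annealed during training, together with a differentiable sparsity penalty. Class, method, and hyperparameter names below are assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

class SoftMaskedLinear(torch.nn.Module):
    """Linear layer whose weights are gated by a temperature-annealed soft mask."""
    def __init__(self, in_features, out_features, s_init=0.0):
        super().__init__()
        self.weight = torch.nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.s = torch.nn.Parameter(torch.full((out_features, in_features), s_init))
        self.beta = 1.0  # temperature, increased over training to harden the gate

    def forward(self, x):
        mask = torch.sigmoid(self.beta * self.s)   # soft gate in (0, 1)
        return F.linear(x, self.weight * mask)

    def sparsity_penalty(self):
        # differentiable surrogate for the number of surviving weights
        return torch.sigmoid(self.beta * self.s).sum()

# Training loop (schematically): add lambda * layer.sparsity_penalty() to the loss,
# multiply layer.beta by a constant each epoch, and finally keep weights with s > 0.
```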

Domain-independent Dominance of Adaptive Methods

1 code implementation CVPR 2021 Pedro Savarese, David McAllester, Sudarshan Babu, Michael Maire

From a simplified analysis of adaptive methods, we derive AvaGrad, a new optimizer which outperforms SGD on vision tasks when its adaptability is properly tuned.

Image Classification · Language Modelling +1
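
The sketch below illustrates one way to decouple the global learning rate from the adaptability parameter: the Adam-style per-parameter step sizes are renormalized to unit scale before being applied. This is my reading of the decoupling idea rather than the reference AvaGrad implementation; the function name, the omission of bias correction, and the exact normalization are assumptions.

```python
import torch

def adaptive_step(param, grad, state, lr=0.1, betas=(0.9, 0.999), eps=0.1):
    """One Adam-style step with per-parameter rates renormalized to unit scale."""
    m, v = state["m"], state["v"]
    eta = 1.0 / (v.sqrt() + eps)                      # per-parameter step sizes
    eta = eta / (eta.norm() / eta.numel() ** 0.5)     # remove their global scale
    m.mul_(betas[0]).add_(grad, alpha=1 - betas[0])   # momentum on the gradient
    v.mul_(betas[1]).addcmul_(grad, grad, value=1 - betas[1])
    param.data.add_(m * eta, alpha=-lr)

# Usage (schematic): state = {"m": torch.zeros_like(p), "v": torch.zeros_like(p)},
# then call adaptive_step(p, p.grad, state) after each backward pass.
```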

Building a Massive Corpus for Named Entity Recognition using Free Open Data Sources

no code implementations 13 Aug 2019 Daniel Specht Menezes, Pedro Savarese, Ruy Luiz Milidiú

With the recent progress in machine learning, boosted by techniques such as deep learning, many tasks can be successfully solved once a large enough dataset is available for training.

Named Entity Recognition +1

On the Convergence of AdaBound and its Connection to SGD

1 code implementation 13 Aug 2019 Pedro Savarese

Adaptive gradient methods such as Adam have become extremely popular due to their success in training complex neural networks and their lower sensitivity to hyperparameter tuning compared to SGD.

Kernel and Rich Regimes in Overparametrized Models

1 code implementation 13 Jun 2019 Blake Woodworth, Suriya Gunasekar, Pedro Savarese, Edward Moroshko, Itay Golan, Jason Lee, Daniel Soudry, Nathan Srebro

A recent line of work studies overparametrized neural networks in the "kernel regime," i.e., when the network behaves during training as a kernelized linear predictor, so that training with gradient descent has the effect of finding the minimum RKHS norm solution.
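
Concretely, for training data $(x_i, y_i)$, the implicit bias referred to here is toward the minimum-norm interpolant in the RKHS $\mathcal{H}$ of the associated kernel:

$$\min_{f \in \mathcal{H}} \; \|f\|_{\mathcal{H}} \quad \text{subject to} \quad f(x_i) = y_i \ \ \text{for all } i.$$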

How do infinite width bounded norm networks look in function space?

no code implementations 13 Feb 2019 Pedro Savarese, Itay Evron, Daniel Soudry, Nathan Srebro

We consider the question of what functions can be captured by ReLU networks with an unbounded number of units (infinite width), but where the overall network Euclidean norm (the sum of squares of all weights in the system, except for an unregularized bias term for each unit) is bounded; equivalently, what is the minimal norm required to approximate a given function?
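
As a one-dimensional illustration (my notation; the paper's general setting may differ), a width-$k$ ReLU network $f_\theta(x) = \sum_{j=1}^{k} a_j \, [\, w_j x + b_j \,]_+ + c$ has the Euclidean norm described above equal to $\sum_{j} (a_j^2 + w_j^2)$, with the biases $b_j$ and $c$ left unregularized, and the quantity of interest is its infimum over all widths and parameters realizing a given $f$:

$$R(f) \;=\; \inf_{k,\ \theta \,:\, f_\theta = f} \ \sum_{j=1}^{k} \big( a_j^2 + w_j^2 \big).$$

Since each unit can be rescaled ($a_j \to a_j/\lambda$, $w_j \to \lambda w_j$) without changing the computed function, this infimum coincides with $\inf \sum_j 2\,|a_j|\,|w_j|$ at the optimal per-unit rescaling.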
