Search Results for author: Yoram Singer

Found 23 papers, 6 papers with code

Towards Practical Second Order Optimization for Deep Learning

no code implementations • 1 Jan 2021 • Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer

Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent.

Click-Through Rate Prediction Image Classification +4

Paper
Add Code

Scalable Second Order Optimization for Deep Learning

1 code implementation • 20 Feb 2020 • Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer

Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent.

Image Classification Language Modelling +2

32,951

Paper
Code

Proximity Preserving Binary Code using Signed Graph-Cut

no code implementations • 5 Feb 2020 • Inbal Lav, Shai Avidan, Yoram Singer, Yacov Hel-Or

We show that the proposed approximation is superior to the commonly used spectral methods with respect to both accuracy and complexity.

graph partitioning

Paper
Add Code

Memory Efficient Adaptive Optimization

1 code implementation • NeurIPS 2019 • Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer

Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling.

Language Modelling Machine Translation +1

32,952

Paper
Code

Convolutional Bipartite Attractor Networks

no code implementations • 8 Jun 2019 • Michael Iuzzolino, Yoram Singer, Michael C. Mozer

In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence.

Image Denoising Imputation +1

Paper
Add Code

Identity Crisis: Memorization and Generalization under Extreme Overparameterization

no code implementations • ICLR 2020 • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Yoram Singer

We study the interplay between memorization and generalization of overparameterized networks in the extreme case of a single training example and an identity-mapping task.

Memorization

Paper
Add Code

Are All Layers Created Equal?

2 code implementations • ICML Workshop Deep_Phenomen 2019 • Chiyuan Zhang, Samy Bengio, Yoram Singer

Morally, layers of large deep neural networks can be categorized as either "robust" or "critical".

Paper
Code

Exponentiated Gradient Meets Gradient Descent

no code implementations • 5 Feb 2019 • Udaya Ghai, Elad Hazan, Yoram Singer

The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices.

Paper
Add Code

Memory-Efficient Adaptive Optimization

3 code implementations • 30 Jan 2019 • Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer

Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling.

Ranked #31 on Machine Translation on WMT2014 English-French

Language Modelling Machine Translation +1

32,951

Paper
Code

The Well-Tempered Lasso

no code implementations • ICML 2018 • Yuanzhi Li, Yoram Singer

Every regression parameter in the Lasso changes linearly as a function of the regularization value.

regression

Paper
Add Code

The Well Tempered Lasso

no code implementations • 8 Jun 2018 • Yuanzhi Li, Yoram Singer

Every regression parameter in the Lasso changes linearly as a function of the regularization value.

regression

Paper
Add Code

Shampoo: Preconditioned Stochastic Tensor Optimization

3 code implementations • ICML 2018 • Vineet Gupta, Tomer Koren, Yoram Singer

Preconditioned gradient methods are among the most general and powerful tools in optimization.

Stochastic Optimization

173

Paper
Code

Learning a neural response metric for retinal prosthesis

no code implementations • ICLR 2018 • Nishal P Shah, Sasidhar Madugula, EJ Chichilnisky, Yoram Singer, Jonathon Shlens

Retinal prostheses for treating incurable blindness are designed to electrically stimulate surviving retinal neurons, causing them to send artificial visual signals to the brain.

Paper
Add Code

A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization

no code implementations • 20 Jun 2017 • Vineet Gupta, Tomer Koren, Yoram Singer

We describe a framework for deriving and analyzing online optimization algorithms that incorporate adaptive, data-dependent regularization, also termed preconditioning.

Stochastic Optimization

Paper
Add Code

Random Features for Compositional Kernels

no code implementations • 22 Mar 2017 • Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer

We describe and analyze a simple random feature scheme (RFS) from prescribed compositional kernels.

Paper
Add Code

Sketching and Neural Networks

no code implementations • 19 Apr 2016 • Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

In stark contrast, our approach of using improper learning, using a larger hypothesis class allows the sketch size to have a logarithmic dependence on the degree.

Paper
Add Code

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

no code implementations • NeurIPS 2016 • Amit Daniely, Roy Frostig, Yoram Singer

We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning.

Paper
Add Code

Train faster, generalize better: Stability of stochastic gradient descent

no code implementations • 3 Sep 2015 • Moritz Hardt, Benjamin Recht, Yoram Singer

In the non-convex case, we give a new interpretation of common practices in neural networks, and formally show that popular techniques for training large deep models are indeed stability-promoting.

Paper
Add Code

Zero-Shot Learning by Convex Combination of Semantic Embeddings

2 code implementations • 19 Dec 2013 • Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean

In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage.

Ranked #8 on Multi-label zero-shot learning on Open Images V4

Multi-label zero-shot learning

912

Paper
Code

Using Web Co-occurrence Statistics for Improving Image Categorization

no code implementations • 19 Dec 2013 • Samy Bengio, Jeff Dean, Dumitru Erhan, Eugene Ie, Quoc Le, Andrew Rabinovich, Jonathon Shlens, Yoram Singer

Albeit the simplicity of the resulting optimization problem, it is effective in improving both recognition and localization accuracy.

Common Sense Reasoning Image Categorization +1

Paper
Add Code

The Maximum Entropy Relaxation Path

no code implementations • 7 Nov 2013 • Moshe Dubiner, Matan Gavish, Yoram Singer

We show existence and a geometric description of the relaxation path.

Paper
Add Code

Efficient Learning using Forward-Backward Splitting

no code implementations • NeurIPS 2009 • Yoram Singer, John C. Duchi

We derive concrete and very simple algorithms for minimization of loss functions with $\ell_1$, $\ell_2$, $\ell_2^2$, and $\ell_\infty$ regularization.

Paper
Add Code

Group Sparse Coding

no code implementations • NeurIPS 2009 • Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow

Bag-of-words document representations are often used in text, image and video processing.

Computational Efficiency General Classification +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.