Search Results for author: Gregory Diamos

Found 12 papers, 5 papers with code

Empirically Characterizing Overparameterization Impact on Convergence

no code implementations · ICLR 2019 · Newsha Ardalani, Joel Hestness, Gregory Diamos

Long-held conventional wisdom states that larger models train more slowly when using gradient descent.

Coloring Big Graphs with AlphaGoZero

no code implementations · 26 Feb 2019 · Jiayi Huang, Mostofa Patwary, Gregory Diamos

We show that recent innovations in deep reinforcement learning can effectively color very large graphs -- a well-known NP-hard problem with clear commercial applications.
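
The listing does not describe the reinforcement-learning formulation itself; as context for the underlying problem, here is a minimal sketch of classical greedy graph coloring in Python (a standard heuristic shown only for illustration, not the paper's method): each vertex gets the smallest color not already used by its neighbors.

    # Classical greedy graph coloring; illustrates the problem, not the
    # paper's reinforcement-learning approach.
    def greedy_coloring(adj):
        colors = {}
        for v in adj:
            used = {colors[u] for u in adj[v] if u in colors}
            colors[v] = next(c for c in range(len(adj)) if c not in used)
        return colors

    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(greedy_coloring(graph))  # {0: 0, 1: 1, 2: 2, 3: 0}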

Language Modeling at Scale

no code implementations · 23 Oct 2018 · Mostofa Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Gregory Diamos, Kenneth Church

We show how Zipf's Law can be used to scale up language modeling (LM) to take advantage of more training data and more GPUs.

Language Modelling · Machine Translation · +2
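
The snippet cites Zipf's Law without restating it: the frequency of the r-th most common word is roughly proportional to 1/r, so rank × frequency is roughly constant on a large corpus. The toy Python sketch below only illustrates that empirical law, not how the paper uses it to scale training.

    # Toy illustration of Zipf's law (rank * frequency roughly constant on a
    # real corpus); the tiny corpus here is for demonstration only.
    from collections import Counter

    corpus = "the cat sat on the mat and the dog sat on the log".split()
    counts = Counter(corpus)
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    for rank, (word, freq) in enumerate(ranked, start=1):
        print(rank, word, freq, "rank*freq =", rank * freq)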

A Proposed Hierarchy of Deep Learning Tasks

no code implementations · 27 Sep 2018 · Joel Hestness, Sharan Narang, Newsha Ardalani, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou, Gregory Diamos, Kenneth Church

As the pace of deep learning innovation accelerates, it becomes increasingly important to organize the space of problems by relative difficulty.

Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks

no code implementations · 20 Aug 2018 · Sercan O. Arik, Heewoo Jun, Gregory Diamos

We propose the multi-head convolutional neural network (MCNN) architecture for waveform synthesis from spectrograms.

Speech Recognition · +1
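
The MCNN architecture is not detailed in this snippet; to illustrate what spectrogram inversion means, the sketch below recovers a waveform from a magnitude spectrogram with the classical iterative Griffin-Lim algorithm via librosa (assumed installed), the kind of iterative procedure that fast feed-forward models such as MCNN aim to replace.

    # Spectrogram inversion with Griffin-Lim, a classical iterative baseline;
    # shown only to illustrate the task, not the MCNN model.
    import numpy as np
    import librosa

    sr = 22050
    t = np.linspace(0, 1.0, sr, endpoint=False)
    y = 0.5 * np.sin(2 * np.pi * 440 * t)                    # 1-second 440 Hz tone
    S = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))  # magnitude spectrogram
    y_hat = librosa.griffinlim(S, n_iter=60, hop_length=256, n_fft=1024)
    print(y.shape, y_hat.shape)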

Deep Learning Scaling is Predictable, Empirically

no code implementations · 1 Dec 2017 · Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou

As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art.

Language Modelling · Machine Translation · +3
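
The paper's headline observation is that generalization error tends to follow a predictable power law in training set size. As a rough illustration (with made-up numbers, not results from the paper), such a power law error(m) ≈ a · m^(-b) can be fit by a linear regression in log-log space:

    # Fit error(m) ~ a * m**(-b) to (training set size, error) pairs via a
    # linear fit in log-log space. The numbers below are hypothetical.
    import numpy as np

    m = np.array([1e4, 3e4, 1e5, 3e5, 1e6])        # training set sizes
    err = np.array([0.30, 0.24, 0.19, 0.15, 0.12])  # made-up validation errors

    slope, intercept = np.polyfit(np.log(m), np.log(err), 1)
    a, b = np.exp(intercept), -slope
    print(f"error(m) ~ {a:.3f} * m^(-{b:.3f})")
    print("extrapolated error at m=1e7:", a * (1e7) ** (-b))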

Block-Sparse Recurrent Neural Networks

no code implementations · ICLR 2018 · Sharan Narang, Eric Undersander, Gregory Diamos

Even though sparse operations need less compute and memory relative to their dense counterparts, the speed-up observed by using sparse operations is less than expected on different hardware platforms.

Language Modelling · Machine Translation · +3
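
The snippet contrasts theoretical and observed speed-ups from sparsity but does not show what block sparsity looks like; the NumPy sketch below prunes whole blocks of a weight matrix by average magnitude. It only illustrates the block-sparse structure, not the paper's training procedure.

    # Illustrative block-sparse masking: zero out whole blocks of a weight
    # matrix, keeping the highest average-magnitude blocks.
    import numpy as np

    def block_sparsify(W, block=16, sparsity=0.9):
        rows, cols = W.shape
        assert rows % block == 0 and cols % block == 0
        blocks = W.reshape(rows // block, block, cols // block, block)
        scores = np.abs(blocks).mean(axis=(1, 3))      # one score per block
        k = int(sparsity * scores.size)
        threshold = np.sort(scores, axis=None)[k]      # prune lowest-scoring blocks
        mask = (scores >= threshold)[:, None, :, None]
        return (blocks * mask).reshape(rows, cols)

    W = np.random.randn(256, 256)
    W_sparse = block_sparsify(W)
    print("fraction nonzero:", np.count_nonzero(W_sparse) / W_sparse.size)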

Deep Voice 2: Multi-Speaker Neural Text-to-Speech

1 code implementation · NeurIPS 2017 · Sercan Arik, Gregory Diamos, Andrew Gibiansky, John Miller, Kainan Peng, Wei Ping, Jonathan Raiman, Yanqi Zhou

We introduce Deep Voice 2, which is based on a pipeline similar to Deep Voice 1 but constructed from higher-performance building blocks, and which demonstrates a significant audio quality improvement over Deep Voice 1.

Speech Synthesis
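
The snippet does not explain how multiple speakers are supported; a common way to condition a synthesis network on speaker identity is a small learned per-speaker embedding. The PyTorch sketch below is a generic, hypothetical illustration of that idea, not the Deep Voice 2 architecture.

    # Generic illustration of conditioning a synthesis module on a learned
    # per-speaker embedding; not the Deep Voice 2 architecture itself.
    import torch
    import torch.nn as nn

    class TinyMultiSpeakerSynth(nn.Module):
        def __init__(self, n_speakers, text_dim=64, speaker_dim=16, hidden=128):
            super().__init__()
            self.speaker_emb = nn.Embedding(n_speakers, speaker_dim)
            self.rnn = nn.GRU(text_dim + speaker_dim, hidden, batch_first=True)
            self.out = nn.Linear(hidden, 80)  # e.g. 80-bin mel frames

        def forward(self, text_feats, speaker_ids):
            # text_feats: (batch, time, text_dim); speaker_ids: (batch,)
            spk = self.speaker_emb(speaker_ids)
            spk = spk.unsqueeze(1).expand(-1, text_feats.size(1), -1)
            h, _ = self.rnn(torch.cat([text_feats, spk], dim=-1))
            return self.out(h)

    model = TinyMultiSpeakerSynth(n_speakers=4)
    mel = model(torch.randn(2, 10, 64), torch.tensor([0, 3]))
    print(mel.shape)  # torch.Size([2, 10, 80])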

Exploring Sparsity in Recurrent Neural Networks

1 code implementation · 17 Apr 2017 · Sharan Narang, Erich Elsen, Gregory Diamos, Shubho Sengupta

Benchmarks show that with our technique, model size can be reduced by 90% while achieving speed-ups of roughly 2x to 7x.
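
The pruning scheme itself is not spelled out in this snippet; as a generic illustration of magnitude-based weight pruning (the family of techniques involved), the sketch below zeroes out the smallest-magnitude entries of a weight matrix in one shot. The paper's approach prunes gradually during training rather than in a single step, so treat this only as a toy example.

    # One-shot magnitude pruning: remove the smallest-magnitude weights so a
    # target fraction is zeroed. Illustrative only; not the paper's schedule.
    import numpy as np

    def magnitude_prune(W, sparsity=0.9):
        k = int(sparsity * W.size)
        threshold = np.sort(np.abs(W), axis=None)[k]
        return np.where(np.abs(W) >= threshold, W, 0.0)

    W = np.random.randn(512, 512)
    W_pruned = magnitude_prune(W)
    print("nonzero fraction:", np.count_nonzero(W_pruned) / W_pruned.size)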
