Search Results for author: Niki Parmar

Found 19 papers, 12 papers with code

Decoder Denoising Pretraining for Semantic Segmentation

1 code implementation • 23 May 2022 • Emmanuel Brempong Asiedu, Simon Kornblith, Ting Chen, Niki Parmar, Matthias Minderer, Mohammad Norouzi

We propose a decoder pretraining approach based on denoising, which can be combined with supervised pretraining of the encoder.

Denoising Segmentation +1

Paper
Code

Simple and Efficient ways to Improve REALM

no code implementations • EMNLP (MRQA) 2021 • Vidhisha Balachandran, Ashish Vaswani, Yulia Tsvetkov, Niki Parmar

Dense retrieval has been shown to be effective for retrieving relevant documents for Open Domain QA, surpassing popular sparse retrieval methods like BM25.

Retrieval

Paper
Add Code

Scaling Local Self-Attention for Parameter Efficient Visual Backbones

7 code implementations • CVPR 2021 • Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

Self-attention models have recently been shown to have encouraging improvements on accuracy-parameter trade-offs compared to baseline convolutional models such as ResNet-50.

Ranked #212 on Image Classification on ImageNet

Image Classification Instance Segmentation +4

29,789

Paper
Code

Bottleneck Transformers for Visual Recognition

13 code implementations • CVPR 2021 • Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84. 7% top-1 accuracy on the ImageNet benchmark while being up to 1. 64x faster in compute time than the popular EfficientNet models on TPU-v3 hardware.

Ranked #52 on Instance Segmentation on COCO minival

Image Classification Instance Segmentation +3

29,789

Paper
Code

Conformer: Convolution-augmented Transformer for Speech Recognition

24 code implementations • 16 May 2020 • Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs).

Ranked #12 on Speech Recognition on LibriSpeech test-other (using extra training data)

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

10,151

Paper
Code

High Resolution Medical Image Analysis with Spatial Partitioning

1 code implementation • 6 Sep 2019 • Le Hou, Youlong Cheng, Noam Shazeer, Niki Parmar, Yeqing Li, Panagiotis Korfiatis, Travis M. Drucker, Daniel J. Blezek, Xiaodan Song

It is infeasible to train CNN models directly on such high resolution images, because neural activations of a single image do not fit in the memory of a single GPU/TPU, and naive data and model parallelism approaches do not work.

Vocal Bursts Intensity Prediction

1,555

Paper
Code

Stand-Alone Self-Attention in Vision Models

8 code implementations • NeurIPS 2019 • Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya, Jonathon Shlens

The natural question that arises is whether attention can be a stand-alone primitive for vision models instead of serving as just an augmentation on top of convolutions.

object-detection Object Detection

32,835

Paper
Code

Towards a better understanding of Vector Quantized Autoencoders

no code implementations • ICLR 2019 • Aurko Roy, Ashish Vaswani, Niki Parmar, Arvind Neelakantan

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.

Knowledge Distillation Machine Translation +1

Paper
Add Code

Corpora Generation for Grammatical Error Correction

no code implementations • NAACL 2019 • Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, Simon Tong

We provide systematic analysis that compares the two approaches to data generation and highlights the effectiveness of ensembling.

Grammatical Error Correction Machine Translation +1

Paper
Add Code

Mesh-TensorFlow: Deep Learning for Supercomputers

1 code implementation • NeurIPS 2018 • Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model.

Ranked #10 on Language Modelling on One Billion Word

Language Modelling

1,555

Paper
Code

Weakly Supervised Grammatical Error Correction using Iterative Decoding

no code implementations • 31 Oct 2018 • Jared Lichtarge, Christopher Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar

We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext.

Grammatical Error Correction

Paper
Add Code

Theory and Experiments on Vector Quantized Autoencoders

2 code implementations • 28 May 2018 • Aurko Roy, Ashish Vaswani, Arvind Neelakantan, Niki Parmar

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks.

Image Generation Knowledge Distillation +2

Paper
Code

The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

3 code implementations • ACL 2018 • Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Niki Parmar, Mike Schuster, Zhifeng Chen, Yonghui Wu, Macduff Hughes

Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training techniques that are in principle applicable to other seq2seq architectures.

Ranked #26 on Machine Translation on WMT2014 English-French

Machine Translation Translation

2,780

Paper
Code

Tensor2Tensor for Neural Machine Translation

14 code implementations • WS 2018 • Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit

Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.

Machine Translation Translation

14,892

Paper
Code

Fast Decoding in Sequence Models using Discrete Latent Variables

no code implementations • ICML 2018 • Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer

Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.

Machine Translation Translation