
Quantization

267 papers with code · Methodology

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
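
To make the idea concrete, here is a minimal sketch (illustrative only, not code from the cited paper) of uniform affine quantization of a float32 array to int8 and back:

```python
import numpy as np

def quantize_int8(x):
    """Uniformly map a float32 array onto the 256-level int8 grid."""
    scale = (x.max() - x.min()) / 255.0            # width of one quantization step
    zero_point = np.round(-x.min() / scale) - 128  # integer offset representing 0.0
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximation of the original floats."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize_int8(x)
x_hat = dequantize(q, s, z)   # matches x up to rounding error of about scale/2
```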


Greatest papers with code

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

CVPR 2018 tensorflow/models

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes.

QUANTIZATION
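
The scheme underlying the paper represents a real value r as r ≈ S·(q − Z), with a float scale S and an integer zero-point Z, so that matrix products run in integer arithmetic. A minimal sketch of that idea (function and variable names are mine, not the paper's code):

```python
import numpy as np

def int8_matmul(qa, Sa, Za, qb, Sb, Zb, Sc, Zc):
    """Multiply two int8 matrices whose real values are S*(q - Z),
    accumulating in int32 and requantizing to the output scale Sc."""
    acc = (qa.astype(np.int32) - Za) @ (qb.astype(np.int32) - Zb)
    # In a real kernel the float multiplier Sa*Sb/Sc is itself realized as a
    # fixed-point multiply plus bit shift, keeping the whole path integer-only.
    return np.clip(np.round(acc * (Sa * Sb / Sc)) + Zc, -128, 127).astype(np.int8)
```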

FastText.zip: Compressing text classification models

12 Dec 2016 facebookresearch/fastText

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.

QUANTIZATION · TEXT CLASSIFICATION · WORD EMBEDDINGS
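
The compression in FastText.zip rests largely on product quantization of the embedding matrix (combined with vocabulary pruning and hashing). A minimal product-quantization sketch, using scikit-learn's KMeans for the per-subspace codebooks (function names are mine; the real implementation differs):

```python
import numpy as np
from sklearn.cluster import KMeans

def pq_compress(emb, m=4, k=256):
    """Split each embedding into m sub-vectors and k-means each subspace;
    every row is then stored as m one-byte centroid indices."""
    n, d = emb.shape
    sub = d // m
    codebooks, codes = [], []
    for j in range(m):
        km = KMeans(n_clusters=k, n_init=4).fit(emb[:, j * sub:(j + 1) * sub])
        codebooks.append(km.cluster_centers_)
        codes.append(km.labels_.astype(np.uint8))
    return codebooks, np.stack(codes, axis=1)   # list of (k, sub) arrays, (n, m) codes

def pq_decompress(codebooks, codes):
    """Reconstruct approximate embeddings by concatenating looked-up centroids."""
    return np.hstack([cb[codes[:, j]] for j, cb in enumerate(codebooks)])
```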

What Do Compressed Deep Neural Networks Forget?

13 Nov 2019 google-research/google-research

However, top-line test-set accuracy conceals significant differences in how different classes and images are impacted by model compression techniques.

FAIRNESS · INTERPRETABILITY TECHNIQUES FOR DEEP LEARNING · MODEL COMPRESSION · NETWORK PRUNING · OUTLIER DETECTION · QUANTIZATION
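
In the spirit of the paper's analysis, a hypothetical way to surface the class-level damage that a single aggregate number hides is to compare per-class accuracy before and after compression (sketch; names are mine):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    """Accuracy broken out per class instead of one aggregate number."""
    return np.array([(y_pred[y_true == c] == c).mean() for c in range(n_classes)])

# Classes with the largest drop are the ones compression quietly sacrifices:
# drop = per_class_accuracy(y, full_preds, C) - per_class_accuracy(y, compressed_preds, C)
```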

Link and code: Fast indexing with graphs and compact regression codes

CVPR 2018 facebookresearch/faiss

Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.

IMAGE SIMILARITY SEARCH · QUANTIZATION
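
The method is implemented in faiss; the sketch below exercises only the graph (HNSW) side of the library, without the paper's compact regression codes on top:

```python
import faiss
import numpy as np

d = 64
xb = np.random.rand(10_000, d).astype('float32')  # database vectors
xq = np.random.rand(5, d).astype('float32')       # query vectors

index = faiss.IndexHNSWFlat(d, 32)   # graph-based index, 32 links per node
index.hnsw.efSearch = 64             # trade accuracy for speed at query time
index.add(xb)
D, I = index.search(xq, 10)          # distances and ids of the 10 nearest neighbors
```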

Billion-scale similarity search with GPUs

28 Feb 2017 facebookresearch/faiss

Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures.

IMAGE SIMILARITY SEARCH · QUANTIZATION
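
A minimal usage sketch of the accompanying faiss library; moving an index to the GPU requires the faiss-gpu build:

```python
import faiss
import numpy as np

d = 64
xb = np.random.rand(1_000_000, d).astype('float32')  # database vectors
xq = np.random.rand(10, d).astype('float32')         # queries

cpu_index = faiss.IndexFlatL2(d)                     # exact L2 search
res = faiss.StandardGpuResources()                   # single-GPU resources
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)
gpu_index.add(xb)
D, I = gpu_index.search(xq, 5)                       # distances, neighbor ids
```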

Polysemous codes

7 Sep 2016 facebookresearch/faiss

This paper considers the problem of approximate nearest neighbor search in the compressed domain.

QUANTIZATION
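
The key property of polysemous codes is that one code works both as a binary code (for cheap Hamming-distance filtering) and as a product-quantization code (for accurate distance estimation). A sketch of the resulting two-stage search, with exact re-ranking standing in for the PQ distance (names are mine):

```python
import numpy as np

def hamming(a, b):
    """Bitwise Hamming distance between uint8 code arrays (broadcasts)."""
    return np.unpackbits(a ^ b, axis=-1).sum(axis=-1)

def two_stage_search(query_code, db_codes, db_vectors, query, t, k):
    """Filter with cheap Hamming comparisons, then rank survivors exactly.
    (The paper additionally optimizes the index assignment so that Hamming
    distance on the codes tracks the true distance.)"""
    keep = np.where(hamming(query_code, db_codes) <= t)[0]   # cheap filter
    d = np.linalg.norm(db_vectors[keep] - query, axis=1)     # costly refinement
    return keep[np.argsort(d)[:k]]
```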

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

NeurIPS 2020 pytorch/fairseq

We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler.

 Ranked #1 on Speech Recognition on TIMIT (using extra training data)

QUANTIZATION · SELF-SUPERVISED LEARNING · SPEECH RECOGNITION
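
wav2vec 2.0 discretizes latent speech representations by selecting codebook entries with a Gumbel-softmax. A minimal PyTorch sketch of one such quantizer (the model uses several codebook groups and anneals the temperature; names are mine):

```python
import torch
import torch.nn.functional as F

class GumbelCodebook(torch.nn.Module):
    """Sketch of a Gumbel-softmax vector quantizer (single codebook group)."""
    def __init__(self, dim, num_codes=320):
        super().__init__()
        self.logits = torch.nn.Linear(dim, num_codes)   # code-selection logits
        self.codebook = torch.nn.Parameter(torch.randn(num_codes, dim))

    def forward(self, z, tau=2.0):
        # Hard one-hot selection in the forward pass, differentiable
        # through the Gumbel-softmax relaxation in the backward pass.
        onehot = F.gumbel_softmax(self.logits(z), tau=tau, hard=True)
        return onehot @ self.codebook                   # quantized latent
```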

Training with Quantization Noise for Extreme Model Compression

ICLR 2021 pytorch/fairseq

A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.

IMAGE CLASSIFICATION · MODEL COMPRESSION · QUANTIZATION
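
The paper's alternative, Quant-Noise, quantizes only a random subset of weights at each forward pass, so most weights still see their exact values during training. A minimal PyTorch sketch, assuming simple uniform weight quantization (names are mine):

```python
import torch

def quant_noise(w, p=0.1, bits=8):
    """Apply quantization noise to a random fraction p of the weights."""
    scale = w.detach().abs().max() / (2 ** (bits - 1) - 1)
    w_q = torch.clamp(torch.round(w / scale),
                      -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    mask = (torch.rand_like(w) < p).float()   # weights that get quantized this pass
    # (w_q - w).detach() hides the rounding from autograd, so gradients
    # flow straight through to the full-precision weights everywhere.
    return w + mask * (w_q - w).detach()
```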

Trained Ternary Quantization

4 Dec 2016 tensorpack/tensorpack

We propose Trained Ternary Quantization (TTQ), a method that can reduce the precision of weights in neural networks to ternary values.

QUANTIZATION
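
A minimal PyTorch sketch of a TTQ-style forward pass, in which the positive and negative scales wp and wn are trained parameters (threshold heuristic assumed; names are mine):

```python
import torch

def ttq_forward(w, wp, wn, t=0.05):
    """Map full-precision weights to {-wn, 0, +wp} with trained scales."""
    delta = t * w.abs().max()            # sparsity threshold on |w|
    w_t = wp * (w > delta).float() - wn * (w < -delta).float()
    # Forward pass uses the ternary weights; gradients reach wp/wn through
    # w_t and reach the latent weights w straight-through via the second term.
    return w_t + (w - w.detach())

# usage sketch:
# wp = torch.nn.Parameter(torch.ones(())); wn = torch.nn.Parameter(torch.ones(()))
```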

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

28 Jan 2019 NervanaSystems/distiller

The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.

LANGUAGE MODELLING · NEURAL NETWORK COMPRESSION · QUANTIZATION
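
Outlier channel splitting preserves the network's function exactly: an outlier-heavy channel is duplicated and both copies are halved, shrinking the dynamic range the quantization grid must cover. A minimal sketch for one linear layer (names are mine):

```python
import numpy as np

def split_channel(W, x, c):
    """Duplicate input channel c of weight matrix W (and of the input x),
    halving both weight copies; W2 @ x2 == W @ x, but the largest weight
    magnitude can shrink, so a uniform int8 grid fits the values better."""
    W2 = np.concatenate([W, W[:, c:c + 1]], axis=1)
    W2[:, c] *= 0.5
    W2[:, -1] *= 0.5
    x2 = np.concatenate([x, x[c:c + 1]])
    return W2, x2
```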