Quantization

1032 papers with code • 10 benchmarks • 18 datasets

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
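
For a concrete picture of what this replacement looks like, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization; the function names and the max-based scaling are illustrative choices, not taken from the paper above.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization of float32 values to int8."""
    scale = max(np.abs(x).max() / 127.0, 1e-12)   # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(x)
print("max abs error:", np.abs(x - dequantize(q, scale)).max())
```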

Most implemented papers

Polysemous codes

facebookresearch/faiss 7 Sep 2016

This paper considers the problem of approximate nearest neighbor search in the compressed domain.
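
Polysemous codes make product-quantization codes double as binary codes, so most candidates can be filtered with cheap Hamming comparisons before exact PQ distances are computed. The sketch below follows faiss's documented IndexPQ interface for this; the random data and the Hamming threshold are illustrative.

```python
import faiss
import numpy as np

d = 64
xt = np.random.rand(10000, d).astype("float32")   # training vectors
xb = np.random.rand(100000, d).astype("float32")  # database vectors
xq = np.random.rand(5, d).astype("float32")       # queries

index = faiss.IndexPQ(d, 16, 8)                   # 16 sub-quantizers, 8 bits each
index.do_polysemous_training = True               # learn codes that also work as binary codes
index.train(xt)
index.add(xb)

index.search_type = faiss.IndexPQ.ST_polysemous   # filter with Hamming distance first
index.polysemous_ht = 54                          # Hamming threshold (illustrative)
D, I = index.search(xq, 10)                       # top-10 neighbors per query
```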

Learned Step Size Quantization

zhutmost/lsq-net ICLR 2020

Deep networks run with low-precision operations at inference time offer power and space advantages over high-precision alternatives, but they must overcome the challenge of maintaining high accuracy as precision decreases.
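
As a rough illustration of the paper's core idea, the PyTorch sketch below fake-quantizes a weight tensor with a learnable step size, using a straight-through estimator for rounding and the gradient scale and initialization described in the paper; the names and the 8-bit range are illustrative.

```python
import torch

def grad_scale(x, scale):
    # forward: x unchanged; backward: gradient multiplied by `scale`
    return (x - x * scale).detach() + x * scale

def round_ste(x):
    # forward: round(x); backward: identity (straight-through estimator)
    return (x.round() - x).detach() + x

def lsq_quantize(w, step, qn=-128, qp=127):
    """Fake-quantize w with a learnable step size, following LSQ (simplified)."""
    g = 1.0 / (w.numel() * qp) ** 0.5              # per-layer gradient scale from the paper
    s = grad_scale(step, g)
    return round_ste(torch.clamp(w / s, qn, qp)) * s

w = torch.randn(256, 256)
step = torch.nn.Parameter(w.abs().mean() * 2 / 127 ** 0.5)  # paper's initialization
w_q = lsq_quantize(w, step)
w_q.sum().backward()
print(step.grad is not None)  # the step size is trained jointly with the weights
```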

Improvements to Target-Based 3D LiDAR to Camera Calibration

UMich-BipedLab/extrinsic_lidar_camera_calibration 7 Oct 2019

The homogeneous transformation between a LiDAR and monocular camera is required for sensor fusion tasks, such as SLAM.

YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

meituan/yolov6 7 Sep 2022

The YOLO community has grown tremendously, extending its use across a multitude of hardware platforms and application scenarios.

Link and code: Fast indexing with graphs and compact regression codes

facebookresearch/faiss CVPR 2018

Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside their memory requirements.

Unsupervised Cross-lingual Representation Learning for Speech Recognition

huggingface/transformers 24 Jun 2020

This paper presents XLSR, which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.
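
Since the entry points to huggingface/transformers, here is a small sketch of loading the released facebook/wav2vec2-large-xlsr-53 checkpoint and extracting representations from raw waveform; the one second of random audio stands in for real 16 kHz speech.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-xlsr-53")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-xlsr-53")

waveform = torch.randn(16000).numpy()           # stand-in for 1 s of 16 kHz speech
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (1, frames, hidden) speech representations
print(hidden.shape)
```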

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

bytedance/qraf 10 Mar 2023

In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that utilizes a univariate quantization regulator a to achieve wide-range variable rates within a single model.
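
Reading from the abstract alone, the regulator a appears to rescale the latent before rounding so that a single scalar trades rate against distortion. The sketch below is a guess at that mechanism, with a straight-through estimator for training; it is not the paper's exact formulation.

```python
import torch

def regulated_quantize(y, a):
    """Scale latents by regulator a, round, and rescale (STE for training)."""
    z = y * a
    z_hat = (z.round() - z).detach() + z      # straight-through rounding
    return z_hat / a

y = torch.randn(1, 192, 16, 16)               # latent from an encoder (illustrative shape)
for a in (0.5, 1.0, 4.0):                     # larger a -> finer quantization, higher rate
    err = (regulated_quantize(y, torch.tensor(a)) - y).abs().mean().item()
    print(f"a={a}: mean |error| = {err:.4f}")
```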

Model compression via distillation and quantization

antspy/quantized_distillation ICLR 2018

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning.
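
A minimal sketch of combining the two ingredients in the title: the student's forward pass uses fake-quantized weights (via a straight-through estimator) while the loss distills the teacher's softened outputs. The temperature, bit-width, and tiny linear models are illustrative, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def quantize_weights(w, bits=8):
    """Uniform symmetric fake quantization with a straight-through estimator."""
    qp = 2 ** (bits - 1) - 1
    s = w.abs().max() / qp
    q = torch.clamp((w / s).round(), -qp, qp) * s
    return (q - w).detach() + w               # forward quantized, backward identity

teacher = torch.nn.Linear(32, 10)
student = torch.nn.Linear(32, 10)
opt = torch.optim.SGD(student.parameters(), lr=0.1)

x = torch.randn(64, 32)
with torch.no_grad():
    t_logits = teacher(x)

# distillation step: the student runs with quantized weights, but the
# gradient updates its full-precision copy
s_logits = F.linear(x, quantize_weights(student.weight), student.bias)
loss = F.kl_div(F.log_softmax(s_logits / 2.0, dim=-1),
                F.softmax(t_logits / 2.0, dim=-1),
                reduction="batchmean")
loss.backward()
opt.step()
```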

Data-Free Quantization Through Weight Equalization and Bias Correction

jakc4103/DFQ ICCV 2019

The method improves quantization accuracy and can be applied to many common computer vision architectures with a straightforward API call.
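
The equalization step at the heart of the method rescales pairs of consecutive layers, channel by channel, without changing the network's function, because ReLU is positively homogeneous. Below is a NumPy sketch of that identity on a linear→ReLU→linear pair; the per-channel range statistics and names are illustrative.

```python
import numpy as np

def equalize_pair(W1, b1, W2):
    """Cross-layer weight equalization for Linear(W1) -> ReLU -> Linear(W2).

    Rescales per channel so both layers end up with similar per-channel
    weight ranges; the function is unchanged since ReLU(s*x) = s*ReLU(x)
    for s > 0.
    """
    r1 = np.abs(W1).max(axis=1)                # output-channel ranges of layer 1
    r2 = np.abs(W2).max(axis=0)                # input-channel ranges of layer 2
    s = np.sqrt(r1 * r2) / r2                  # per-channel equalization scales
    return W1 / s[:, None], b1 / s, W2 * s[None, :]

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2 = rng.normal(size=(4, 16))
W1e, b1e, W2e = equalize_pair(W1, b1, W2)

x = rng.normal(size=(8,))
y  = W2  @ np.maximum(W1  @ x + b1,  0)
ye = W2e @ np.maximum(W1e @ x + b1e, 0)
print(np.allclose(y, ye))                      # True: the network function is preserved
```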