Quantization
1032 papers with code • 10 benchmarks • 18 datasets
Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).
Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
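To make the definition above concrete, here is a minimal sketch (not taken from any of the listed papers) of symmetric linear quantization of a float32 array to int8 in NumPy; the helper names `quantize_int8`/`dequantize` and the per-tensor scale are illustrative assumptions, not a specific library API.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric linear quantization of a float32 tensor to int8.

    Returns the int8 tensor and the scale needed to dequantize.
    (Illustrative sketch; real systems often use per-channel scales
    and calibrated clipping ranges.)
    """
    scale = np.max(np.abs(x)) / 127.0 if np.any(x) else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale

# Example: the int8 copy of the weights is ~4x smaller than float32,
# at the cost of a small rounding error.
w = np.random.randn(256, 256).astype(np.float32)
w_q, s = quantize_int8(w)
print("max abs error:", np.max(np.abs(dequantize(w_q, s) - w)))
```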
Libraries
Use these libraries to find Quantization models and implementations.
Latest papers with no code
EdgeFusion: On-Device Text-to-Image Generation
The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application.
Privacy-Preserving UCB Decision Process Verification via zk-SNARKs
With the increasingly widespread application of machine learning, balancing the privacy of data and algorithm parameters against the verifiability of machine learning has long been a challenge.
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Transformer models have been successful in various sequence processing tasks, but the self-attention mechanism's computational cost limits its practicality for long sequences.
Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems
Simulating the dynamics of open quantum systems coupled to non-Markovian environments remains an outstanding challenge due to exponentially scaling computational costs.
QGen: On the Ability to Generalize in Quantization Aware Training
In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications for model performance.
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks.
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
Inductive biases are crucial in disentangled representation learning for narrowing down an underspecified solution set.
Efficient and accurate neural field reconstruction using resistive memory
The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.
TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models
Diffusion models have emerged as preeminent contenders in the realm of generative models.
Quantization of Large Language Models with an Overdetermined Basis
In this paper, we introduce an algorithm for data quantization based on the principles of Kashin representation.