Quantization
1046 papers with code • 10 benchmarks • 18 datasets
Quantization is a promising technique to reduce the computation cost of neural network training and inference by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).
Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
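As a concrete illustration of the idea above, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. The function names and the choice of a symmetric scheme are illustrative, not taken from any paper listed on this page.

```python
import numpy as np

def quantize_int8(x):
    # Symmetric per-tensor quantization: map floats onto the int8 grid.
    # The scale maps the largest magnitude in the tensor to +/-127.
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floats.
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 1.27, 0.0], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)  # close to x, up to rounding error of one scale step
```

The storage saving is 4x (int8 vs. float32); the price is a rounding error bounded by half the scale step, which is why per-tensor or per-channel scale choices matter in practice.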
Libraries
Use these libraries to find Quantization models and implementations.
Latest papers with no code
Latency-Distortion Tradeoffs in Communicating Classification Results over Noisy Channels
Our results show an interesting interplay between source distortion (i.e., distortion of the probability vector measured via f-divergence) and the subsequent channel encoding/decoding parameters, and indicate that jointly designing these parameters is crucial for navigating the latency-distortion tradeoff.
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Compared to other methods that are generally designed to be run on a pretrained network, AdaQAT works well in both training from scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets using ResNet20 and ResNet18 models, respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
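For readers unfamiliar with quantization-aware training, the core mechanism is a fake-quantization step: the forward pass sees quantized values while float "shadow" weights keep receiving gradient updates via the straight-through estimator. The sketch below is a generic illustration of that idea, not AdaQAT's adaptive bit-width method; the function name and the symmetric per-tensor scheme are assumptions.

```python
import numpy as np

def fake_quant(w, bits=8):
    # Forward pass of QAT: snap weights to a symmetric grid with
    # 2^(bits-1) - 1 positive levels, then dequantize so downstream
    # ops still operate on floats.
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

# Backward pass (straight-through estimator, not shown): gradients are
# passed through fake_quant as if it were the identity, so the float
# weights continue to train while the network sees quantized values.
w = np.array([0.7, -0.2, 0.1], dtype=np.float32)
w_q = fake_quant(w, bits=4)  # 4-bit grid: here the levels are multiples of ~0.1
```

Mixed-precision approaches such as the one described above additionally learn or search the `bits` parameter per layer instead of fixing it globally.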
FedMPQ: Secure and Communication-Efficient Federated Learning with Multi-codebook Product Quantization
In federated learning, particularly in cross-device scenarios, secure aggregation has recently gained popularity as it effectively defends against inference attacks by malicious aggregators.
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression
This paper investigates the challenging problem of learned image compression (LIC) with extreme low bitrates.
EdgeFusion: On-Device Text-to-Image Generation
The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application.
Privacy-Preserving UCB Decision Process Verification via zk-SNARKs
With the increasingly widespread application of machine learning, how to strike a balance between protecting the privacy of data and algorithm parameters and ensuring the verifiability of machine learning has always been a challenge.
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Transformer models have been successful in various sequence processing tasks, but the self-attention mechanism's computational cost limits its practicality for long sequences.
Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems
Simulating the dynamics of open quantum systems coupled to non-Markovian environments remains an outstanding challenge due to exponentially scaling computational costs.
QGen: On the Ability to Generalize in Quantization Aware Training
In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications on model performance.
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks.