Quantization
1039 papers with code • 10 benchmarks • 18 datasets
Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).
Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
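To make the idea concrete, below is a minimal NumPy sketch of symmetric per-tensor int8 quantization; the function names and the per-tensor scheme are illustrative assumptions, not the method of the cited paper.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization of a float32 array to int8."""
    scale = max(float(np.max(np.abs(x))) / 127.0, 1e-12)   # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)               # toy weight tensor
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())                  # error is bounded by about scale/2
```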
Latest papers
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
This exploration can reveal new insights and challenges for low-bit quantization of LLaMA3 and other forthcoming LLMs, especially in addressing the performance degradation commonly suffered in LLM compression.
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA based Mixture of Experts
Unlike other LoRA-based MoE methods, MixLoRA enhances model performance by using independently configurable attention-layer LoRA adapters, supporting LoRA and its variants for constructing experts, and applying an auxiliary load-balancing loss to address the router's imbalance problem.
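The sentence above mentions two generic ingredients: LoRA adapters acting as experts behind a router, and an auxiliary load-balancing loss. A minimal NumPy sketch of those two pieces follows, assuming a Switch-Transformer-style balance loss and made-up dimensions; it does not reproduce MixLoRA's actual architecture or its independently configurable attention-layer adapters.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_experts, top_k = 64, 8, 4, 2             # hidden size, LoRA rank, experts, routed experts

W0 = rng.normal(size=(d, d)) * 0.02              # frozen base weight
A = rng.normal(size=(n_experts, r, d)) * 0.02    # LoRA "down" matrices, one per expert
B = np.zeros((n_experts, d, r))                  # LoRA "up" matrices (zero-initialized)
Wg = rng.normal(size=(d, n_experts)) * 0.02      # router weights

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_lora_forward(x):
    """x: (batch, d). Route each token to top-k LoRA experts on top of the frozen W0."""
    gate = softmax(x @ Wg)                        # (batch, n_experts) routing probabilities
    topk = np.argsort(-gate, axis=1)[:, :top_k]   # indices of the selected experts
    y = x @ W0.T
    for b in range(x.shape[0]):
        for e in topk[b]:
            y[b] += gate[b, e] * (B[e] @ (A[e] @ x[b]))
    # Auxiliary load-balance loss (Switch-Transformer style): penalize uneven expert usage.
    frac_tokens = np.bincount(topk.ravel(), minlength=n_experts) / topk.size
    frac_prob = gate.mean(axis=0)
    aux_loss = n_experts * float(np.dot(frac_tokens, frac_prob))
    return y, aux_loss

y, aux = moe_lora_forward(rng.normal(size=(3, d)))
print(y.shape, aux)
```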
MAexp: A Generic Platform for RL-based Multi-Agent Exploration
The sim-to-real gap poses a significant challenge in RL-based multi-agent exploration due to scene quantization and action discretization.
decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points
However, existing quantization schemes suffer significant accuracy degradation at very low bit-widths or incur additional computational overhead when deployed, making them difficult to apply to large-scale industrial applications.
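For reference, a minimal sketch of the underlying decomposition is shown below: a weight tensor is split into an integer grid and a floating-point scale/zero-point, as in standard min/max uniform quantization. decoupleQ's actual optimization of the integer and floating-point parts is not reproduced; function names are illustrative.

```python
import numpy as np

def uniform_quantize(w: np.ndarray, bits: int = 2):
    """Decompose w into an integer grid q (2**bits levels) and floating-point (scale, zero).

    Reconstruction: w_hat = scale * (q - zero). This is plain min/max uniform
    quantization, not decoupleQ's optimization procedure.
    """
    levels = 2 ** bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / levels
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale) + zero, 0, levels).astype(np.int64)
    return q, scale, zero

w = np.random.randn(256).astype(np.float32)
q, scale, zero = uniform_quantize(w, bits=2)
w_hat = scale * (q - zero)
print("levels used:", np.unique(q), "MSE:", float(np.mean((w - w_hat) ** 2)))
```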
Variational quantization for state space models
The main challenge is to model a rich variety of time series, leverage any available external signals and provide sharp predictions with statistical guarantees.
Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators
Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including the weight quantization strategy (i.e., data types and bit-widths) and the mapping (i.e., placement and scheduling of DNN elementary operations on the hardware units of the accelerator).
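The bit-width side of that design space is easy to illustrate with a back-of-the-envelope weight-memory calculation; the layer shape below is a made-up example, and the mapping (placement/scheduling) factors are not modeled.

```python
# Weight memory footprint of a conv layer for different bit-width choices.
out_ch, in_ch, k = 64, 64, 3
n_weights = out_ch * in_ch * k * k

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {n_weights * bits / 8 / 1024:.1f} KiB")
```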
David and Goliath: An Empirical Evaluation of Attacks and Defenses for QNNs at the Deep Edge
To fill this gap, we empirically evaluate the effectiveness of attacks and defenses from (full-precision) ANNs on (constrained) QNNs.
BinaryDM: Towards Accurate Binarization of Diffusion Model
With the advancement of diffusion models (DMs) and their substantially increased computational requirements, quantization emerges as a practical solution for obtaining compact and efficient low-bit DMs.
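As a point of reference, the classic XNOR-Net-style weight binarization (the sign of the weights plus a per-channel scale) can be sketched as follows; BinaryDM's diffusion-specific techniques are not reproduced here.

```python
import numpy as np

def binarize(w: np.ndarray):
    """1-bit weights: sign(w) with a per-output-channel scale alpha = mean(|w|)."""
    alpha = np.mean(np.abs(w), axis=1, keepdims=True)    # (out_channels, 1)
    b = np.where(w >= 0, 1.0, -1.0)
    return b, alpha

w = np.random.randn(8, 32).astype(np.float32)            # (out_channels, in_features)
b, alpha = binarize(w)
w_hat = alpha * b                                        # dequantized approximation
print("unique values:", np.unique(b), "MSE:", float(np.mean((w - w_hat) ** 2)))
```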
Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging
Model merging is a promising lightweight model empowerment technique that does not rely on expensive computing devices (e.g., GPUs) or require the collection of specific training data.
Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval
Deep quantization methods have shown high efficiency on large-scale image retrieval.
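A generic sketch of the underlying idea, quantizing L2-normalized embeddings to their most similar codewords by cosine similarity, is shown below; the codebook here is random, whereas the cited method learns it with weak supervision.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_codes = 16, 8

# Codebook of unit-norm codewords on the hypersphere (randomly initialized here).
C = rng.normal(size=(n_codes, d))
C /= np.linalg.norm(C, axis=1, keepdims=True)

def quantize_on_sphere(x: np.ndarray) -> np.ndarray:
    """L2-normalize embeddings and assign each to its most similar codeword."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return np.argmax(x @ C.T, axis=1)            # cosine similarity of unit vectors

codes = quantize_on_sphere(rng.normal(size=(5, d)))
print(codes)                                     # each embedding stored as a small integer code
```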