Model Compression
342 papers with code • 2 benchmarks • 4 datasets
Model Compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed for reducing the size of deep networks.
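Two of the techniques named above can be illustrated in a few lines. The sketch below shows unstructured magnitude pruning (zeroing the smallest-magnitude weights) and symmetric uniform weight quantization; function names and the NumPy formulation are illustrative, not drawn from any specific paper.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def uniform_quantize(weights, num_bits=8):
    """Symmetric uniform quantization to num_bits; returns de-quantized
    ("fake-quantized") values so the rounding error is visible directly."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    if scale == 0:
        return weights.copy()
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale
```

Pruning stores a sparse weight set; quantization trades precision for bit-width. In practice the two are often combined, and the network is fine-tuned afterwards to recover accuracy.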
Latest papers with no code
Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations
Model compression is therefore important for retaining the performance of larger models while reducing the cost of running them.
Instance-Aware Group Quantization for Vision Transformers
In particular, the distribution of activations for each channel varies drastically with the input instance, making PTQ methods designed for CNNs inappropriate for ViTs.
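The observation above, that per-channel activation statistics shift from one input to the next, motivates computing quantization parameters at runtime over groups of channels rather than fixing one static scale. A minimal NumPy sketch of dynamic per-group activation quantization follows; this is an illustration of the general idea, not the algorithm from the paper.

```python
import numpy as np

def group_quantize_activations(x, num_groups=4, num_bits=8):
    """Quantize activations x of shape (channels, features).
    Channels are split into groups, and each group gets its own scale
    computed from the *current* input, so scales track the instance."""
    qmax = 2 ** (num_bits - 1) - 1
    out = np.empty_like(x, dtype=np.float64)
    for g in np.array_split(np.arange(x.shape[0]), num_groups):
        scale = np.max(np.abs(x[g])) / qmax
        if scale == 0:
            out[g] = 0.0
            continue
        out[g] = np.clip(np.round(x[g] / scale), -qmax, qmax) * scale
    return out
```

With a single global scale, small-magnitude channels would be crushed to zero whenever another channel produces a large outlier; per-group scales limit that damage.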
Dense Vision Transformer Compression with Few Samples
In particular, traditional few-shot CNN methods suffer from sparse compression: they can only produce a very small number of compressed models of different sizes.
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation
Moreover, we propose a method that allows the transfer of modules between incompatible PLMs without any change in inference complexity.
Chain of Compression: A Systematic Approach to Combinationally Compress Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved significant popularity, but their computational and memory intensity poses challenges for resource-constrained computing systems, particularly when real-time performance is required.
Magic for the Age of Quantized DNNs
Recently, the number of parameters in DNNs has increased explosively, as exemplified by Large Language Models (LLMs), making inference on small-scale computers more difficult.
Advancing IIoT with Over-the-Air Federated Learning: The Role of Iterative Magnitude Pruning
Targeting the notion of compact yet robust DNN models, we propose the integration of iterative magnitude pruning (IMP) of the DNN model being trained in an over-the-air FL (OTA-FL) environment for IIoT.
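Iterative magnitude pruning, mentioned above, removes a fraction of the smallest-magnitude remaining weights each round, optionally retraining in between, so that the pruning mask accumulates toward a target sparsity. A hedged sketch (the schedule and the `retrain` hook are illustrative assumptions, not the paper's OTA-FL procedure):

```python
import numpy as np

def iterative_magnitude_pruning(weights, target_sparsity=0.9, rounds=5, retrain=None):
    """Sketch of IMP: each round prunes a fixed fraction of the surviving
    weights, then optionally fine-tunes; the mask accumulates across rounds."""
    mask = np.ones_like(weights, dtype=bool)
    # Per-round keep ratio chosen so keep**rounds == 1 - target_sparsity.
    per_round_keep = (1.0 - target_sparsity) ** (1.0 / rounds)
    for _ in range(rounds):
        alive = np.abs(weights[mask])
        k = int(len(alive) * (1.0 - per_round_keep))
        if k > 0:
            threshold = np.partition(alive, k - 1)[k - 1]
            mask &= np.abs(weights) > threshold
        weights = weights * mask
        if retrain is not None:
            weights = retrain(weights, mask)  # fine-tune surviving weights only
    return weights, mask
```

Pruning gradually over several rounds, rather than all at once, is what typically lets the network recover accuracy between pruning steps.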
DiPaCo: Distributed Path Composition
Progress in machine learning (ML) has been fueled by scaling neural network models.
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Consequently, a simple combination of the two cannot guarantee both training efficiency and inference efficiency at minimal cost.
BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction
To better extract the joint features of heterogeneous EEG data as well as enhance classification accuracy, BRIEDGE introduces an informer-based ProbSparse self-attention mechanism.