Model Compression

95 papers with code · Methodology

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
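
As a rough illustration of one of the methods named above, the sketch below applies magnitude-based parameter pruning to a PyTorch model. The 50% sparsity target and the choice of layers are illustrative assumptions, not taken from any paper listed on this page.

```python
# Minimal sketch of magnitude-based parameter pruning in PyTorch.
# Sparsity level and layer selection are illustrative assumptions.
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights in each Linear/Conv2d layer."""
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            weight = module.weight.data
            k = int(weight.numel() * sparsity)
            if k == 0:
                continue
            # The k-th smallest |weight| becomes the pruning threshold.
            threshold = weight.abs().flatten().kthvalue(k).values
            weight.mul_((weight.abs() > threshold).to(weight.dtype))

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, sparsity=0.5)  # roughly half the weights are now zero
```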

Benchmarks

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Greatest papers with code

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

ICLR 2020 · google-research/bert

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training.

LANGUAGE MODELLING · MODEL COMPRESSION · SENTIMENT ANALYSIS

What Do Compressed Deep Neural Networks Forget?

13 Nov 2019 · google-research/google-research

However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques.

FAIRNESS · INTERPRETABILITY TECHNIQUES FOR DEEP LEARNING · MODEL COMPRESSION · NETWORK PRUNING · OUTLIER DETECTION · QUANTIZATION

The State of Sparsity in Deep Neural Networks

25 Feb 2019 · google-research/google-research

We rigorously evaluate three state-of-the-art techniques for inducing sparsity in deep neural networks on two large-scale learning tasks: Transformer trained on WMT 2014 English-to-German, and ResNet-50 trained on ImageNet.

MODEL COMPRESSION · SPARSE LEARNING

Training with Quantization Noise for Extreme Model Compression

ICLR 2021 · pytorch/fairseq

A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.

IMAGE CLASSIFICATION · MODEL COMPRESSION · QUANTIZATION
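
As a sketch of the Quantization Aware Training mechanism this abstract describes, the snippet below fake-quantizes weights in the forward pass and applies the Straight-Through Estimator in the backward pass. The 8-bit symmetric per-tensor scheme is an illustrative assumption.

```python
# Sketch of the fake-quantization step used in Quantization Aware Training.
# The 8-bit symmetric per-tensor scaling is an illustrative assumption.
import torch

class FakeQuantSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, num_bits=8):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max() / qmax          # per-tensor symmetric scale
        # Quantize to an integer grid, then dequantize back to floats.
        return torch.round(w / scale).clamp(-qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-Through Estimator: treat round() as the identity,
        # so gradients flow to the full-precision weights unchanged.
        return grad_output, None

w = torch.randn(4, 4, requires_grad=True)
w_q = FakeQuantSTE.apply(w)  # quantized weights used in the forward pass
w_q.sum().backward()         # w.grad is all ones, as if no rounding occurred
```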

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

24 Feb 2016 · pytorch/vision

(2) Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car.

IMAGE CLASSIFICATION · MODEL COMPRESSION

Model compression via distillation and quantization

ICLR 2018 · NervanaSystems/distiller

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning.

MODEL COMPRESSION · QUANTIZATION

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

ECCV 2018 · NervanaSystems/distiller

Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.

MODEL COMPRESSION · NEURAL ARCHITECTURE SEARCH

Contrastive Representation Distillation

ICLR 2020 · HobbitLong/RepDistiller

We demonstrate that this objective ignores important structural knowledge of the teacher network.

CONTRASTIVE LEARNING · MODEL COMPRESSION · TRANSFER LEARNING

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

13 Apr 2020 · FLHonker/Awesome-Knowledge-Distillation

To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another.

MODEL COMPRESSION · TRANSFER LEARNING
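
Since this survey centers on knowledge distillation, a minimal sketch of a standard distillation loss may help. The temperature T and weighting alpha below are illustrative hyperparameters, not values taken from the survey.

```python
# Minimal sketch of a standard knowledge-distillation loss in PyTorch.
# Temperature T and weighting alpha are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # KL divergence between the teacher's and student's softened output
    # distributions; T**2 keeps gradient scale comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```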

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

NeurIPS 2019 · ShawnDing1994/ACNet

Deep Neural Networks (DNNs) are powerful but computationally expensive and memory-intensive, which impedes their practical usage on resource-constrained front-end devices.

MODEL COMPRESSION