Model Compression

343 papers with code • 2 benchmarks • 4 datasets

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
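Two of the techniques named above, parameter pruning and weight quantization, can be sketched on a toy weight list. This is a minimal pure-Python illustration, not the method of any paper listed below; real pipelines operate on tensors via framework utilities (e.g. `torch.nn.utils.prune` or quantization toolkits), and the function names here are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |w|."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_uniform(weights, bits=8):
    """Uniform symmetric quantization to signed `bits`-bit integer codes."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [round(w / scale) for w in weights]  # integer representation
    return [c * scale for c in codes], scale     # dequantized values + scale

w = [0.91, -0.02, 0.35, 0.005, -0.76, 0.11]
pruned = magnitude_prune(w, sparsity=0.5)        # half the weights become 0
dequantized, scale = quantize_uniform(pruned, bits=8)
```

Pruned weights can be stored sparsely and the integer codes stored in one byte each instead of four, which is where the compression comes from; the dequantized values approximate the originals to within half a quantization step.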

Model Compression Techniques in Biometrics Applications: A Survey

eduardacaldeira/compression_bias_survey 18 Jan 2024

The development of deep learning algorithms has greatly expanded humanity's capacity for task automation.


Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices

UoS-EEC/DynamicOFA 17 Jan 2024

In this thesis, we propose a combined method: a system for managing DNN performance trade-offs that combines runtime trade-off opportunities in both algorithms and hardware to meet dynamically changing application performance targets and hardware constraints in real time.


Knowledge Translation: A New Pathway for Model Compression

zju-swj/kt 11 Jan 2024

Deep learning has witnessed significant advancements in recent years at the cost of increasing training, inference, and model storage overhead.


Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment

jiepku/safecompress 2 Jan 2024

To mitigate this issue, AI software compression plays a crucial role, which aims to compress model size while keeping high performance.


Generative Model-based Feature Knowledge Distillation for Action Recognition

aaai-24/generative-based-kd 14 Dec 2023

Addressing this gap, our paper introduces an innovative knowledge distillation framework that uses a generative model to train a lightweight student model.


Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models

transmuteai/trailmet 12 Dec 2023

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical.


Understanding the Effect of Model Compression on Social Bias in Large Language Models

gsgoncalves/emnlp2023_llm_compression_and_social_bias 9 Dec 2023

Large Language Models (LLMs) trained with self-supervision on vast corpora of web text fit to the social biases of that text.


Language Model Knowledge Distillation for Efficient Question Answering in Spanish

adrianbzg/tinyroberta-distillation-qa-es 7 Dec 2023

Recent advances in the development of pre-trained Spanish language models have led to significant progress in many Natural Language Processing (NLP) tasks, such as question answering.


The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

tding1/efficient-llm-survey 1 Dec 2023

The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains, reshaping the artificial general intelligence landscape.


Physics Inspired Criterion for Pruning-Quantization Joint Learning

fanxxxxyi/pic-pq 1 Dec 2023

Pruning-quantization joint learning facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices.
