Model Compression

342 papers with code • 2 benchmarks • 4 datasets

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
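
As a rough illustration of two of the methods named above, the NumPy sketch below applies magnitude-based parameter pruning and symmetric 8-bit weight quantization to a weight matrix; the sparsity level and bit-width are illustrative choices, not taken from any particular paper.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Parameter pruning: zero out the smallest-magnitude weights."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def quantize_int8(w: np.ndarray):
    """Weight quantization: symmetric uniform 8-bit quantization."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale  # reconstruct with q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.9)  # 90% of weights zeroed
q, scale = quantize_int8(w_pruned)           # 8 bits per stored weight
```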

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

faceonlive/ai-research 11 Apr 2024

In the context of efficient OVS, we aim to achieve performance comparable to, or even better than, prior OVS works based on large vision-language foundation models, by using smaller models that incur lower training costs.

Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind

faceonlive/ai-research 6 Apr 2024

MBS overcomes the English-centric limitations of existing methods by sampling calibration data from various languages proportionally to the language distribution of the model training datasets.
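
A minimal sketch of the sampling idea described in the abstract, assuming a per-language corpus dict and an illustrative language distribution (the actual MBS procedure and proportions are in the paper):

```python
import random

# Illustrative language shares (assumed, not the paper's numbers).
lang_share = {"en": 0.5, "zh": 0.2, "es": 0.1, "fr": 0.1, "de": 0.1}

def sample_calibration_set(corpora, n_samples, shares, seed=0):
    """Draw calibration texts from each language in proportion to its
    share of the training data, rather than sampling English only."""
    rng = random.Random(seed)
    calib = []
    for lang, share in shares.items():
        k = round(n_samples * share)
        calib.extend(rng.sample(corpora[lang], k))
    rng.shuffle(calib)
    return calib

# corpora = {"en": [...], "zh": [...], ...}  # candidate texts per language
# calib = sample_calibration_set(corpora, n_samples=512, shares=lang_share)
```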

Are Compressed Language Models Less Subgroup Robust?

wearepal/compression-subgroup 26 Mar 2024

To reduce the inference cost of large language models, model compression is increasingly used to create smaller, scalable models.

Tiny Models are the Computational Saver for Large Models

QingyuanWang/tinysaver 26 Mar 2024

By searching for and employing the most appropriate tiny model as the computational saver for a given large model, the proposed approach works as a novel and generic method for model compression.
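
The abstract suggests a cascade in which a tiny model handles easy inputs and defers the rest to the large model. Below is a hedged PyTorch sketch of one such confidence-thresholded cascade; the paper's actual model-selection and routing criteria may differ.

```python
import torch

@torch.no_grad()
def cascade_predict(x, tiny_model, large_model, threshold=0.9):
    """Run the tiny model first; invoke the large model only on inputs
    where the tiny model's top-class confidence falls below threshold."""
    probs = tiny_model(x).softmax(dim=-1)
    conf, preds = probs.max(dim=-1)
    hard = conf < threshold              # samples the tiny model cannot "save"
    if hard.any():
        preds[hard] = large_model(x[hard]).argmax(dim=-1)
    return preds
```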

Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

saintslab/adver-fine 14 Mar 2024

We present experiments on two benchmark datasets showing that adversarial fine-tuning of compressed models can achieve robustness performance comparable to adversarially trained models, while also improving computational efficiency.
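
A minimal sketch of adversarial fine-tuning with a single-step FGSM attack, assuming a classifier and inputs in [0, 1]; the paper's exact attack and training schedule may differ.

```python
import torch
import torch.nn.functional as F

def adversarial_finetune_step(model, x, y, optimizer, eps=8 / 255):
    """One FGSM-based adversarial fine-tuning step applied to an already
    compressed (e.g. pruned or quantized) classifier."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    (grad,) = torch.autograd.grad(loss, x_adv)
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()  # FGSM perturbation

    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()
```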

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

aiot-mlsys-lab/svd-llm 12 Mar 2024

The advancements in Large Language Models (LLMs) have been hindered by their substantial sizes, which necessitate LLM compression methods for practical deployment.
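
For intuition, the sketch below compresses a weight matrix with plain truncated SVD, factoring it into two low-rank matrices. SVD-LLM's contribution is making this truncation-aware (calibrating so the truncation minimizes the actual output error rather than the weight error), which is not shown here.

```python
import torch

def svd_compress(weight: torch.Tensor, rank: int):
    """Factor an (out, in) weight matrix into two rank-limited matrices.
    Plain truncated SVD, used as a stand-in for the paper's
    truncation-aware variant."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (out, rank)
    B = Vh[:rank, :]             # (rank, in)
    return A, B                  # weight ≈ A @ B

W = torch.randn(4096, 4096)
A, B = svd_compress(W, rank=512)  # ~16.8M params -> ~4.2M params
```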

Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing

hly1998/brcd 10 Mar 2024

In this paper, we propose an innovative Bit-mask Robust Contrastive knowledge Distillation (BRCD) method, specifically devised for the distillation of semantic hashing models.
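
The details of BRCD are in the paper; as one plausible reading of "bit-mask" plus "contrastive distillation", the hedged sketch below masks out hash bits where the teacher's pre-binarization outputs are ambiguous before applying an in-batch contrastive loss between student and teacher codes.

```python
import torch
import torch.nn.functional as F

def masked_contrastive_distill(student_h, teacher_h, tau=0.2, margin=0.1):
    """Mask bits where the teacher is ambiguous (|h| < margin), then pull
    each student code toward its own teacher code and away from other
    codes in the batch (illustrative, not the paper's exact loss)."""
    mask = (teacher_h.abs() > margin).float()      # keep confident bits only
    s = F.normalize(student_h * mask, dim=-1)
    t = F.normalize(teacher_h.sign() * mask, dim=-1)
    logits = s @ t.T / tau                          # (batch, batch) similarities
    targets = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, targets)
```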

DyCE: Dynamic Configurable Exiting for Deep Learning Compression and Scaling

QingyuanWang/dyce 4 Mar 2024

Moreover, most current dynamic compression designs are monolithic and tightly integrated with base models, thereby complicating the adaptation to novel base models.
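
A hedged sketch of the decoupling the abstract argues for: exit heads attached to a frozen base model's stages from the outside, rather than baked into the model. The class name and the confidence rule are illustrative, not DyCE's actual design.

```python
import torch

class EarlyExitWrapper(torch.nn.Module):
    """Wrap an unmodified base model's stages with external exit heads.
    Shown for batch-size-1 inference."""
    def __init__(self, stages, exit_heads, threshold=0.9):
        super().__init__()
        self.stages = torch.nn.ModuleList(stages)
        self.exit_heads = torch.nn.ModuleList(exit_heads)
        self.threshold = threshold

    @torch.no_grad()
    def forward(self, x):
        for stage, head in zip(self.stages, self.exit_heads):
            x = stage(x)
            logits = head(x)
            if logits.softmax(dim=-1).max() >= self.threshold:
                return logits   # confident enough: skip remaining stages
        return logits           # fell through to the final head
```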

"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach

model-compression/lossless_compression 1 Mar 2024

Modern deep neural networks (DNNs) are extremely powerful; however, this comes at the price of increased depth and more parameters per layer, making their training and inference more computationally challenging.

PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning

hkuds/promptmm 27 Feb 2024

Additionally, to adjust for the impact of inaccuracies in multimedia data, a disentangled multi-modal list-wise distillation is developed with a modality-aware re-weighting mechanism.
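
As a hedged sketch of what a modality-aware re-weighted list-wise distillation term could look like (the weights here are assumed inputs; PromptMM derives them differently):

```python
import torch.nn.functional as F

def reweighted_listwise_kd(student_scores, teacher_scores, modality_weights, tau=1.0):
    """List-wise KD over per-modality item-score lists: each modality's
    KL term is scaled by a weight reflecting how reliable that
    modality's teacher signal is (weights assumed given here)."""
    loss = 0.0
    for m, w in modality_weights.items():
        t = (teacher_scores[m] / tau).softmax(dim=-1)       # teacher ranking
        s = (student_scores[m] / tau).log_softmax(dim=-1)   # student ranking
        loss = loss + w * F.kl_div(s, t, reduction="batchmean")
    return loss
```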
