Network Pruning
213 papers with code • 5 benchmarks • 5 datasets
Network Pruning is a popular approach to reducing a heavy network to a lightweight form by removing its redundancy. In this approach, a complex over-parameterized network is first trained, then pruned according to some criterion, and finally fine-tuned to achieve comparable performance with a reduced number of parameters.
Source: Ensemble Knowledge Distillation for Learning Improved and Efficient Networks
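The train-prune-fine-tune pipeline described above can be illustrated with a minimal sketch, assuming PyTorch and its built-in unstructured magnitude pruning utilities; the toy model, the 50% sparsity level, and the omitted training loops are placeholders rather than the recipe of any particular paper.

```python
# Minimal sketch of the train -> prune -> fine-tune pipeline described above,
# using unstructured L1-magnitude pruning from torch.nn.utils.prune.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Train the over-parameterized network (training loop omitted for brevity).

# 2) Prune: mask the 50% of weights with the smallest L1 magnitude in each layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)

# 3) Fine-tune the pruned network, then make the pruning permanent.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
# ... fine-tuning loop over the training set goes here ...
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.remove(module, "weight")  # bake the mask into the weight tensor
```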
Latest papers
Neural Network Pruning by Gradient Descent
The rapid increase in the parameters of deep learning models has led to significant costs, challenging computational efficiency and model interpretability.
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
On the other hand, even successful methods identify neurons that are not specific to a single memorized sequence.
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
GBLM-Pruner leverages the first-order term of the Taylor expansion and operates in a training-free manner, harnessing properly normalized gradients from a few calibration samples to determine the pruning metric; it substantially outperforms competitive counterparts such as SparseGPT and Wanda on multiple benchmarks.
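As an illustration of a gradient-informed, training-free pruning score in the spirit of the description above, the sketch below scales weight magnitudes by gradients accumulated over a few calibration batches. The normalization and the surrounding helper code are assumptions for demonstration, not the GBLM-Pruner implementation.

```python
# Illustrative gradient-times-magnitude pruning score (assumed form, not GBLM-Pruner).
import torch

def gradient_magnitude_scores(model, layer, calib_batches, loss_fn):
    grad_sq = torch.zeros_like(layer.weight)
    for inputs, targets in calib_batches:            # a handful of calibration samples
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        grad_sq += layer.weight.grad.pow(2)          # accumulate squared gradients
    grad_norm = grad_sq.sqrt()
    grad_norm = grad_norm / (grad_norm.mean() + 1e-8)  # simple normalization (assumed)
    return layer.weight.abs() * grad_norm            # first-order-style importance score

# Weights with the lowest scores are masked until the target sparsity is reached,
# with no further training required.
```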
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Inspired by Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs by iteratively performing weight pruning-and-growing on top of sparse LLMs.
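A much-simplified sketch of the pruning-and-growing idea, assuming access to the dense weights, the current sparsity mask, and a few calibration activations: a kept weight and a pruned weight are swapped whenever doing so lowers the layer's output reconstruction error. The magnitude-based choice of swap candidates here is an assumption for illustration, not the DSnoT criterion.

```python
# Toy pruning-and-growing loop that keeps sparsity fixed while trying to reduce
# the reconstruction error against the dense layer output.
import torch

def prune_and_grow(w_dense, mask, x_calib, steps=100):
    """w_dense: (out, in) weights; mask: same shape, 1 = kept; x_calib: (n, in)."""
    y_dense = x_calib @ w_dense.t()                          # dense reference output
    for _ in range(steps):
        err = ((x_calib @ (w_dense * mask).t()) - y_dense).pow(2).mean()
        # candidate to prune: smallest-magnitude kept weight (flattened index)
        p_idx = torch.argmin(w_dense.abs().masked_fill(~mask.bool(), float("inf")))
        # candidate to grow: largest-magnitude pruned weight
        g_idx = torch.argmax(w_dense.abs().masked_fill(mask.bool(), float("-inf")))
        new_mask = mask.clone().view(-1)
        new_mask[p_idx], new_mask[g_idx] = 0.0, 1.0
        new_mask = new_mask.view_as(mask)
        new_err = ((x_calib @ (w_dense * new_mask).t()) - y_dense).pow(2).mean()
        if new_err < err:                                    # accept swap only if it helps
            mask = new_mask
        else:
            break
    return mask
```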
Filter Pruning For CNN With Enhanced Linear Representation Redundancy
In this paper, we propose a new structured pruning method.
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Large Language Models (LLMs), renowned for their remarkable performance across diverse domains, present a challenge when it comes to practical deployment due to their colossal model size.
SWAP: Sparse Entropic Wasserstein Regression for Robust Network Pruning
This study addresses the challenge of inaccurate gradients in computing the empirical Fisher Information Matrix during neural network pruning.
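For context on why gradient quality matters here, the sketch below shows a generic way to estimate the diagonal of the empirical Fisher Information Matrix from calibration-batch gradients and turn it into an OBD/OBS-style pruning saliency; noisy gradients feed directly into these scores. This is background illustration only, not the SWAP method.

```python
# Generic diagonal empirical Fisher estimate used by many second-order pruning criteria.
import torch

def empirical_fisher_diag(model, calib_batches, loss_fn):
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for inputs, targets in calib_batches:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.pow(2)                 # squared gradients per parameter
    return {n: f / len(calib_batches) for n, f in fisher.items()}

# OBD/OBS-style saliency: weights with small w^2 * Fisher are pruned first.
# saliency = {n: p.pow(2) * fisher[n] for n, p in model.named_parameters()}
```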
Feather: An Elegant Solution to Effective DNN Sparsification
Neural network pruning is an increasingly popular way of producing compact and efficient models suitable for resource-limited environments, while preserving high performance.
EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection
Various researchers have made use of machine learning methods in an attempt to detect COVID-19.
A Survey on Deep Neural Network Pruning-Taxonomy, Comparison, Analysis, and Recommendations
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources.