Search Results for author: Maying Shen

Found 7 papers, 4 papers with code

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

no code implementations • 25 Jun 2023 • Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, Jose Alvarez

Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world.

Image Classification, Object Detection, +2

Soft Masking for Cost-Constrained Channel Pruning

1 code implementation • 4 Nov 2022 • Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve, Jose M. Alvarez

Structured channel pruning has been shown to significantly accelerate inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy.
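The soft-masking idea named in the title can be sketched generically: multiply each channel's activations by a learnable gate, then drop channels whose gate has decayed toward zero. The NumPy sketch below is only an illustration of that general idea; the gate values and threshold are hypothetical, and the paper's cost-constrained training procedure is not shown.

```python
import numpy as np

def soft_mask_channels(activations, gate):
    """Apply a per-channel soft mask (gate) to (N, C, H, W) activations."""
    return activations * gate.reshape(1, -1, 1, 1)

def prune_decision(gate, threshold=0.05):
    """Channels whose gate has decayed below the threshold get removed."""
    return gate >= threshold

# Hypothetical gate values after some gated training.
gate = np.array([1.0, 0.7, 0.01, 0.4])
masked = soft_mask_channels(np.ones((1, 4, 2, 2)), gate)
keep = prune_decision(gate)
print(list(keep))  # [True, True, False, True]
```

Because the mask is soft during training, the network can recover channels that later prove useful; the hard prune happens only once gates have settled.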

Structural Pruning via Latency-Saliency Knapsack

1 code implementation • 13 Oct 2022 • Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP), which formulates structural pruning as a global resource allocation optimization problem, aiming to maximize accuracy while keeping latency under a predefined budget on the target device.
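The knapsack framing in the title can be sketched as a standard 0/1 knapsack over neuron groups: maximize summed importance subject to a latency budget. The importance and latency numbers below are hypothetical, and the paper's own grouping scheme, saliency measure, and latency model are not reproduced here.

```python
def halp_style_select(importance, latency_cost, budget):
    """Hedged sketch: latency-constrained pruning as a 0/1 knapsack.

    importance[i]   -- assumed accuracy contribution of keeping group i
    latency_cost[i] -- integer latency units group i adds on the device
    budget          -- total latency units allowed

    Returns (kept group indices, total importance) via classic DP.
    """
    n = len(importance)
    dp = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget + 1):
            dp[i][b] = dp[i - 1][b]  # option: drop group i-1
            c = latency_cost[i - 1]
            if c <= b:  # option: keep group i-1 if it fits the budget
                dp[i][b] = max(dp[i][b], dp[i - 1][b - c] + importance[i - 1])
    # Backtrack to recover which groups were kept.
    keep, b = [], budget
    for i in range(n, 0, -1):
        if dp[i][b] != dp[i - 1][b]:
            keep.append(i - 1)
            b -= latency_cost[i - 1]
    return sorted(keep), dp[n][budget]

keep, score = halp_style_select([6.0, 10.0, 12.0], [1, 2, 3], budget=5)
print(keep, score)  # [1, 2] 22.0
```

The key design point the formulation captures is that importance and latency are decoupled: a highly salient group may still be dropped if its latency cost crowds out better-value groups.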

When to Prune? A Policy towards Early Structural Pruning

no code implementations • CVPR 2022 • Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose M. Alvarez

Through extensive experiments on ImageNet, we show that EPI enables quick identification of early training epochs suitable for pruning, offering the same efficacy as an otherwise "oracle" grid search that scans through epochs and requires orders of magnitude more compute.

Network Pruning

HALP: Hardware-Aware Latency Pruning

1 code implementation • 20 Oct 2021 • Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP), which formulates structural pruning as a global resource allocation optimization problem, aiming to maximize accuracy while keeping latency under a predefined budget.

Global Vision Transformer Pruning with Hessian-Aware Saliency

1 code implementation • CVPR 2023 • Huanrui Yang, Hongxu Yin, Maying Shen, Pavlo Molchanov, Hai Li, Jan Kautz

This work challenges the common design philosophy of the Vision Transformer (ViT) model, which uses a uniform dimension across all the stacked blocks in a model stage: in the first systematic attempt at global structural pruning of ViTs, we redistribute parameters both across transformer blocks and between the different structures within each block.

Efficient ViTs, Philosophy
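The Hessian-aware saliency named in the title can be illustrated with a second-order Taylor estimate of the loss change when a parameter group is zeroed out: ΔL ≈ −g·w + ½ wᵀHw. The sketch below uses a diagonal-Hessian approximation and hypothetical inputs; it shows the general second-order saliency idea, not the paper's actual formulation.

```python
def hessian_aware_saliency(weights, grads, hess_diag):
    """Hedged sketch: second-order Taylor estimate of the loss change
    from removing a parameter group (diagonal-Hessian approximation).

    Removing w means a step of -w, so to second order:
        delta_L ~= -g . w + 0.5 * w^T diag(h) w
    Larger |delta_L| means the group is more important to keep.
    """
    return abs(sum(-g * w + 0.5 * h * w * w
                   for w, g, h in zip(weights, grads, hess_diag)))

# Hypothetical weights, gradients, and diagonal Hessian entries.
score = hessian_aware_saliency([1.0, 2.0], [0.1, 0.2], [0.5, 0.5])
print(score)  # 0.75
```

Relative to first-order (gradient-only) saliency, the Hessian term keeps curvature information, which matters near convergence where gradients are close to zero.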

Optimal Quantization Using Scaled Codebook

no code implementations • CVPR 2021 • Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez

We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled.

Quantization
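The problem statement above can be illustrated with a simple alternating heuristic: assign each datapoint to its nearest scaled codeword, then refit the scale in closed form (least squares) for fixed assignments. This is only a sketch under that alternating-minimization assumption; the paper studies the optimal solution, which this heuristic does not guarantee to find.

```python
def quantize_scaled_codebook(x, codebook, iters=20):
    """Hedged sketch: minimise sum_i (x_i - s * c_{a(i)})^2 over the
    scale s and assignments a by alternating the two steps."""
    s = 1.0
    assign = [0] * len(x)
    for _ in range(iters):
        # Assignment step: nearest scaled codeword for each point.
        assign = [min(range(len(codebook)),
                      key=lambda k: (xi - s * codebook[k]) ** 2)
                  for xi in x]
        # Scale step: closed-form least-squares optimum.
        num = sum(xi * codebook[a] for xi, a in zip(x, assign))
        den = sum(codebook[a] ** 2 for a in assign)
        if den == 0:
            break
        s = num / den
    return s, assign

# Hypothetical sorted datapoints and a fixed 3-entry codebook.
s, assign = quantize_scaled_codebook([0.1, 0.9, 2.1], [0, 1, 2])
print(s, assign)  # 1.02 [0, 1, 2]
```

Each step can only decrease the squared error, so the heuristic converges, but like k-means it can stop at a local optimum.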
