Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior

5 Oct 2020 · Zi Lin, Jeremiah Zhe Liu, Zi Yang, Nan Hua, Dan Roth

Traditional (unstructured) pruning methods for a Transformer model focus on regularizing the individual weights by penalizing them toward zero. In this work, we explore spectral-normalized identity priors (SNIP), a structured pruning approach that penalizes an entire residual module in a Transformer model toward an identity mapping...
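
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a toy residual module y = x + Wx with a purely linear branch, uses a plain Frobenius-norm group penalty as a stand-in for the identity prior, and omits spectral normalization entirely. All function names and the threshold value are hypothetical.

```python
import math

def residual_module(x, W):
    """Toy residual module: y = x + W x (linear branch only, an assumption)."""
    return [xi + sum(w * xj for w, xj in zip(row, x)) for xi, row in zip(x, W)]

def unstructured_penalty(W):
    """L1 penalty that pushes each individual weight toward zero."""
    return sum(abs(w) for row in W for w in row)

def structured_penalty(W):
    """Group penalty on the whole branch (Frobenius norm).
    When it reaches zero, the module reduces to the identity mapping."""
    return math.sqrt(sum(w * w for row in W for w in row))

def prune_to_identity(modules, threshold):
    """Mark a module as identity (None) when its branch norm is below threshold."""
    return [None if structured_penalty(W) < threshold else W for W in modules]

def forward(x, modules):
    """Apply the stack, skipping modules that were pruned to identity."""
    for W in modules:
        if W is not None:
            x = residual_module(x, W)
    return x

# Two toy modules: one with a meaningful branch, one that is nearly identity.
modules = [
    [[0.5, 0.0], [0.0, 0.5]],
    [[1e-4, 0.0], [0.0, 1e-4]],
]
pruned = prune_to_identity(modules, threshold=1e-2)
output = forward([1.0, 2.0], pruned)
```

In this sketch the second module is removed as a unit, which is what distinguishes structured pruning from zeroing scattered individual weights: the whole branch disappears from the computation and the input passes through unchanged.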


