Intragroup sparsity for efficient inference

1 Jan 2021 · Zilin Yu, Chao Wang, Xin Wang, Yong Zhao, Xundong Wu

This work studies intragroup sparsity, a fine-grained structural constraint on network weight parameters. It eliminates the computational inefficiency that fine-grained sparsity suffers from irregular dataflow, while still achieving high inference accuracy. We present a theoretical analysis of how weight group size affects sparsification error, and of how the performance of pruned networks changes with the sparsity level. Further, we analyze the inference-time I/O cost of two different strategies for achieving intragroup sparsity, and how the choice between them affects I/O cost under mild assumptions about the accelerator architecture. Moreover, we present a novel training algorithm that yields models with improved accuracy over the standard training approach under the intragroup sparsity constraint.
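
To make the constraint concrete, below is a minimal sketch of intragroup sparsity, assuming it means keeping the k largest-magnitude weights within each fixed-size group of a weight tensor. The function name, group size, and k are illustrative choices, not values taken from the paper.

```python
import numpy as np

def intragroup_sparsify(weights: np.ndarray, group_size: int = 8, k: int = 2) -> np.ndarray:
    """Zero out all but the k largest-magnitude entries in each group of
    `group_size` consecutive weights. Assumes the flattened weight count is
    divisible by group_size. (Illustrative sketch, not the paper's algorithm.)"""
    flat = weights.reshape(-1, group_size)
    # Indices of the (group_size - k) smallest-magnitude entries in every group.
    prune_idx = np.argsort(np.abs(flat), axis=1)[:, : group_size - k]
    mask = np.ones_like(flat, dtype=bool)
    np.put_along_axis(mask, prune_idx, False, axis=1)
    return (flat * mask).reshape(weights.shape)

# Example: 50% intragroup sparsity, keeping 4 of every 8 weights per group.
w = np.random.randn(4, 16).astype(np.float32)
w_sparse = intragroup_sparsify(w, group_size=8, k=4)
print(np.count_nonzero(w_sparse), "of", w.size, "weights kept")
```

Because every group retains the same number of nonzeros, the resulting dataflow stays regular, which is what allows an accelerator to exploit the sparsity without the indexing overhead of unstructured pruning.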
