1 code implementation • 24 Sep 2023 • Christopher Subia-Waud, Srinandan Dasmahapatra
Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values.
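The snippet stops at the idea, so here is a minimal, self-contained sketch of what weight-sharing quantization can look like in practice: a layer's weights are clustered into a small shared codebook (1-D k-means here) and each weight is replaced by its nearest codebook value, so inference only needs the codebook plus per-weight indices. The layer shape, codebook size, and clustering choice are illustrative assumptions, not the paper's specific method.

```python
# Illustrative sketch of weight-sharing quantization: snap all weights in a
# tensor to a small shared codebook of values learned by 1-D k-means.
# Shapes and hyperparameters below are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)

def weight_share_quantize(w, n_values=16, iters=20):
    """Cluster weights into n_values shared values (1-D k-means) and
    return the quantized tensor plus the codebook."""
    flat = w.ravel()
    # Initialise the codebook with evenly spaced quantiles of the weights.
    codebook = np.quantile(flat, np.linspace(0.0, 1.0, n_values))
    for _ in range(iters):
        # Assign each weight to its nearest codebook value.
        idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each codebook value to the mean of its assigned weights.
        for k in range(n_values):
            members = flat[idx == k]
            if members.size:
                codebook[k] = members.mean()
    idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook[idx].reshape(w.shape), codebook

quantized, codebook = weight_share_quantize(weights)
print("unique weight values:", np.unique(quantized).size)      # <= 16
print("mean abs error:", np.abs(weights - quantized).mean())
```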
1 code implementation • 24 Oct 2022 • Christopher Subia-Waud, Srinandan Dasmahapatra
Rather than a channel- or layer-wise encoding, we look to lossless whole-network quantisation to minimise the entropy and the number of unique parameters in a network.
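To make the whole-network framing concrete, the sketch below encodes every layer against one shared palette of values and then reports the quantities the abstract targets: the number of unique parameters and the entropy (in bits per weight) of the encoded network. The layer shapes, the power-of-two palette, and the nearest-value snapping are assumptions for illustration only; they are not the paper's actual encoding scheme.

```python
# Illustrative whole-network encoding: one shared palette for all layers,
# then measure unique-parameter count and entropy of the index distribution.
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.normal(0.0, s, size).astype(np.float32) for s, size in
          [(0.05, (128, 128)), (0.02, (256, 128)), (0.08, (64, 256))]]

# A single network-wide palette (here: zero plus signed powers of two).
palette = np.array([0.0] + [s * 2.0 ** -e for s in (1, -1) for e in range(2, 9)],
                   dtype=np.float32)

def snap(w, palette):
    """Return palette indices mapping each weight to its nearest palette value."""
    return np.abs(w.reshape(-1, 1) - palette.reshape(1, -1)).argmin(axis=1)

def entropy_bits(idx):
    """Shannon entropy in bits per weight of the palette-index distribution."""
    counts = np.bincount(idx)
    p = counts[counts > 0] / idx.size
    return float(-(p * np.log2(p)).sum())

all_idx = np.concatenate([snap(w, palette) for w in layers])
print("unique parameters across the network:", np.unique(palette[all_idx]).size)
print("entropy (bits per weight):", entropy_bits(all_idx))
```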