9 Feb 2022 • Adrian Alan Pol, Thea Aarrestad, Ekaterina Govorkova, Roi Halily, Anat Klempner, Tal Kopetz, Vladimir Loncar, Jennifer Ngadiuba, Maurizio Pierini, Olya Sirkin, Sioni Summers
We experiment with 8-bit and ternary quantization, benchmarking their accuracy and inference latency against a single-precision floating-point baseline.
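
To make the two precision levels concrete, the following is a minimal NumPy sketch of what 8-bit and ternary weight quantization can look like. It is an illustration only, not the authors' method: the symmetric per-tensor 8-bit scheme, the threshold-based ternary scheme, the `threshold_factor` value of 0.7, and the function names `quantize_int8` and `quantize_ternary` are all assumptions made for this example.

```python
import numpy as np


def quantize_int8(w):
    """Symmetric per-tensor 8-bit quantization (assumed scheme).

    Returns integer codes q in [-127, 127] and a scale such that w ~ scale * q.
    """
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


def quantize_ternary(w, threshold_factor=0.7):
    """Threshold-based ternary quantization (assumed scheme).

    Returns codes t in {-1, 0, +1} and a scale alpha such that w ~ alpha * t.
    Weights with magnitude below the threshold are set to zero.
    """
    delta = threshold_factor * float(np.mean(np.abs(w)))
    t = np.zeros_like(w, dtype=np.int8)
    t[w > delta] = 1
    t[w < -delta] = -1
    mask = t != 0
    # Scale is the mean magnitude of the weights that were kept.
    alpha = float(np.mean(np.abs(w[mask]))) if mask.any() else 0.0
    return t, alpha


# Compare reconstruction error against the float32 reference weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

q, scale = quantize_int8(w)
t, alpha = quantize_ternary(w)

err_int8 = float(np.mean((w - scale * q) ** 2))
err_ternary = float(np.mean((w - alpha * t) ** 2))
print(f"int8 MSE:    {err_int8:.6f}")
print(f"ternary MSE: {err_ternary:.6f}")
```

The sketch only measures weight reconstruction error on random data. Benchmarking accuracy and inference latency, as the abstract describes, would require the trained model and the target hardware, neither of which is assumed here.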