NeurIPS 2023 • Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin
To marry the best of both worlds, we further propose a new mixture of experts (MoE) framework to reparameterize MLPs by taking multiplication or its primitives, e.g., multiplication and shift, as experts, and by designing a new latency-aware load-balancing loss.
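The sketch below illustrates this idea in PyTorch under stated assumptions: a two-expert MoE MLP whose experts are a standard multiplication-based linear block and a shift-based block (weights constrained to signed powers of two), with a simple latency-aware load-balancing term. The class and function names (`MoEShiftAddMLP`, `ShiftLinear`, `latency_cost`) and the exact form of the loss are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumptions noted above), not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShiftLinear(nn.Module):
    """Linear layer whose effective weights are signed powers of two,
    so each multiply can be realized as a bit-shift in hardware."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Quantize weights to sign * 2^round(log2|w|); a straight-through
        # estimator keeps the layer trainable end to end.
        w = self.weight
        sign = torch.sign(w)
        mag = torch.clamp(w.abs(), min=1e-8)
        w_q = sign * torch.pow(2.0, torch.round(torch.log2(mag)))
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)


class MoEShiftAddMLP(nn.Module):
    """Two-expert MoE MLP: a multiplication expert and a shift expert."""
    def __init__(self, dim, hidden_dim, latency_cost=(1.0, 0.3)):
        super().__init__()
        self.router = nn.Linear(dim, 2)  # per-token gate over the 2 experts
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(),
                          nn.Linear(hidden_dim, dim)),
            nn.Sequential(ShiftLinear(dim, hidden_dim), nn.GELU(),
                          ShiftLinear(hidden_dim, dim)),
        ])
        # Assumed relative latencies: the multiplication expert is taken to be
        # slower than the shift expert (placeholder numbers).
        self.register_buffer("latency_cost", torch.tensor(latency_cost))

    def forward(self, x):                            # x: (batch, tokens, dim)
        gate = F.softmax(self.router(x), dim=-1)     # (batch, tokens, 2)
        out = sum(gate[..., i:i + 1] * expert(x)
                  for i, expert in enumerate(self.experts))
        # Latency-aware load balancing: penalize routed probability mass in
        # proportion to each expert's assumed latency, plus a uniform-usage
        # term so neither expert collapses.
        usage = gate.mean(dim=(0, 1))                # fraction of tokens per expert
        lb_loss = (usage * self.latency_cost).sum() + (usage - 0.5).pow(2).sum()
        return out, lb_loss


if __name__ == "__main__":
    layer = MoEShiftAddMLP(dim=64, hidden_dim=256)
    x = torch.randn(2, 16, 64)
    y, aux_loss = layer(x)
    print(y.shape, aux_loss.item())
```

In this reading, the auxiliary `lb_loss` would be added to the task loss so the router learns to send tokens to the cheaper shift expert unless the multiplication expert is worth its higher latency; the precise loss used in the paper may differ.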