Search Results for author: Churan He

Found 1 paper, 0 papers with code

SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts

no code implementations · 7 Apr 2024 · Alexandre Muzio, Alex Sun, Churan He

The advancement of deep learning has led to the emergence of Mixture-of-Experts (MoEs) models, known for their dynamic allocation of computational resources based on input.
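The dynamic allocation the abstract refers to is typically top-k gating: a learned router scores all experts per token, but only the k highest-scoring experts actually run. A minimal NumPy sketch of this generic routing idea (illustrative only, not the SEER-MoE method; all names here are hypothetical):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts, weighted by softmax gate scores.

    Generic top-k MoE routing sketch; gate_w and experts are assumed inputs.
    """
    logits = x @ gate_w                        # (tokens, n_experts) gate scores
    topk = np.argsort(logits, axis=-1)[:, -k:] # indices of the k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())  # softmax over the selected experts
        weights /= weights.sum()
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])       # only k of n_experts run per token
    return out
```

Because only k experts execute per token, compute grows with k rather than with the total expert count, which is the efficiency MoE models trade on.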
