Non-asymptotic oracle inequalities for the Lasso in high-dimensional mixture of experts

22 Sep 2020  ·  TrungTin Nguyen, Hien D. Nguyen, Faicel Chamroukhi, Geoffrey J. McLachlan

Mixture of experts (MoE) models provide a well-principled finite mixture construction for prediction, in which the gating network (mixture weights) learns from the predictors (explanatory variables) jointly with the experts' network (mixture component densities). We investigate the estimation properties of MoE models in a high-dimensional setting, where the number of predictors is much larger than the sample size and for which the literature lacks computational and, especially, theoretical results. We consider the class of finite MoE models with softmax gating functions and Gaussian regression experts, and focus on the theoretical properties of their $l_1$-regularized estimation via the Lasso. We provide a lower bound on the regularization parameter of the Lasso penalty that ensures an $l_1$-oracle inequality is satisfied by the Lasso estimator with respect to the Kullback-Leibler loss. We further state an $l_1$-ball oracle inequality for the $l_1$-penalized maximum likelihood estimator from a model selection perspective.
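
For concreteness, here is a minimal sketch of the model class and the penalized estimator described above, in notation assumed for illustration only (the paper's own definitions may differ, e.g., in intercept terms and in exactly which parameters are penalized): with $K$ experts, predictors $x \in \mathbb{R}^p$, and response $y \in \mathbb{R}$,

$$
s_{\psi}(y \mid x) \;=\; \sum_{k=1}^{K} \frac{\exp\!\left(\gamma_k^{\top} x\right)}{\sum_{l=1}^{K} \exp\!\left(\gamma_l^{\top} x\right)} \, \phi\!\left(y;\, \beta_k^{\top} x,\, \sigma_k^2\right),
\qquad
\widehat{s}^{\,\mathrm{Lasso}}(\lambda) \;=\; \arg\min_{\psi} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \ln s_{\psi}(y_i \mid x_i) \;+\; \lambda \, \|\psi\|_1 \right\},
$$

where $\phi(\cdot;\mu,\sigma^2)$ is the Gaussian density, $\psi = (\gamma_k, \beta_k, \sigma_k)_{k=1,\dots,K}$ collects the softmax gating and Gaussian expert parameters, $\|\psi\|_1$ denotes the $l_1$ norm of the gating and expert regression coefficients (restricting the penalty to this subset is an assumption of this sketch), and $\lambda > 0$ is the regularization parameter whose admissible lower bound the $l_1$-oracle inequality characterizes, with estimation error measured by the Kullback-Leibler loss between the true conditional density and $\widehat{s}^{\,\mathrm{Lasso}}(\lambda)$.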
