Distributed Linear Regression with Compositional Covariates

21 Oct 2023  ·  Yue Chao, Lei Huang, Xuejun Ma ·

With the availability of extraordinarily huge data sets, solving the problems of distributed statistical methodology and computing for such data sets has become increasingly crucial in the big data area. In this paper, we focus on the distributed sparse penalized linear log-contrast model in massive compositional data. In particular, two distributed optimization techniques under centralized and decentralized topologies are proposed for solving the two different constrained convex optimization problems. Both two proposed algorithms are based on the frameworks of Alternating Direction Method of Multipliers (ADMM) and Coordinate Descent Method of Multipliers(CDMM, Lin et al., 2014, Biometrika). It is worth emphasizing that, in the decentralized topology, we introduce a distributed coordinate-wise descent algorithm based on Group ADMM(GADMM, Elgabli et al., 2020, Journal of Machine Learning Research) for obtaining a communication-efficient regularized estimation. Correspondingly, the convergence theories of the proposed algorithms are rigorously established under some regularity conditions. Numerical experiments on both synthetic and real data are conducted to evaluate our proposed algorithms.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods