Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server

12 Apr 2018 · Cheng Daning, Xia Fen, Li Shigang, Zhang Yunquan

Machine learning is among the most widely used tools in AI research and industry. One of its most important algorithms is the Gradient Boosting Decision Tree (GBDT), whose training process demands considerable computational resources and time. To shorten GBDT training time, many works have tried to run GBDT on a Parameter Server. However, those GBDT algorithms are synchronous parallel algorithms, which fail to make full use of the Parameter Server architecture. In this paper, we examine the possibility of training the GBDT model with asynchronous parallel methods, and we name the resulting algorithm asynch-SGBDT (asynchronous parallel stochastic gradient boosting decision tree). Our theoretical and experimental results indicate that the scalability of asynch-SGBDT is influenced by the sample diversity of the dataset, the sampling rate, the step length, and the settings of the GBDT trees. Experimental results also show that the asynch-SGBDT training process reaches a linear speedup in an asynchronous parallel manner when the datasets and GBDT trees meet high-scalability requirements.
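The abstract names the key ingredients of asynch-SGBDT: stochastic sampling of the training set, a step length applied to each boosting tree, and barrier-free pushes of new trees to a shared Parameter Server. Below is a minimal Python sketch of that pattern, not the authors' implementation: it assumes squared loss, uses sklearn regression trees as weak learners, and simulates the Parameter Server with an in-process shared object. The names ParameterServer, step_length, and sample_rate are illustrative assumptions, not identifiers from the paper.

import threading
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class ParameterServer:
    """Shared ensemble; workers push new trees without a synchronization barrier."""
    def __init__(self, step_length):
        self.trees = []
        self.step_length = step_length
        self.lock = threading.Lock()   # guards the tree list only

    def predict(self, X):
        with self.lock:
            trees = list(self.trees)   # snapshot; a worker may read a stale model
        pred = np.zeros(X.shape[0])
        for tree in trees:
            pred += self.step_length * tree.predict(X)
        return pred

    def push(self, tree):
        with self.lock:
            self.trees.append(tree)

def worker(server, X, y, n_rounds, sample_rate, max_depth, rng):
    n = X.shape[0]
    for _ in range(n_rounds):
        idx = rng.choice(n, size=int(sample_rate * n), replace=False)
        # Pseudo-residuals under squared loss, computed against a possibly
        # stale snapshot of the ensemble; this is the asynchronous step.
        residual = y[idx] - server.predict(X[idx])
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X[idx], residual)
        server.push(tree)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 10))
    y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=2000)

    server = ParameterServer(step_length=0.1)
    threads = [threading.Thread(
                   target=worker,
                   args=(server, X, y, 25, 0.5, 3, np.random.default_rng(i)))
               for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("train MSE:", np.mean((y - server.predict(X)) ** 2))

Because each worker fits its tree against whatever ensemble snapshot it last read, no worker ever waits for another, which is the property the paper contrasts with synchronous Parameter Server GBDT algorithms; the sampling rate and step length in the sketch correspond to the factors the abstract identifies as governing scalability.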
