Unifying Decision Trees Split Criteria Using Tsallis Entropy

25 Nov 2015  ·  Yisen Wang, Chaobing Song, Shu-Tao Xia ·

The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. A lot of heuristic algorithms have been proposed to construct near-optimal decision trees. ID3, C4.5 and CART are classical decision tree algorithms and the split criteria they used are Shannon entropy, Gain Ratio and Gini index respectively. All the split criteria seem to be independent, actually, they can be unified in a Tsallis entropy framework. Tsallis entropy is a generalization of Shannon entropy and provides a new approach to enhance decision trees' performance with an adjustable parameter $q$. In this paper, a Tsallis Entropy Criterion (TEC) algorithm is proposed to unify Shannon entropy, Gain Ratio and Gini index, which generalizes the split criteria of decision trees. More importantly, we reveal the relations between Tsallis entropy with different $q$ and other split criteria. Experimental results on UCI data sets indicate that the TEC algorithm achieves statistically significant improvement over the classical algorithms.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here