Feature Selection with the Boruta Package

Journal of Statistical Software, 2010 · Miron B. Kursa, Witold R. Rudnicki

This article describes Boruta, an R package that implements a novel feature selection algorithm for finding all relevant variables. The algorithm is designed as a wrapper around the Random Forest classification algorithm. It iteratively removes features that a statistical test shows to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. A short description of the algorithm and examples of its application are presented.
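The idea of comparing real features against random probes can be illustrated with a minimal, self-contained Python sketch. It is not the package's implementation and makes several loudly simplifying assumptions: the importance measure is the absolute Pearson correlation with the label (a stand-in for Random Forest importance), the random probes are shuffled copies of the real columns, all decisions are made after a fixed number of runs rather than by removing features during the loop, and the test is a plain two-sided binomial test with no multiple-testing correction. Every function name here (`make_toy_data`, `abs_corr`, `shadow_hits`, `binom_tail`, `decide`) is hypothetical.

```python
import random
from math import comb


def make_toy_data(n=200, seed=0):
    """Toy data: feature 0 tracks the label, features 1 and 2 are pure noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[label + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)]
         for label in y]
    return X, y


def abs_corr(xs, ys):
    """Absolute Pearson correlation; a stand-in for Random Forest importance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_hits(X, y, n_runs=30, seed=0):
    """Count, per feature, the runs in which it beats the best random probe."""
    rng = random.Random(seed)
    cols = [list(col) for col in zip(*X)]
    hits = [0] * len(cols)
    for _ in range(n_runs):
        # Random probes: each real column shuffled, which breaks any link to y.
        probes = []
        for col in cols:
            probe = col[:]
            rng.shuffle(probe)
            probes.append(probe)
        best_probe = max(abs_corr(p, y) for p in probes)
        for j, col in enumerate(cols):
            if abs_corr(col, y) > best_probe:
                hits[j] += 1
    return hits


def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def decide(hits, n_runs, alpha=0.01):
    """Label each feature from its hit count with a two-sided binomial test."""
    labels = []
    for h in hits:
        if binom_tail(h, n_runs) < alpha:
            labels.append("confirmed")      # beats the probes too often for chance
        elif binom_tail(n_runs - h, n_runs) < alpha:
            # Lower tail P(X <= h), using the symmetry of Binomial(n, 0.5).
            labels.append("rejected")
        else:
            labels.append("tentative")      # not enough evidence either way
    return labels


X, y = make_toy_data()
hits = shadow_hits(X, y)
print(hits, decide(hits, n_runs=30))
```

In this toy setup the first feature should beat the probes in every run, while the verdict on the two noise features depends on the random draw. A univariate correlation cannot capture the interactions between features that a Random Forest importance measure can, so the sketch only illustrates the probe-and-test logic.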
