LCEN: A Novel Feature Selection Algorithm for Nonlinear, Interpretable Machine Learning Models

27 Feb 2024 · Pedro Seber, Richard D. Braatz ·

Interpretable architectures can have advantages over black-box architectures, and interpretability is essential for the application of machine learning in critical settings, such as aviation or medicine. However, the simplest, most commonly used interpretable architectures (such as LASSO or EN) are limited to linear predictions and have poor feature selection capabilities. In this work, we introduce the LASSO-Clip-EN (LCEN) algorithm for the creation of nonlinear, interpretable machine learning models. LCEN is tested on a wide variety of artificial and empirical datasets, creating more accurate, sparser models than other commonly used architectures. These experiments reveal that LCEN is robust against many issues typically present in datasets and modeling, including noise, multicollinearity, data scarcity, and hyperparameter variance. LCEN is also able to rediscover multiple physical laws from empirical data and, for processes with no known physical laws, LCEN achieves better results than many other dense and sparse methods -- including using 10.8 times fewer features than dense methods and 8.1 times fewer features than EN on one dataset, and is comparable to an ANN on another dataset.

PDF Abstract