Symbolic regression by uniform random global search

18 Jun 2019  ·  Sohrab Towfighi

Symbolic regression (SR) is a data analysis problem in which we search for the mathematical expression that best fits a numerical dataset. It is a global optimization problem. The most popular approach to SR is genetic programming (SRGP). It is common practice to compare an algorithm's performance to that of random search, but data comparing SRGP to random search are lacking. We describe a novel algorithm for SR, namely SR by uniform random global search (SRURGS), also known as pure random search. We conduct experiments comparing SRURGS with SRGP using 100 randomly generated equations. Our results suggest that SRGP is faster than SRURGS in producing equations with good R^2 for simple problems. However, our experiments suggest that SRURGS is more robust than SRGP and is able to produce good output on more challenging problems. As SRURGS is arguably the simplest global search algorithm, we believe it should serve as a control algorithm against which other symbolic regression algorithms are compared. SRURGS has only one tuning parameter and is conceptually very simple, making it a useful tool for solving SR problems. The method produces random equations, which is useful for generating symbolic regression benchmark problems. We have released well-documented, open-source Python code, currently under formal peer review, so that interested researchers can deploy the tool in practice.
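To make the idea of random search for SR concrete, the following is a minimal sketch, not the released SRURGS code. It repeatedly samples a random expression tree, scores it by R^2 against the data, and keeps the best one seen. All names here (`BINARY`, `TERMINALS`, `random_expr`, `random_search`, `max_depth`, `n_iters`) are hypothetical and chosen for illustration. Note also that growing trees at random, as done here, does not sample uniformly over the space of equations; SRURGS is described as a uniform search, so this sketch only approximates the concept.

```python
import math
import random

# Hypothetical primitive sets, chosen for illustration only.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]


def random_expr(rng, max_depth):
    """Grow a random expression as a terminal or a tuple (op, left, right)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(BINARY))
    left = random_expr(rng, max_depth - 1)
    right = random_expr(rng, max_depth - 1)
    return (op, left, right)


def evaluate(expr, x):
    """Evaluate an expression tree at a single input value x."""
    if expr == "x":
        return x
    if isinstance(expr, float):
        return expr
    op, left, right = expr
    return BINARY[op](evaluate(left, x), evaluate(right, x))


def r_squared(ys, preds):
    """Coefficient of determination; assumes ys is not constant."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1.0 - ss_res / ss_tot


def random_search(xs, ys, n_iters, max_depth=3, seed=0):
    """Sample n_iters random expressions and return the best by R^2."""
    rng = random.Random(seed)
    best_expr, best_score = None, -math.inf
    for _ in range(n_iters):
        expr = random_expr(rng, max_depth)
        preds = [evaluate(expr, x) for x in xs]
        score = r_squared(ys, preds)
        if score > best_score:
            best_expr, best_score = expr, score
    return best_expr, best_score


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [x * x + 1.0 for x in xs]  # target: x^2 + 1
    expr, score = random_search(xs, ys, n_iters=2000)
    print(expr, score)
```

Each candidate is drawn independently of all previous ones, which is what distinguishes this kind of search from genetic programming, where new candidates are derived from earlier ones. In this sketch the iteration budget `n_iters` and the depth limit `max_depth` are the sketch's own knobs and should not be read as the tuning parameter mentioned in the abstract.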
