[Re] Neural Networks Fail to Learn Periodic Functions and How to Fix It

RC 2020  ·  Mayur Arvind, Mustansir Mama

Scope of Reproducibility
Neural Networks Fail to Learn Periodic Functions and How to Fix It [1] demonstrates experimentally that standard activations such as ReLU, tanh, sigmoid, and their variants all fail to extrapolate simple periodic functions. The original paper goes on to propose a new activation, which the authors name the snake function. The central claims of the paper are twofold: (1) the properties of an activation function carry over to the network built from it: a tanh network is smooth and extrapolates to a constant function, while a ReLU network extrapolates linearly, so standard neural networks with conventional activation functions are insufficient for extrapolating periodic functions; and (2) the proposed activation function learns periodic functions while optimizing as well as conventional activation functions. While the paper provides both experimental evidence and theoretical justification for these claims, we are concerned only with testing them experimentally.
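
For reference, the snake activation proposed in the paper is snake_a(x) = x + sin²(ax)/a, where a controls the frequency of the periodic component (the paper also discusses making a learnable). A minimal PyTorch sketch, ours and for illustration only:

```python
import torch
import torch.nn as nn

class Snake(nn.Module):
    """Snake activation: snake_a(x) = x + sin^2(a*x) / a."""

    def __init__(self, a: float = 1.0):
        super().__init__()
        self.a = a  # frequency of the periodic component

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The sin^2 term adds periodicity; the identity term preserves
        # the monotone trend that keeps optimization well-behaved.
        return x + torch.sin(self.a * x) ** 2 / self.a
```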

Methodology
Although one of the authors was contacted to clarify certain difficulties, all experiments were reproduced using only the information provided in the paper. With one exception, links to all datasets used were also provided in the original paper. This allowed us to implement most experiments from scratch.

Results
We successfully replicated the experiments supporting the central claim of the paper: that the proposed snake non-linearity can learn periodic functions. We also analyzed the suitability of the snake activation for other tasks such as generative modeling and sentiment analysis.
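
To make concrete what the extrapolation experiments test, here is a hypothetical toy version (our own simplification, reusing the Snake module sketched above, not the paper's exact configuration): fit sin(x) on a bounded interval and measure the error far outside it.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Train on sin(x) over [-2π, 2π] only.
x_train = torch.linspace(-2 * math.pi, 2 * math.pi, 512).unsqueeze(1)
y_train = torch.sin(x_train)

# Small MLP using the Snake module from the earlier sketch.
model = nn.Sequential(
    nn.Linear(1, 64), Snake(a=1.0),
    nn.Linear(64, 64), Snake(a=1.0),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(5000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x_train), y_train)
    loss.backward()
    opt.step()

# Evaluate well outside the training interval; a periodic-capable
# network should keep this MSE low, unlike ReLU or tanh networks.
x_test = torch.linspace(4 * math.pi, 8 * math.pi, 256).unsqueeze(1)
with torch.no_grad():
    test_mse = nn.functional.mse_loss(model(x_test), torch.sin(x_test))
print(f"extrapolation MSE: {test_mse.item():.4f}")
```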

What was easy
Many experiments included descriptions of the neural network architectures and graphs showcasing performance, giving us clear benchmarks against which to compare our results.

Links to the datasets for all experiments, barring one, were also included in the paper itself.

What was difficult
Data for the human body temperature experiment was not available, and proper implementation details were not given for initializing the weights of neural networks with snake or for using snake with RNNs.

Communication with original authors
Liu Ziyin, one of the authors, was contacted to provide the dataset used for the human body temperature experiment, to elaborate on the implementation of variance correction, and to share the implementation of RNNs using snake. Liu provided the GitHub link to the authors' original code for the human body temperature, market index, and extrapolation experiments, as well as an explanation of how to implement variance correction. While the code for the RNN implementation using the snake activation was not made public, a screenshot of it was provided. We thank the authors for their assistance.
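
One reading of the variance-correction idea, sketched below under our own assumptions (this is not the authors' code): estimate the second moment of snake under a standard normal input by Monte Carlo, then shrink the usual fan-in initialization by that factor so activations keep roughly unit variance from layer to layer.

```python
import torch
import torch.nn as nn

def snake(x: torch.Tensor, a: float = 1.0) -> torch.Tensor:
    return x + torch.sin(a * x) ** 2 / a

def variance_corrected_init_(linear: nn.Linear, a: float = 1.0,
                             n_samples: int = 1_000_000) -> None:
    """Hypothetical variance-corrected init for a layer followed by snake.

    Sets Var(w) = 1 / (fan_in * E[snake_a(z)^2]) with z ~ N(0, 1), so that
    pre-activations stay near unit variance across layers. This is our
    assumption about what variance correction amounts to, not the paper's
    exact rule.
    """
    with torch.no_grad():
        z = torch.randn(n_samples)
        second_moment = snake(z, a).pow(2).mean()  # Monte Carlo E[snake_a(z)^2]
        std = (1.0 / (linear.in_features * second_moment)).sqrt()
        linear.weight.normal_(0.0, std.item())
        if linear.bias is not None:
            linear.bias.zero_()

# Usage: correct a layer's initialization before training.
layer = nn.Linear(64, 64)
variance_corrected_init_(layer, a=1.0)
```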
