This work introduces a novel activation unit that can be efficiently employed in deep neural nets (DNNs) and performs significantly better than the traditional Rectified Linear Unit (ReLU). The function is a two-parameter version of the specialized Richard's Curve, and we call it Adaptive Richard's Curve weighted Activation (ARiA). Like the recently introduced Swish, this function is non-monotonic, but it allows precise control over its non-monotonic convexity by varying the hyper-parameters. We first demonstrate the mathematical significance of the two-parameter ARiA, then apply it to benchmark problems such as MNIST, CIFAR-10 and CIFAR-100, where we compare its performance with ReLU and Swish units. Our results illustrate significantly superior performance on all these datasets, making ARiA a potential replacement for ReLU and other activations in DNNs.
Source: ARiA: Utilizing Richard's Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets
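
The abstract describes ARiA as a two-parameter, Richard's-Curve-weighted activation that is non-monotonic like Swish. The page does not give the formula, so the sketch below assumes the form `x * (1 + exp(-beta * x)) ** (-alpha)`. The parameter names `alpha` and `beta` and the helper names `aria`, `swish`, and `relu` are illustrative choices, not taken from the source. Under that assumption, `alpha = beta = 1` reduces to Swish, which is a convenient sanity check.

```python
import math


def _softplus(z):
    # Numerically stable log(1 + exp(z)); avoids overflow for large |z|.
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def aria(x, alpha=1.0, beta=1.0):
    # ASSUMED form: x * (1 + exp(-beta * x)) ** (-alpha).
    # The gate is computed in log space: exp(-alpha * log(1 + exp(-beta * x))).
    gate = math.exp(-alpha * _softplus(-beta * x))
    return x * gate


def swish(x):
    # Swish with unit slope: x * sigmoid(x).
    return x / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


if __name__ == "__main__":
    # Compare the three activations on a few sample inputs.
    # alpha=1.5, beta=2.0 are arbitrary illustrative values.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, relu(x), swish(x), aria(x, alpha=1.5, beta=2.0))
```

In this sketch, varying `alpha` and `beta` changes the depth and position of the dip for negative inputs, which is the kind of control over non-monotonicity the abstract refers to.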
Task | Papers | Share |
---|---|---|
3D Object Detection | 2 | 14.29% |
Language Modelling | 2 | 14.29% |
Object Detection | 2 | 14.29% |
3D Reconstruction | 1 | 7.14% |
Translation | 1 | 7.14% |
Federated Learning | 1 | 7.14% |
Image Classification | 1 | 7.14% |
Medical Image Classification | 1 | 7.14% |
Large Language Model | 1 | 7.14% |