no code implementations • 11 Sep 2023 • Pavan Karjol, Rohan Kashyap, Prathosh A P
Unlike traditional $H$-invariant networks, in which $H$ is assumed to be known, our method discovers the underlying subgroup, provided that it satisfies certain conditions.
no code implementations • 6 Sep 2023 • Pavan Karjol, Rohan Kashyap, Aditya Gopalan, Prathosh A. P
At the core of the framework is a novel architecture composed of linear, matrix-valued and non-linear functions that expresses functions invariant to these subgroups in a principled manner.
no code implementations • 28 Nov 2022 • Rohan Kashyap, Vivek Kashyap, Narendra C. P.
Recent work has demonstrated substantial gains from pre-training large language models (LLMs) and then applying supervised fine-tuning on the downstream task.
no code implementations • 28 Nov 2022 • Rohan Kashyap
Deep learning optimization involves minimizing a high-dimensional loss function over the weight space, which is often perceived as difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources.