Search Results for author: Frithjof Gressmann

Found 4 papers, 1 paper with code

Towards Structured Dynamic Sparse Pre-Training of BERT

no code implementations 13 Aug 2021 Anastasia Dietrich, Frithjof Gressmann, Douglas Orr, Ivan Chelombiev, Daniel Justus, Carlo Luschi

Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research.

Language Modelling

Improving Neural Network Training in Low Dimensional Random Bases

1 code implementation NeurIPS 2020 Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi

Stochastic Gradient Descent (SGD) has proven to be remarkably effective in optimizing deep neural networks that employ ever-larger numbers of parameters.
