Search Results for author: Angelika Steger

Found 12 papers, 4 papers with code

Gated recurrent neural networks discover attention

no code implementations • 4 Sep 2023 • Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, Johannes von Oswald, Maxime Larcher, Angelika Steger, João Sacramento

In particular, we examine RNNs trained to solve simple in-context learning tasks on which Transformers are known to excel and find that gradient descent instills in our RNNs the same attention-based in-context learning algorithm used by Transformers.

In-Context Learning
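For illustration, a minimal sketch (dimensions and task format are assumptions, not necessarily the paper's exact setup) of the kind of in-context linear-regression task on which such models are probed: each sequence carries example pairs from a fresh random teacher, followed by a query whose target must be inferred from the context alone.

```python
import numpy as np

# Hypothetical in-context regression task: every sequence gets its own
# random linear teacher w; tokens concatenate x and y, and the final
# (query) token has its y slot zeroed out.
def make_icl_regression_batch(batch=32, n_pairs=8, dim=4, rng=np.random):
    w = rng.standard_normal((batch, dim))               # one teacher per sequence
    xs = rng.standard_normal((batch, n_pairs + 1, dim))
    ys = np.einsum('bld,bd->bl', xs, w)                 # noiseless targets
    tokens = np.concatenate([xs, ys[..., None]], axis=-1)
    tokens[:, -1, -1] = 0.0                             # hide the query's target
    return tokens, ys[:, -1]                            # inputs, label to predict
```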

Random initialisations performing above chance and how to find them

1 code implementation • 15 Sep 2022 • Frederik Benzing, Simon Schug, Robert Meier, Johannes von Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, Angelika Steger

Neural networks trained with stochastic gradient descent (SGD) starting from different random initialisations typically find functionally very similar solutions, raising the question of whether there are meaningful differences between different SGD solutions.
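As a rough illustration of what "functionally very similar" means here, one can measure how often two networks trained from independent initialisations agree on held-out inputs; the model objects below are hypothetical placeholders for any classifier returning logits.

```python
import numpy as np

# Sketch: fraction of test points on which two independently trained
# classifiers (callables returning logits) predict the same label.
def prediction_agreement(model_a, model_b, x_test):
    preds_a = np.argmax(model_a(x_test), axis=-1)
    preds_b = np.argmax(model_b(x_test), axis=-1)
    return float(np.mean(preds_a == preds_b))
```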

Solving Static Permutation Mastermind using $O(n \log n)$ Queries

no code implementations • 3 Mar 2021 • Maxime Larcher, Anders Martinsson, Angelika Steger

Permutation Mastermind is a version of the classical Mastermind game in which the number of positions $n$ equals the number of colors $k$, and repetition of colors is not allowed in either the codeword or the queries.

Combinatorics Probability
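A minimal sketch of this setup, assuming the standard formulation: the hidden codeword and every query are permutations of $\{0, \dots, n-1\}$, and a query is answered with the number of positions guessed correctly.

```python
from random import sample

# Permutation Mastermind: codeword and queries are permutations; the
# answer to a query is the number of correctly placed colors.
def feedback(codeword, query):
    return sum(c == q for c, q in zip(codeword, query))

n = 8
codeword = sample(range(n), n)   # hidden permutation of {0, ..., n-1}
query = sample(range(n), n)      # one query, also a permutation
print(feedback(codeword, query))
```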

Improving Gradient Estimation in Evolutionary Strategies With Past Descent Directions

no code implementations • 11 Oct 2019 • Florian Meier, Asier Mujika, Marcelo Matheus Gauy, Angelika Steger

Finally, we evaluate our approach empirically on MNIST and reinforcement learning tasks and show that it considerably improves the gradient estimation of ES at no extra computational cost.

Reinforcement Learning (RL)
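For intuition, an illustrative sketch of the general idea (not the paper's exact estimator): an antithetic evolution-strategies gradient estimate in which one of the perturbation directions is recycled from the previous descent direction instead of being drawn at random.

```python
import numpy as np

# Illustrative ES gradient estimator; `prev_grad` injects past descent
# information as one of the search directions (an assumption made for
# illustration, not the paper's precise scheme).
def es_gradient(f, theta, prev_grad=None, n_dirs=16, sigma=0.1, rng=np.random):
    dirs = [rng.standard_normal(theta.shape) for _ in range(n_dirs - 1)]
    if prev_grad is not None:
        dirs.append(prev_grad / (np.linalg.norm(prev_grad) + 1e-8))
    grad = np.zeros_like(theta)
    for d in dirs:
        grad += (f(theta + sigma * d) - f(theta - sigma * d)) / (2 * sigma) * d
    return grad / len(dirs)
```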

Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses

no code implementations • 11 Oct 2019 • Asier Mujika, Felix Weissenberger, Angelika Steger

Learning long-term dependencies is a key long-standing challenge of recurrent neural networks (RNNs).

Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning

1 code implementation • 11 Feb 2019 • Frederik Benzing, Marcelo Matheus Gauy, Asier Mujika, Anders Martinsson, Angelika Steger

In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs.

Memorization
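For context, a minimal sketch of the RTRL recursion for a vanilla RNN $h_t = \tanh(W h_{t-1} + U x_t)$, tracking the sensitivity $G_t[k,i,j] = \partial h_t[k] / \partial W[i,j]$. Maintaining $G$ costs $O(n^3)$ memory and $O(n^4)$ time per step, which is the cost the paper's Kronecker-sum approximation targets; the approximation itself is not reproduced here.

```python
import numpy as np

# One RTRL step for h_new = tanh(W h + U x).
def rtrl_step(W, U, h, G, x):
    h_new = np.tanh(W @ h + U @ x)
    n = h.shape[0]
    immediate = np.zeros((n, n, n))                 # d(Wh)[k]/dW[i,j] = delta(k,i) h[j]
    immediate[np.arange(n), np.arange(n), :] = h
    propagated = np.einsum('km,mij->kij', W, G)     # carry old sensitivities forward
    G_new = (1.0 - h_new ** 2)[:, None, None] * (immediate + propagated)
    # A loss gradient then follows as np.einsum('k,kij->ij', dL_dh, G_new).
    return h_new, G_new
```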

The linear hidden subset problem for the (1+1) EA with scheduled and adaptive mutation rates

no code implementations • 16 Aug 2018 • Hafsteinn Einarsson, Marcelo Matheus Gauy, Johannes Lengler, Florian Meier, Asier Mujika, Angelika Steger, Felix Weissenberger

For the first setup, we give a schedule that achieves a runtime of $(1\pm o(1))\beta n \ln n$, where $\beta \approx 3.552$, which is an asymptotic improvement over the runtime of the static setup.

Evolutionary Algorithms
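To illustrate the mechanism (the paper's concrete optimal schedule is not reproduced here), a sketch of the (1+1) EA mutation step with a per-bit rate that depends on the iteration count rather than being fixed at $1/n$:

```python
import random

# Flip each bit independently with probability `rate`.
def mutate(x, rate):
    return [b ^ (random.random() < rate) for b in x]

# Hypothetical decreasing schedule: aggressive early, classic 1/n later.
def rate_schedule(t, n):
    return max(1.0 / n, 1.0 / (t + 1))
```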

Approximating Real-Time Recurrent Learning with Random Kronecker Factors

no code implementations • NeurIPS 2018 • Asier Mujika, Florian Meier, Angelika Steger

Despite all the impressive advances of recurrent neural networks, sequential data is still in need of better modelling.

Memorization

Fast-Slow Recurrent Neural Networks

1 code implementation • NeurIPS 2017 • Asier Mujika, Florian Meier, Angelika Steger

Processing sequential data of variable length is a major challenge in a wide range of applications, such as speech recognition, language modeling, generative image modeling and machine translation.

Language Modelling Machine Translation +2

Drift Analysis and Evolutionary Algorithms Revisited

no code implementations • 10 Aug 2016 • Johannes Lengler, Angelika Steger

One of the simplest randomized greedy optimization algorithms is the following evolutionary algorithm, which aims at maximizing a boolean function $f:\{0,1\}^n \to \mathbb{R}$.

Evolutionary Algorithms
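A minimal sketch of that algorithm, the (1+1) EA with the standard per-bit mutation rate $1/n$; the fitness function used below (ONEMAX) is just an example.

```python
import random

# (1+1) EA: mutate the current bitstring, keep the offspring if it is
# at least as fit.
def one_plus_one_ea(f, n, steps=10_000):
    x = [random.randint(0, 1) for _ in range(n)]
    fx = f(x)
    for _ in range(steps):
        y = [b ^ (random.random() < 1.0 / n) for b in x]
        fy = f(y)
        if fy >= fx:
            x, fx = y, fy
    return x, fx

best, value = one_plus_one_ea(lambda x: sum(x), n=50)   # ONEMAX example
```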
