no code implementations • 13 Feb 2024 • Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro
The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model's performance scales proportionally to its size.
1 code implementation • 28 Aug 2023 • Murat Onur Yildirim, Elif Ceren Gok Yildirim, Ghada Sokar, Decebal Constantin Mocanu, Joaquin Vanschoren
Therefore, we perform a comprehensive study investigating various DST components to find the best topology per task on the well-known CIFAR100 and miniImageNet benchmarks in a task-incremental CL setup, since our primary focus is evaluating various DST criteria rather than the mask-selection process.
1 code implementation • 24 Feb 2023 • Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci
In this work we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing number of inactive neurons, thereby affecting network expressivity.
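To make the phenomenon concrete, here is a minimal NumPy sketch of how dormant neurons in a single layer could be flagged, following the paper's idea of scoring each neuron's activity relative to the layer average; the threshold value and all names are illustrative, not the authors' exact implementation.

import numpy as np

def dormant_mask(activations, tau=0.025):
    # activations: (batch, num_neurons) post-activation outputs of one
    # layer over a batch of inputs. A neuron counts as dormant when its
    # mean absolute activation, normalized by the layer average, is <= tau.
    score = np.abs(activations).mean(axis=0)      # per-neuron activity
    normalized = score / (score.mean() + 1e-8)    # relative to the layer
    return normalized <= tau                      # True -> dormant

# Example: track the fraction of dormant neurons during training.
acts = np.random.randn(256, 512)
print(dormant_mask(acts).mean())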
1 code implementation • 13 Feb 2023 • Bram Grooten, Ghada Sokar, Shibhansh Dohare, Elena Mocanu, Matthew E. Taylor, Mykola Pechenizkiy, Decebal Constantin Mocanu
Tomorrow's robots will need to distinguish useful information from noise when performing different tasks.
1 code implementation • 26 Nov 2022 • Ghada Sokar, Zahra Atashgahi, Mykola Pechenizkiy, Decebal Constantin Mocanu
Our proposed approach outperforms the state-of-the-art methods in terms of selecting informative features while reducing training iterations and computational costs substantially.
1 code implementation • 11 Oct 2021 • Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
To address this challenge, we propose a new CL method, named AFAF, that aims to Avoid Forgetting and Allow Forward transfer in class-IL using fixed-capacity models.
2 code implementations • ICLR 2022 • Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
Our framework, FreeTickets, is defined as the ensemble of these relatively cheap sparse subnetworks.
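A minimal sketch of the ensembling step this describes, assuming the sparse subnetworks have already been collected during training; the collection procedure itself is the paper's contribution and is not shown here.

import numpy as np

def ensemble_predict(subnetworks, x):
    # Each subnetwork is a callable mapping inputs to class probabilities;
    # in FreeTickets these would be the cheap sparse subnetworks obtained
    # within a single sparse-training run.
    probs = np.stack([net(x) for net in subnetworks])
    # Average the predicted distributions over the ensemble members.
    return probs.mean(axis=0)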
1 code implementation • 8 Jun 2021 • Ghada Sokar, Elena Mocanu, Decebal Constantin Mocanu, Mykola Pechenizkiy, Peter Stone
In this paper, we introduce, for the first time, a dynamic sparse training approach for deep reinforcement learning to accelerate the training process.
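Dynamic sparse training periodically rewires a sparse network while it learns. Below is a minimal NumPy sketch of one SET-style drop-and-grow update on a single layer; the drop fraction and the random-growth rule are illustrative, not the authors' exact schedule.

import numpy as np

def prune_and_grow(weights, mask, drop_fraction=0.3, rng=None):
    # One SET-style topology update: drop the weakest active connections,
    # then grow the same number at random inactive positions so the
    # overall sparsity level stays constant.
    rng = rng or np.random.default_rng()
    active = np.flatnonzero(mask)
    n_drop = int(drop_fraction * active.size)
    weakest = active[np.argsort(np.abs(weights.flat[active]))[:n_drop]]
    mask.flat[weakest] = 0
    weights.flat[weakest] = 0.0
    inactive = np.flatnonzero(mask == 0)
    grown = rng.choice(inactive, size=n_drop, replace=False)
    mask.flat[grown] = 1  # new connections restart from fresh weights
    return weights, mask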
1 code implementation • 28 Jan 2021 • Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
In this paper, we propose a new method, named Self-Attention Meta-Learner (SAM), which learns prior knowledge for continual learning that permits learning a sequence of tasks while avoiding catastrophic forgetting.
1 code implementation • 15 Jan 2021 • Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
Finally, we analyze the role of the shared invariant representation in mitigating the forgetting problem especially when the number of replayed samples for each previous task is small.
2 code implementations • 1 Dec 2020 • Zahra Atashgahi, Ghada Sokar, Tim Van der Lee, Elena Mocanu, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy
This method, named QuickSelection, introduces the strength of a neuron in sparse neural networks as a criterion to measure feature importance.
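A minimal sketch of that criterion, assuming an input neuron's strength is the summed magnitude of its remaining sparse connections; the names and the selection helper are illustrative.

import numpy as np

def neuron_strength(weights):
    # weights: (num_features, num_hidden) sparse weight matrix of the
    # first layer, with zeros where connections were removed.
    # An input neuron's strength is the total magnitude of its connections.
    return np.abs(weights).sum(axis=1)

def quick_select(weights, k):
    # Keep the k input features whose neurons are strongest.
    return np.argsort(neuron_strength(weights))[::-1][:k]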
1 code implementation • 15 Jul 2020 • Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
Regularization-based methods maintain a fixed model capacity; however, previous studies showed that these methods suffer huge performance degradation when the task identity is not available during inference (e.g., the class-incremental learning scenario).
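For context, a minimal sketch of the kind of fixed-capacity penalty this family uses, in the style of EWC (an illustrative instance of regularization-based CL, not the method proposed in this paper): parameters that mattered for old tasks are anchored by a quadratic term.

import numpy as np

def ewc_penalty(params, old_params, fisher, lam=1.0):
    # EWC-style regularizer: penalize moving parameters that carried
    # high Fisher information for previously learned tasks. Model
    # capacity stays fixed; only the training loss changes.
    return 0.5 * lam * sum(
        float((f * (p - p_old) ** 2).sum())
        for p, p_old, f in zip(params, old_params, fisher)
    )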
3 code implementations • 24 Jun 2020 • Shiwei Liu, Tim Van der Lee, Anil Yaman, Zahra Atashgahi, Davide Ferraro, Ghada Sokar, Mykola Pechenizkiy, Decebal Constantin Mocanu
However, comparing different sparse topologies and determining how sparse topologies evolve during training, especially when sparse structure optimization is involved, remain challenging open questions.
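As a naive baseline for comparing two sparse topologies of the same layer, one can measure the overlap of their binary connection masks; the paper develops a more principled graph-based distance, so the sketch below is only meant to make the comparison problem concrete.

import numpy as np

def mask_overlap(mask_a, mask_b):
    # Jaccard similarity of two binary connection masks: 1.0 means the
    # topologies are identical, 0.0 means they share no connections.
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / max(union, 1)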