no code implementations • 23 Feb 2024 • Anastasiia Fadeeva, Philippe Schlattner, Andrii Maksai, Mark Collier, Efi Kokiopoulou, Jesse Berent, Claudiu Musat
In this paper, we study online handwriting recognition with VLMs, going beyond naive OCR.
no code implementations • 10 Oct 2023 • Ke Wang, Guillermo Ortiz-Jimenez, Rodolphe Jenatton, Mark Collier, Efi Kokiopoulou, Pascal Frossard
Label noise is a pervasive problem in deep learning that often compromises the generalization performance of trained models.
1 code implementation • NeurIPS 2023 • Jannik Kossen, Mark Collier, Basil Mustafa, Xiao Wang, Xiaohua Zhai, Lucas Beyer, Andreas Steiner, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou
With 3T, we propose a more flexible strategy that allows the image tower to benefit from both pretrained embeddings and contrastive training.
no code implementations • 12 Apr 2023 • Daniel Golovin, Gabor Bartok, Eric Chen, Emily Donahue, Tzu-Kuo Huang, Efi Kokiopoulou, Ruoyan Qin, Nikhil Sarda, Justin Sybrandt, Vincent Tjeng
We are living in a golden age of machine learning.
no code implementations • 18 Feb 2022 • Mark Collier, Rodolphe Jenatton, Efi Kokiopoulou, Jesse Berent
Supervised learning datasets often have privileged information, in the form of features which are available at training time but not at test time, e.g., the ID of the annotator that provided the label.
no code implementations • CVPR 2021 • Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent
We place a multivariate Normal distributed latent variable on the final hidden layer of a neural network classifier.
Ranked #5 on Image Classification on WebVision-1000
no code implementations • 9 Sep 2020 • Mark Collier, Efi Kokiopoulou, Andrea Gesmundo, Jesse Berent
We propose the use of sparse routing networks for continual learning.
no code implementations • 15 Mar 2020 • Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent
By tuning the softmax temperature, we improve accuracy, log-likelihood and calibration on both image classification benchmarks with controlled label noise and ImageNet-21k, which has naturally occurring label noise.
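As a rough illustration of what a softmax temperature controls, the sketch below divides the logits by a temperature before normalizing. It is a generic, standard-library example and not the method of the paper above; the function name `softmax_with_temperature` and the example logits are assumptions made for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax probabilities of `logits` scaled by `temperature`.

    Hypothetical helper for illustration only. A temperature above 1
    flattens the distribution; a temperature below 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.1]  # made-up values
    for t in (0.5, 1.0, 2.0):
        print(t, softmax_with_temperature(example_logits, t))
```

Running the script prints one probability vector per temperature, so the sharpening and flattening effect can be read off directly.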
no code implementations • 26 Nov 2019 • Alina Dubatovka, Efi Kokiopoulou, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent
However, it requires a large amount of computing resources. To alleviate this, a performance prediction network has recently been proposed that enables efficient architecture search by forecasting the performance of candidate architectures instead of relying on actual model training.
no code implementations • 10 Oct 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent
The binary allocation variables are learned jointly with the model parameters by standard back-propagation thanks to the Gumbel-Softmax reparametrization method.
Ranked #1 on Multi-Task Learning on OMNIGLOT
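For readers unfamiliar with the Gumbel-Softmax reparametrization mentioned in the entry above, the following is a minimal sketch of drawing one relaxed sample from a categorical distribution. It shows only the forward sampling step in plain Python (no gradients, and the categorical rather than the binary form), and it is not the paper's implementation; the name `gumbel_softmax_sample`, the clamping constant, and the example logits are assumptions.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample from unnormalized `logits`.

    Hypothetical helper for illustration only. Gumbel(0, 1) noise is added
    to each logit and a temperature-scaled softmax is applied; lower
    temperatures push the sample closer to a one-hot vector.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    eps = 1e-12  # assumed clamp to keep the logarithms finite
    perturbed = []
    for z in logits:
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((z + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(perturbed)
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    print(gumbel_softmax_sample([1.0, 0.5, -0.5], temperature=0.5, rng=rng))
```

In a differentiable framework the same computation would be expressed with tensor operations so that gradients can flow through the sample to the logits, which is what makes joint training by back-propagation possible.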
no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent
We propose Gumbel-Matrix routing, a novel multi-task routing method based on the Gumbel-Softmax that is designed to learn fine-grained parameter sharing.
no code implementations • 15 Feb 2019 • Efi Kokiopoulou, Anja Hauth, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent
At the core of our framework lies a deep value network that can predict the performance of input architectures on a task by utilizing task meta-features and the previous model training experiments performed on related tasks.