no code implementations • 16 May 2024 • Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos
Although noise and caption quality are acknowledged as important factors affecting vision-language contrastive pre-training, in this paper we show that the full potential of addressing these issues during training has yet to be realized.
1 code implementation • ICCV 2023 • Yassine Ouali, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
Vision-Language (V-L) models trained with contrastive learning to align the visual and language modalities have been shown to be strong few-shot learners.
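To make the contrastive alignment concrete, below is a minimal sketch of a CLIP-style symmetric InfoNCE loss over paired image/text embeddings; the function name, temperature value, and batch layout are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)        # unit-norm features
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    # Matched pairs sit on the diagonal of the similarity matrix.
    targets = torch.arange(logits.size(0), device=logits.device)
    # Cross-entropy in both directions: image-to-text and text-to-image.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```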
1 code implementation • 26 Dec 2020 • Yassine Ouali, Céline Hudelot, Myriam Tami
In this paper, we explore contrastive learning for few-shot classification, proposing to use it as an additional auxiliary training objective that acts as a data-dependent regularizer to promote more general and transferable features.
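As a rough illustration of such an auxiliary objective, here is a sketch of a supervised contrastive term added to the standard cross-entropy loss; `supervised_contrastive`, the weight `lam`, and the temperature are hypothetical choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive(features, labels, temperature=0.1):
    """Pull together embeddings that share a label, push apart the rest."""
    features = F.normalize(features, dim=-1)
    sim = features @ features.t() / temperature
    n = features.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(eye, float('-inf'))          # ignore self-similarity
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    # Average log-probability over each sample's positives.
    mean_log_prob = log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return -mean_log_prob.mean()

def total_loss(logits, features, labels, lam=0.5):
    # Main cross-entropy plus the contrastive term acting as a regularizer.
    return F.cross_entropy(logits, labels) + lam * supervised_contrastive(features, labels)
```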
1 code implementation • ECCV 2020 • Yassine Ouali, Céline Hudelot, Myriam Tami
In this work, we propose a new unsupervised image segmentation approach based on mutual information maximization between different constructed views of the inputs.
Ranked #5 on Unsupervised Semantic Segmentation on COCO-Stuff-3
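For intuition, the sketch below shows a generic IIC-style mutual-information objective between the soft cluster assignments of two corresponding views; this is a simplified stand-in for illustration, not the paper's exact construction of views or its full objective.

```python
import torch

def mutual_info_loss(p1, p2, eps=1e-8):
    """Maximize MI between cluster assignments of two views.

    p1, p2: (N, K) softmax assignments for N corresponding pixels/patches.
    """
    joint = p1.t() @ p2 / p1.size(0)       # (K, K) empirical joint distribution
    joint = (joint + joint.t()) / 2        # symmetrize
    pi = joint.sum(dim=1, keepdim=True)    # marginal over view 1
    pj = joint.sum(dim=0, keepdim=True)    # marginal over view 2
    mi = (joint * (torch.log(joint + eps)
                   - torch.log(pi + eps)
                   - torch.log(pj + eps))).sum()
    return -mi                              # minimize negative MI
```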
no code implementations • 25 Jun 2020 • Yassine Ouali, Victor Bouvier, Myriam Tami, Céline Hudelot
Learning Invariant Representations has been successfully applied to reconcile a source and a target domain in Unsupervised Domain Adaptation.
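One common way to learn domain-invariant representations is domain-adversarial training with a gradient-reversal layer (as in DANN); the sketch below shows that layer for illustration only, without implying it is this paper's method.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses and scales gradients backward,
    so a domain classifier on top pushes features toward domain invariance."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```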
1 code implementation • 9 Jun 2020 • Yassine Ouali, Céline Hudelot, Myriam Tami
Deep neural networks have demonstrated their ability to achieve remarkable performance on a wide range of supervised learning tasks (e.g., image classification) when trained on extensive collections of labeled data (e.g., ImageNet).
5 code implementations • CVPR 2020 • Yassine Ouali, Céline Hudelot, Myriam Tami
To leverage the unlabeled examples, we enforce consistency between the predictions of the main decoder and those of the auxiliary decoders, which take as inputs different perturbed versions of the encoder's output, thereby improving the encoder's representations.
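A minimal sketch of the unsupervised branch of such a cross-consistency scheme is given below; the `perturb` callable and the choice of MSE as the consistency measure are assumptions for illustration, not the paper's full set of perturbations.

```python
import torch
import torch.nn.functional as F

def cct_unsupervised_loss(encoder, main_decoder, aux_decoders, x_unlabeled, perturb):
    """Auxiliary decoders see perturbed encoder features and are trained to
    match the main decoder's (detached) prediction on the clean features.

    `perturb` is an assumed callable, e.g. feature dropout or noise injection.
    """
    z = encoder(x_unlabeled)                        # shared encoder features
    with torch.no_grad():
        target = main_decoder(z).softmax(dim=1)     # main prediction as target
    loss = 0.0
    for aux in aux_decoders:
        pred = aux(perturb(z)).softmax(dim=1)       # decode a perturbed view
        loss = loss + F.mse_loss(pred, target)      # enforce consistency
    return loss / len(aux_decoders)
```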