no code implementations • 8 Dec 2022 • Olivier Bousquet, Haim Kaplan, Aryeh Kontorovich, Yishay Mansour, Shay Moran, Menachem Sadigurschi, Uri Stemmer
We construct a universally Bayes consistent learning rule that satisfies differential privacy (DP).
no code implementations • 4 Oct 2022 • Peter L. Bartlett, Philip M. Long, Olivier Bousquet
We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems.
no code implementations • 29 Sep 2022 • Andrew Drozdov, Nathanael Schärli, Ekin Akyürek, Nathan Scales, Xinying Song, Xinyun Chen, Olivier Bousquet, Denny Zhou
Humans can reason compositionally when presented with new tasks.
Ranked #1 on Semantic Parsing on CFQ
no code implementations • 31 Aug 2022 • Olivier Bousquet, Steve Hanneke, Shay Moran, Jonathan Shafer, Ilya Tolstikhin
We solve this problem in a principled manner, by introducing a combinatorial dimension called VCL that characterizes the best $d'$ for which $d'/n$ is a strong minimax lower bound.
no code implementations • 21 May 2022 • Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi
Chain-of-thought prompting has demonstrated remarkable performance on various natural language reasoning tasks.
Ranked #97 on Arithmetic Reasoning on GSM8K
no code implementations • 10 Feb 2022 • Olivier Bousquet, Amit Daniely, Haim Kaplan, Yishay Mansour, Shay Moran, Uri Stemmer
Our transformation readily implies monotone learners in a variety of contexts: for example it extends Pestov's result to classification tasks with an arbitrary number of labels.
no code implementations • 17 Aug 2021 • Olivier Bousquet, Mark Braverman, Klim Efremenko, Gillat Kol, Shay Moran
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem, outputting $q$ such that $\mathsf{TV}(p, q) \leq 2 \cdot \mathrm{opt} + \epsilon$, with a (nearly) optimal sample complexity of $\tilde O(\log n/\epsilon^2)$.
no code implementations • NeurIPS 2020 • Olivier Bousquet, Roi Livni, Shay Moran
We study the sample complexity of private synthetic data generation over an unbounded sized class of statistical queries, and show that any class that is privately proper PAC learnable admits a private synthetic data generator (perhaps non-efficient).
no code implementations • 9 Nov 2020 • Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, Amir Yehudayoff
How quickly can a given class of concepts be learned from examples?
no code implementations • NeurIPS 2020 • Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers
We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling.
no code implementations • 24 May 2020 • Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy
It has been recently shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majority-vote of hypotheses in C. This leaves the question of when this bound can be achieved by proper learning algorithms, which are restricted to always output a hypothesis from C. In this paper we aim to characterize the classes for which the optimal sample complexity can be achieved by a proper learning algorithm.
1 code implementation • 26 Feb 2020 • Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin
Furthermore, the predictors are able to rank networks trained on different, unobserved datasets and with different architectures.
3 code implementations • ICLR 2020 • Daniel Keysers, Nathanael Schärli, Nathan Scales, Hylke Buisman, Daniel Furrer, Sergii Kashubin, Nikola Momchev, Danila Sinopalnikov, Lukasz Stafiniak, Tibor Tihon, Dmitry Tsarkov, Xiao Wang, Marc van Zee, Olivier Bousquet
We present a large and realistic natural language question answering dataset that is constructed according to this method, and we use it to analyze the compositional generalization ability of three machine learning architectures.
Ranked #5 on Semantic Parsing on CFQ
no code implementations • 28 Oct 2019 • Olivier Bousquet, Nikita Zhivotovskiy
First, we consider classification with a reject option, namely Chow's reject option model, and show that by slightly lowering the impact of hard instances, a learning rate of order $O\left(\frac{d}{n}\log \frac{n}{d}\right)$ is always achievable in the agnostic setting by a specific learning algorithm.
no code implementations • 17 Oct 2019 • Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy
In a series of recent breakthrough papers by Feldman and Vondrak (2018, 2019), it was shown that the best known high probability upper bounds for uniformly stable learning algorithms due to Bousquet and Elisseeff (2002) are sub-optimal in some natural regimes.
2 code implementations • arXiv 2020 • Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
And, how close are we to general visual representations?
Ranked #10 on Image Classification on VTAB-1k (using extra training data)
no code implementations • 25 Sep 2019 • Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets.
1 code implementation • 25 Jul 2019 • Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly
Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner.
no code implementations • 28 May 2019 • Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner
In semi-supervised classification, one is given access both to labeled and unlabeled data.
1 code implementation • NeurIPS 2019 • Paul K. Rubenstein, Olivier Bousquet, Josip Djolonga, Carlos Riquelme, Ilya Tolstikhin
The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning.
no code implementations • 26 May 2019 • Josip Djolonga, Mario Lucic, Marco Cuturi, Olivier Bachem, Olivier Bousquet, Sylvain Gelly
Despite the tremendous progress in the estimation of generative models, the development of tools for diagnosing their failures and assessing their performance has advanced at a much slower pace.
no code implementations • 10 Feb 2019 • Olivier Bousquet, Daniel Kane, Shay Moran
We complement and extend this result by showing that: (i) the factor 3 cannot be improved if one restricts the algorithm to output a density from $\mathcal{Q}$, and (ii) if one allows the algorithm to output arbitrary densities (e.g., a mixture of densities from $\mathcal{Q}$), then the approximation factor can be reduced to 2, which is optimal.
no code implementations • 9 Feb 2019 • Olivier Bousquet, Roi Livni, Shay Moran
We study the sample complexity of private synthetic data generation over an unbounded sized class of statistical queries, and show that any class that is privately proper PAC learnable admits a private synthetic data generator (perhaps non-efficient).
4 code implementations • NeurIPS 2018 • Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly
Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison.
no code implementations • 22 Mar 2018 • Hartmut Maennel, Olivier Bousquet, Sylvain Gelly
Deep neural networks are often trained in the over-parametrized regime (i.e., with far more parameters than training examples), and understanding why the training converges to solutions that generalize remains an open problem.
no code implementations • ICLR 2018 • Damien Vincent, Sylvain Gelly, Nicolas Le Roux, Olivier Bousquet
We propose an efficient online hyperparameter optimization method which uses a joint dynamical system to evaluate the gradient with respect to the hyperparameters.
9 code implementations • NeurIPS 2018 • Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, Olivier Bousquet
Generative adversarial networks (GAN) are a powerful subclass of generative models.
13 code implementations • ICLR 2018 • Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf
We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution.
no code implementations • 10 Jun 2017 • Olivier Bousquet, Sylvain Gelly, Karol Kurach, Olivier Teytaud, Damien Vincent
The selection of hyper-parameters is critical in Deep Learning.
no code implementations • 10 Jun 2017 • Olivier Bousquet, Sylvain Gelly, Karol Kurach, Marc Schoenauer, Michele Sebag, Olivier Teytaud, Damien Vincent
This paper aims at one-shot learning of deep neural nets, where a highly parallel setting is considered to address the algorithm calibration problem: selecting the best neural architecture and learning hyper-parameter values depending on the dataset at hand.
no code implementations • NeurIPS 2017 • Shuang Liu, Olivier Bousquet, Kamalika Chaudhuri
In this paper, we address these questions in a broad and unified setting by defining a notion of adversarial divergences that includes a number of recently proposed objective functions.
no code implementations • 23 May 2017 • Karol Kurach, Sylvain Gelly, Michal Jastrzebski, Philip Haeusser, Olivier Teytaud, Damien Vincent, Olivier Bousquet
Generic text embeddings are successfully used in a variety of tasks.
1 code implementation • 22 May 2017 • Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, Bernhard Schoelkopf
We study unsupervised generative modeling in terms of the optimal transport (OT) problem between true (but unknown) data distribution $P_X$ and the latent variable model distribution $P_G$.
1 code implementation • NeurIPS 2017 • Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf
Generative Adversarial Networks (GAN) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images.
no code implementations • NeurIPS 2007 • Léon Bottou, Olivier Bousquet
This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms.