no code implementations • 29 Sep 2023 • Ibrahim Merad, Stéphane Gaïffas
For strongly convex objectives, we prove that the iteration converges to a concentrated distribution and derive high probability bounds on the final estimation error.
no code implementations • 20 Jun 2023 • Ibrahim Merad, Stéphane Gaïffas
We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD) and study its properties through the prism of Markov chains.
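The Markov-chain viewpoint can be illustrated with a toy run; this is our own minimal sketch (the quadratic objective, names, and constants are assumptions for illustration, not the paper's setup):

```python
import random

def sgd_constant_step(theta0, step, n_iters, seed=0):
    """Run constant step-size SGD on f(x) = 0.5 * E[(x - z)^2], z ~ N(0, 1).

    The stochastic gradient at x is (x - z); with a constant step size the
    iterates form a time-homogeneous Markov chain that drifts toward the
    minimizer (here 0) and then fluctuates in its neighborhood instead of
    converging to a point.
    """
    rng = random.Random(seed)
    theta = theta0
    for _ in range(n_iters):
        z = rng.gauss(0.0, 1.0)      # fresh sample at each iteration
        grad = theta - z             # stochastic gradient of 0.5 * (theta - z)^2
        theta -= step * grad
    return theta

# The chain forgets its initialization geometrically fast and settles
# into a stationary fluctuation around the optimum:
final = sgd_constant_step(theta0=10.0, step=0.1, n_iters=500)
```

With a constant step the final iterate is random but concentrated near the minimizer; shrinking `step` shrinks the stationary fluctuation at the cost of slower forgetting of `theta0`.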
no code implementations • 10 Aug 2022 • Ibrahim Merad, Stéphane Gaïffas
We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features $d$ may exceed the sample size $n$.
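A standard building block in this line of robust estimation is the median-of-means estimator; the sketch below is illustrative only (it is not the paper's procedure, and the names are ours):

```python
import statistics

def median_of_means(samples, n_blocks):
    """Median-of-means: split the samples into blocks, average each block,
    and return the median of the block means. Unlike the plain empirical
    mean, this is robust to heavy tails and to a minority of corrupted
    samples, since an outlier can poison at most one block."""
    if n_blocks < 1 or n_blocks > len(samples):
        raise ValueError("n_blocks must be between 1 and len(samples)")
    block_size = len(samples) // n_blocks
    means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.median(means)

# A single gross outlier barely moves the estimate:
data = [1.0] * 99 + [1e6]
est = median_of_means(data, n_blocks=10)   # the outlier poisons one block only
```

Here the empirical mean of `data` exceeds 10000, while the median-of-means estimate stays at 1.0.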
1 code implementation • 31 Jan 2022 • Stéphane Gaïffas, Ibrahim Merad
This paper considers the problem of supervised learning with linear methods when both features and labels can be corrupted, whether through heavy-tailed data, corrupted rows, or both.
1 code implementation • 16 Sep 2021 • Stéphane Gaïffas, Ibrahim Merad, Yiyang Yu
We introduce WildWood (WW), a new ensemble algorithm for supervised learning of Random Forest (RF) type.
no code implementations • 2 Dec 2020 • Ibrahim Merad, Yiyang Yu, Emmanuel Bacry, Stéphane Gaïffas
Contrastive representation learning has recently proven very effective for self-supervised training.
no code implementations • 23 Dec 2019 • Jaouad Mourtada, Stéphane Gaïffas
On standard examples, this bound scales as $d/n$ with $d$ the model dimension and $n$ the sample size, and critically remains valid under model misspecification.
no code implementations • 13 Nov 2019 • Anastasiia Kabeshova, Yiyang Yu, Bertrand Lukacs, Emmanuel Bacry, Stéphane Gaïffas
We consider a long-term (18-month) relapse (urination problems that persist despite surgery), which is a noisy outcome since it is observed only through the reimbursement of a specific set of drugs for urination problems.
3 code implementations • 15 Oct 2019 • Emmanuel Bacry, Stéphane Gaïffas, Fanny Leroy, Maryan Morel, Dinh Phong Nguyen, Youcef Sebiat, Dian Sun
SCALPEL-Extraction provides fast concept extraction from a big table such as the one produced by SCALPEL-Flattening.
Distributed, Parallel, and Cluster Computing; Computers and Society
2 code implementations • 25 Jun 2019 • Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
Using a variant of the Context Tree Weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this makes it possible to obtain a truly online, parameter-free algorithm that is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function.
no code implementations • 5 Sep 2018 • Jaouad Mourtada, Stéphane Gaïffas
Moreover, our analysis exhibits qualitative differences with other variants of the Hedge algorithm, such as the fixed-horizon version (with constant learning rate) and the one based on the so-called "doubling trick", both of which fail to adapt to the easier stochastic setting.
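For reference, the fixed-learning-rate Hedge variant mentioned above can be sketched in a few lines; this toy (names and the loss sequence are ours) shows the exponential-weights update at its core:

```python
import math

def hedge(loss_matrix, eta):
    """Hedge / exponential weights over K experts.

    loss_matrix[t][k] is the loss of expert k at round t (in [0, 1]).
    With a fixed learning rate eta, each expert's weight is proportional
    to exp(-eta * cumulative_loss). Returns the algorithm's cumulative
    expected loss and the best expert's cumulative loss.
    """
    K = len(loss_matrix[0])
    cum = [0.0] * K                        # cumulative loss per expert
    alg_loss = 0.0
    for losses in loss_matrix:
        z = [math.exp(-eta * c) for c in cum]
        total = sum(z)
        probs = [w / total for w in z]     # play experts with these weights
        alg_loss += sum(p * l for p, l in zip(probs, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return alg_loss, min(cum)

# Expert 0 is consistently better; Hedge concentrates on it quickly,
# so its cumulative loss stays close to the best expert's.
T = 200
losses = [[0.1, 0.9] for _ in range(T)]
alg, best = hedge(losses, eta=0.5)
```

In this easy stochastic instance the regret `alg - best` stays bounded as `T` grows, which is exactly the kind of adaptivity the fixed-horizon and doubling-trick variants fail to achieve with a worst-case tuning of `eta`.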
no code implementations • 25 Jul 2018 • Simon Bussy, Raphaël Veil, Vincent Looten, Anita Burgun, Stéphane Gaïffas, Agathe Guilloux, Brigitte Ranque, Anne-Sophie Jannot
We then compare the performance of all methods in terms of both risk prediction and variable selection, with a focus on the Elastic-Net regularization technique.
no code implementations • 10 Jul 2018 • Martin Bompaire, Emmanuel Bacry, Stéphane Gaïffas
The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions.
no code implementations • 15 Mar 2018 • Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
Our results include consistency and convergence rates for Mondrian Trees and Forests, which turn out to be minimax optimal on the set of $s$-Hölder functions with $s \in (0, 1]$ (for trees and forests) and $s \in (1, 2]$ (for forests only), assuming a proper tuning of their complexity parameter in both cases.
1 code implementation • 21 Dec 2017 • Maryan Morel, Emmanuel Bacry, Stéphane Gaïffas, Agathe Guilloux, Fanny Leroy
With the increased availability of large databases of electronic health records (EHRs) comes the chance of enhancing health risks screening.
no code implementations • 7 Dec 2017 • Alain Virouleau, Agathe Guilloux, Stéphane Gaïffas, Malgorzata Bogdan
Following a recent set of works providing methods for simultaneous robust regression and outliers detection, we consider in this paper a model of linear regression with individual intercepts, in a high-dimensional setting.
no code implementations • NeurIPS 2017 • Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
We establish the consistency of Mondrian Forests, a randomized classification algorithm that can be implemented online.
no code implementations • 10 Jul 2017 • Stéphane Gaïffas, Gustaw Matulewicz
As a by-product, we point out that for the Ornstein-Uhlenbeck process, no assumption of restricted eigenvalue type is needed to derive fast rates for the Lasso, whereas such an assumption is well known to be mandatory for linear regression, for instance.
2 code implementations • 10 Jul 2017 • Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Soren Poulsen
Tick is a statistical learning library for Python 3, with a particular emphasis on time-dependent models, such as point processes, and tools for generalized linear models and survival analysis.
no code implementations • 24 Mar 2017 • Mokhtar Z. Alaya, Simon Bussy, Stéphane Gaïffas, Agathe Guilloux
In each group of binary features coming from the one-hot encoding of a single raw continuous feature, this penalization uses total-variation regularization together with an extra linear constraint.
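The two ingredients above (interval-based one-hot encoding and a within-group total-variation penalty) can be sketched as follows; this is our own illustration of the general idea, not the paper's implementation:

```python
import bisect

def one_hot_bins(x, cut_points):
    """Encode a continuous value as a one-hot vector over the intervals
    defined by sorted cut points (len(cut_points) + 1 bins)."""
    vec = [0.0] * (len(cut_points) + 1)
    vec[bisect.bisect_right(cut_points, x)] = 1.0
    return vec

def tv_penalty(weights):
    """Total variation of the weights attached to one binarized feature:
    the sum of absolute successive differences. Penalizing it encourages
    consecutive bins to share the same weight, i.e. a piecewise-constant
    effect of the underlying raw continuous feature."""
    return sum(abs(b - a) for a, b in zip(weights, weights[1:]))

enc = one_hot_bins(0.7, cut_points=[0.25, 0.5, 0.75])  # falls in bin (0.5, 0.75]
pen = tv_penalty([1.0, 1.0, 3.0, 3.0])                 # one jump of size 2
```

The extra linear constraint mentioned above (not shown here) removes the within-group intercept ambiguity created by the one-hot encoding.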
1 code implementation • 24 Oct 2016 • Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model.
1 code implementation • ICML 2017 • Massil Achab, Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-Francois Muzy
We design a new nonparametric method that allows one to estimate the matrix of integrated kernels of a multivariate Hawkes process.
no code implementations • 4 Nov 2015 • Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-François Muzy
We propose a fast and efficient estimation method that is able to accurately recover the parameters of a $d$-dimensional Hawkes point process from a set of observations.
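For context, the conditional intensity of a one-dimensional Hawkes process with exponential kernel can be computed directly; this sketch (parameter names and values are ours, and the multivariate case studied in the paper adds cross-excitation terms) shows the self-exciting mechanism:

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a 1-D Hawkes process with exponential
    kernel: lambda(t) = mu + sum over past events t_i < t of
    alpha * beta * exp(-beta * (t - t_i)). Each event excites future
    intensity by alpha * beta, and the excitation decays at rate beta."""
    return mu + sum(
        alpha * beta * math.exp(-beta * (t - ti)) for ti in events if ti < t
    )

# Intensity jumps right after an event and relaxes back toward the baseline mu:
lam_before = hawkes_intensity(0.99, events=[1.0], mu=0.5, alpha=0.8, beta=2.0)
lam_after = hawkes_intensity(1.01, events=[1.0], mu=0.5, alpha=0.8, beta=2.0)
```

Here `alpha` is the expected number of offspring per event, so the integrated kernel equals `alpha` regardless of the decay rate `beta`.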
no code implementations • 16 Oct 2015 • Massil Achab, Agathe Guilloux, Stéphane Gaïffas, Emmanuel Bacry
We introduce a doubly stochastic proximal gradient algorithm for optimizing a finite average of smooth convex functions, whose gradients depend on numerically expensive expectations.
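The "doubly stochastic" idea can be sketched on a toy problem; everything below (the function names, the lasso penalty, and the toy objective) is our own illustrative assumption, not the paper's algorithm: one source of randomness picks the summand, a second approximates its expectation-dependent gradient by Monte Carlo, and a proximal step handles the non-smooth penalty.

```python
import random

def soft_threshold(x, tau):
    """Proximal operator of tau * |x| (soft thresholding)."""
    return max(abs(x) - tau, 0.0) * (1.0 if x >= 0 else -1.0)

def doubly_stochastic_prox_step(theta, grads, step, lam, n_mc, rng):
    """One illustrative doubly stochastic proximal gradient step.

    First randomness: pick a random summand i. Second randomness:
    approximate its expectation-dependent gradient by averaging n_mc
    Monte Carlo draws. Then apply the proximal operator of the lasso
    penalty lam * |theta|. `grads[i]` maps (theta, random draw) to a
    stochastic gradient of the i-th function."""
    i = rng.randrange(len(grads))                      # random summand
    g = sum(grads[i](theta, rng.gauss(0.0, 1.0))       # Monte Carlo average
            for _ in range(n_mc)) / n_mc
    return soft_threshold(theta - step * g, step * lam)

# Toy finite average: f_i has noisy gradient theta - (c_i + z), so the
# average objective is minimized (before penalization) near mean(c_i) = 2.
rng = random.Random(0)
grads = [lambda th, z, c=c: th - (c + z) for c in (1.0, 3.0)]
theta = 0.0
for _ in range(300):
    theta = doubly_stochastic_prox_step(theta, grads, step=0.1,
                                        lam=0.1, n_mc=4, rng=rng)
```

After a few hundred steps `theta` fluctuates near the penalized optimum (slightly below 2 because of the lasso shrinkage).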
no code implementations • 2 Jul 2015 • Mokhtar Zahdi Alaya, Stéphane Gaïffas, Agathe Guilloux
We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-points detection.
no code implementations • 4 Jan 2015 • Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Jean-François Muzy
We consider the problem of unveiling the implicit network structure of node interactions (such as user interactions in a social network), based only on high-frequency timestamps.
no code implementations • 24 Dec 2014 • Emmanuel Bacry, Stéphane Gaïffas, Jean-François Muzy
This paper gives new concentration inequalities for the spectral norm of a wide class of matrix martingales in continuous time.