no code implementations • ICML 2020 • Riccardo Grazzi, Saverio Salzo, Massimiliano Pontil, Luca Franceschi
We study a general class of bilevel optimization problems, in which the upper-level objective is defined via the solution of a fixed point equation.
no code implementations • 15 Feb 2024 • Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger
We argue that often there is a critical mismatch between what one wishes to explain (e.g. the output of a classifier) and what current methods such as SHAP explain (e.g. the scalar probability of a class).
1 code implementation • 27 Jan 2023 • Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae
We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data.
no code implementations • 27 Oct 2022 • Andrew J. Wren, Pasquale Minervini, Luca Franceschi, Valentina Zantedeschi
Recently, continuous relaxations have been proposed in order to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization.
1 code implementation • 11 Sep 2022 • Pasquale Minervini, Luca Franceschi, Mathias Niepert
In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for IMLE by trading off the density of gradient information with the degree of bias in the gradient estimates.
no code implementations • 20 Jul 2022 • Yihong Chen, Pushkar Mishra, Luca Franceschi, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel
Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs).
2 code implementations • NeurIPS 2021 • Mathias Niepert, Pasquale Minervini, Luca Franceschi
We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components.
1 code implementation • 29 Jun 2020 • Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo
We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation.
1 code implementation • 18 Oct 2019 • Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi
We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization.
no code implementations • 25 Sep 2019 • Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi
We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization.
2 code implementations • 28 Mar 2019 • Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He
With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph.
Ranked #3 on Node Classification on Cora: fixed 20 node per class
2 code implementations • 13 Jun 2018 • Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi
In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning.
no code implementations • ICML 2018 • Luca Franceschi, Paolo Frasconi, Saverio Salzo, Riccardo Grazzi, Massimiliano Pontil
We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning.
1 code implementation • 18 Dec 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil
We consider a class of nested optimization problems involving inner and outer objectives.
2 code implementations • ICML 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil
We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent.