Search Results for author: Luca Franceschi

Found 15 papers, 9 papers with code

On the Iteration Complexity of Hypergradient Computations

no code implementations • ICML 2020 • Riccardo Grazzi, Saverio Salzo, Massimiliano Pontil, Luca Franceschi

We study a general class of bilevel optimization problems, in which the upper-level objective is defined via the solution of a fixed point equation.

Bilevel Optimization, Computational Efficiency

Explaining Probabilistic Models with Distributional Values

no code implementations • 15 Feb 2024 • Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger

We argue that often there is a critical mismatch between what one wishes to explain (e.g., the output of a classifier) and what current methods such as SHAP explain (e.g., the scalar probability of a class).

DAG Learning on the Permutahedron

1 code implementation • 27 Jan 2023 • Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae

We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data.

Learning Discrete Directed Acyclic Graphs via Backpropagation

no code implementations • 27 Oct 2022 • Andrew J. Wren, Pasquale Minervini, Luca Franceschi, Valentina Zantedeschi

Recently, continuous relaxations have been proposed in order to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization.

Combinatorial Optimization

Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models

1 code implementation • 11 Sep 2022 • Pasquale Minervini, Luca Franceschi, Mathias Niepert

In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for IMLE by trading off the density of gradient information with the degree of bias in the gradient estimates.

ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective

no code implementations • 20 Jul 2022 • Yihong Chen, Pushkar Mishra, Luca Franceschi, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel

Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs).

Knowledge Graph Completion

Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

2 code implementations • NeurIPS 2021 • Mathias Niepert, Pasquale Minervini, Luca Franceschi

We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components.

Combinatorial Optimization

On the Iteration Complexity of Hypergradient Computation

1 code implementation • 29 Jun 2020 • Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo

We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation.

Computational Efficiency, Hyperparameter Optimization

MARTHE: Scheduling the Learning Rate Via Online Hypergradients

1 code implementation • 18 Oct 2019 • Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi

We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization.

Hyperparameter Optimization, Scheduling

Scheduling the Learning Rate Via Hypergradients: New Insights and a New Algorithm

no code implementations • 25 Sep 2019 • Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi

We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization.

Hyperparameter Optimization, Scheduling

Learning Discrete Structures for Graph Neural Networks

2 code implementations • 28 Mar 2019 • Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph.

Ranked #3 on Node Classification on Cora: fixed 20 node per class

Music Genre Recognition, Node Classification

Far-HO: A Bilevel Programming Package for Hyperparameter Optimization and Meta-Learning

2 code implementations • 13 Jun 2018 • Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi

In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning.

Hyperparameter Optimization, Meta-Learning

Bilevel Programming for Hyperparameter Optimization and Meta-Learning

no code implementations • ICML 2018 • Luca Franceschi, Paolo Frasconi, Saverio Salzo, Riccardo Grazzi, Massimiliano Pontil

We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning.

Few-Shot Learning, Hyperparameter Optimization

A Bridge Between Hyperparameter Optimization and Learning-to-learn

1 code implementation • 18 Dec 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

We consider a class of nested optimization problems involving inner and outer objectives.

Few-Shot Learning, Hyperparameter Optimization

Forward and Reverse Gradient-Based Hyperparameter Optimization

2 code implementations • ICML 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent.

Hyperparameter Optimization
