no code implementations • EMNLP 2021 • Andrea Schioppa, David Vilar, Artem Sokolov, Katja Filippova
Fine-grained control of machine translation (MT) outputs along multiple attributes is critical for many modern MT applications and is a requirement for gaining users’ trust.
no code implementations • 6 Feb 2024 • Andrea Schioppa
Two important scenarios are training data attribution (tracing a model's behavior back to the training data), where one needs to store a gradient for each training example, and the study of the spectrum of the Hessian (to analyze the training dynamics), where one needs to store multiple Hessian-vector products.
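The memory pressure comes from never wanting to materialize the full d×d Hessian. As a minimal illustration (not the paper's method; the objective and names below are invented for the example), a Hessian-vector product can be obtained from two gradient evaluations via central differences:

```python
import numpy as np

# Toy objective f(x) = 0.5 * x^T A x, so grad f(x) = A x and the exact
# Hessian-vector product is A @ v. A is an illustrative stand-in.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
A = A @ A.T  # symmetric

def grad(x):
    return A @ x

def hvp(grad_fn, x, v, eps=1e-5):
    """Hessian-vector product via central differences of the gradient.
    Only two gradients are evaluated; the d x d Hessian is never stored."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

x = rng.normal(size=5)
v = rng.normal(size=5)
approx = hvp(grad, x, v)
exact = A @ v
print(np.max(np.abs(approx - exact)))  # tiny numerical error
```

Even so, storing one such product per probe vector, or one gradient per training example, multiplies memory by the number of vectors kept, which is the bottleneck the abstract refers to.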
no code implementations • 19 May 2023 • Andrea Schioppa, Xavier Garcia, Orhan Firat
The recent rapid progress in pre-training Large Language Models has relied on using self-supervised language modeling objectives like next token prediction or span corruption.
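For concreteness, the next-token-prediction objective scores each position's logits against the token one step ahead. A minimal numpy sketch (toy vocabulary and logits, not any particular model's API):

```python
import numpy as np

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting token t+1 from position t.
    logits: (T, V) model outputs; tokens: (T,) integer token ids."""
    pred_logits = logits[:-1]   # positions 0..T-2 predict ...
    targets = tokens[1:]        # ... tokens 1..T-1 (the one-step shift)
    # Numerically stabilized log-softmax.
    z = pred_logits - pred_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Toy sequence over a 4-token vocabulary; all values illustrative.
tokens = np.array([0, 1, 2, 3, 2])
logits = np.zeros((5, 4))       # a maximally uncertain (uniform) model
loss = next_token_loss(logits, tokens)
print(loss)  # uniform predictions give exactly log(4)
```

Span corruption differs only in what is predicted: contiguous input spans are masked and the model reconstructs them, but the cross-entropy machinery is the same.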
2 code implementations • 6 Dec 2021 • Andrea Schioppa, Polina Zablotskaia, David Vilar, Artem Sokolov
We address efficient calculation of influence functions for tracking predictions back to the training data.
1 code implementation • 17 Sep 2019 • Andrea Schioppa
We report on an open-source implementation of distributed function minimization on top of Apache Spark using gradient and quasi-Newton methods.
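The quasi-Newton workhorse in this setting is typically L-BFGS, which approximates H⁻¹g from a short history of iterate/gradient differences via the two-loop recursion. A single-machine numpy sketch of that recursion (the Spark distribution layer is not reproduced here; the toy quadratic and step-size choice are illustrative):

```python
import numpy as np

def lbfgs_direction(g, pairs):
    """L-BFGS two-loop recursion: approximate H^{-1} g from recent
    (s, y) = (x_{k+1}-x_k, g_{k+1}-g_k) pairs, without forming H."""
    q = g.copy()
    alphas = []
    for s, yv in reversed(pairs):
        a = (s @ q) / (yv @ s)
        q = q - a * yv
        alphas.append(a)
    if pairs:
        s, yv = pairs[-1]
        q = q * (s @ yv) / (yv @ yv)      # standard initial scaling
    for (s, yv), a in zip(pairs, reversed(alphas)):
        b = (yv @ q) / (yv @ s)
        q = q + (a - b) * s
    return q

# Toy quadratic with eigenvalues <= 1, so a unit step length is safe.
H = np.diag([1.0, 0.5, 0.2])
b = np.array([1.0, -2.0, 0.5])
grad = lambda x: H @ x - b

x, pairs = np.zeros(3), []
g = grad(x)
for _ in range(50):
    if np.linalg.norm(g) < 1e-10:
        break
    d = lbfgs_direction(g, pairs[-10:])
    x_new = x - d
    g_new = grad(x_new)
    pairs.append((x_new - x, g_new - g))
    x, g = x_new, g_new
print(np.linalg.norm(grad(x)))  # near zero at the minimizer
```

In a distributed setting only the gradient computation (a sum over data partitions) needs the cluster; the two-loop recursion itself touches just a handful of vectors on the driver.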
1 code implementation • 4 Aug 2019 • Andrea Schioppa
We compare several approaches to learn an Optimal Map, represented as a neural network, between probability distributions.
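In one dimension the optimal map for squared cost has a closed form, T = F_target⁻¹ ∘ F_source, and its empirical version simply pairs sorted samples. The paper's setting is neural maps in higher dimension; the sketch below is only the 1-D reference solution such methods should recover:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
source = rng.uniform(size=n)     # samples from the source distribution
target = rng.normal(size=n)      # samples from the target distribution

# Empirical 1-D optimal map: send the i-th source order statistic to the
# i-th target order statistic (quantile matching, T = F_tgt^{-1} o F_src).
xs, ys = np.sort(source), np.sort(target)
T = lambda x: np.interp(x, xs, ys)   # piecewise-linear, monotone

moved = T(source)
# With equal sample sizes, the transported sample equals the target
# sample as a multiset, since interp is evaluated exactly at its knots.
print(np.allclose(np.sort(moved), ys))
```

Monotonicity is the hallmark of 1-D optimality; in higher dimensions the analogous property (being a gradient of a convex function, by Brenier's theorem) is what neural parameterizations try to enforce or approximate.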
no code implementations • 22 Oct 2018 • Andrea Schioppa
We study convergence properties of Stochastic Gradient Descent (SGD) for convex objectives without assumptions on smoothness or strict convexity.
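A minimal illustration of that regime (a generic subgradient-method sketch, not the paper's analysis): f(x) = |x| is convex but neither smooth nor strictly convex, yet noisy subgradient steps with eta_t ∝ 1/√t still drive a weighted average iterate toward the minimizer:

```python
import numpy as np

# Noisy subgradient descent on the nonsmooth convex f(x) = |x|.
rng = np.random.default_rng(3)
x, T = 5.0, 20000
avg, weight = 0.0, 0.0
for t in range(1, T + 1):
    g = np.sign(x) + 0.1 * rng.normal()  # stochastic subgradient
    eta = 0.5 / np.sqrt(t)               # no smoothness constant needed
    x -= eta * g
    avg += eta * x                       # step-size-weighted averaging
    weight += eta
x_bar = avg / weight
print(abs(x_bar))  # close to the minimizer x* = 0
```

The averaging matters: the last iterate keeps oscillating at the scale of the step size, while the weighted average smooths those oscillations out, which is the standard device in convergence proofs without smoothness.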