no code implementations • JEP/TALN/RECITAL 2021 • Alban Petit, Caio Corro
Variational autoencoders are generative models that are useful for learning latent representations.
no code implementations • JEP/TALN/RECITAL 2022 • Alban Petit, Caio Corro
We propose a new algorithm for graph-based semantic parsing via the generalized spanning arborescence problem.
no code implementations • EMNLP 2020 • Caio Corro
We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures.
no code implementations • JEP/TALN/RECITAL 2022 • Nicolas Devatine, Caio Corro, François Yvon
This article addresses the cross-lingual transfer of dependency parsers and studies methods for limiting the potentially harmful effect on transfer of word-order divergences between the source and target languages.
1 code implementation • 26 Mar 2024 • Santiago Herrera, Caio Corro, Sylvain Kahane
More specifically, we extract descriptions and rules across different languages for two linguistic phenomena, agreement and word order, using a large search space and paying special attention to the ranking order of the extracted rules.
no code implementations • 6 Mar 2024 • Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, Caio Corro, Andre F. T. Martins, Fabrizio Esposito, Vera Lúcia Raposo, Sofia Morgado, Michael Desa
In this paper, we introduce SaulLM-7B, a large language model (LLM) tailored for the legal domain.
1 code implementation • 1 Feb 2024 • Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.
no code implementations • 21 Oct 2023 • Alban Petit, Caio Corro, François Yvon
In many Natural Language Processing applications, neural networks have been found to fail to generalize on out-of-distribution examples.
no code implementations • 15 Feb 2023 • Alban Petit, Caio Corro
We propose a novel graph-based approach for semantic parsing that resolves two problems observed in the literature: (1) seq2seq models fail on compositional generalization tasks; (2) previous work using phrase structure parsers cannot cover all the semantic parses observed in treebanks.
no code implementations • 25 Jan 2023 • Caio Corro
In this paper, we prove that separable negative log-likelihood losses for structured prediction are not necessarily Bayes consistent, or, in other words, minimizing these losses may not result in a model that predicts the most probable structure in the data distribution for a given input.
no code implementations • 10 Oct 2022 • Caio Corro
Span-based nested named-entity recognition (NER) has a cubic-time complexity using a variant of the CYK algorithm.
no code implementations • 28 Oct 2021 • Alban Petit, Caio Corro
Variational autoencoders trained to minimize the reconstruction error are sensitive to the posterior collapse problem, that is, the proposal posterior distribution is always equal to the prior.
no code implementations • JEP/TALN/RECITAL 2020 • Caio Corro
Existing algorithms for graph-based deep dependency parsing that can guarantee the connectivity of the produced structures do not cover French corpora.
1 code implementation • 30 Mar 2020 • Caio Corro
We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures.
1 code implementation • ACL 2019 • Caio Corro, Ivan Titov
We treat projective dependency trees as latent variables in our probabilistic model and induce them in such a way as to be beneficial for a downstream task, without relying on any direct tree supervision.
no code implementations • ICLR 2019 • Caio Corro, Ivan Titov
Human annotation for syntactic parsing is expensive, and large resources are available only for a fraction of languages.
no code implementations • EMNLP 2017 • Caio Corro, Joseph Le Roux, Mathieu Lacroix
We present a new method for the joint task of tagging and non-projective dependency parsing.