no code implementations • ComputEL (ACL) 2022 • Séverine Guillaume, Guillaume Wisniewski, Cécile Macaire, Guillaume Jacques, Alexis Michaud, Benjamin Galliot, Maximin Coavoux, Solange Rossato, Minh-Châu Nguyên, Maxime Fily
This is a report on results obtained in the development of speech recognition tools intended to support linguistic documentation efforts.
Automatic Speech Recognition (ASR) +1
no code implementations • JEP/TALN/RECITAL 2021 • Guillaume Wisniewski, Lichao Zhou, Nicolas Ballier, François Yvon
This article presents the first results of an ongoing study on gender bias in training corpora and in neural translation systems.
no code implementations • JEP/TALN/RECITAL 2022 • Lichao Zhu, Guillaume Wisniewski, Nicolas Ballier, François Yvon
This work presents two series of experiments aimed at identifying the information flow in neural translation systems.
no code implementations • JEP/TALN/RECITAL 2022 • Bingzhi Li, Guillaume Wisniewski, Benoît Crabbé
This work addresses the question of the localization of the syntactic information encoded in transformer representations.
no code implementations • ACL 2022 • Bingzhi Li, Guillaume Wisniewski, Benoit Crabbé
This work addresses the question of the localization of syntactic information encoded in transformer representations.
no code implementations • WMT (EMNLP) 2021 • Nicolas Ballier, Dahn Cho, Bilal Faye, Zong-You Ke, Hanna Martikainen, Mojca Pecman, Guillaume Wisniewski, Jean-Baptiste Yunès, Lichao Zhu, Maria Zimina-Poirot
Experiment 2 uses OpenNMT to fine-tune the model.
no code implementations • WS (NoDaLiDa) 2019 • José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski
This work compares the performance achieved by Phrase-Based Statistical Machine Translation (PB-SMT) systems and attention-based Neural Machine Translation (NMT) systems when translating User-Generated Content (UGC), as encountered in social media, from French to English.
no code implementations • 8 Feb 2024 • Maxime Fily, Guillaume Wisniewski, Severine Guillaume, Gilles Adda, Alexis Michaud
We propose a new unsupervised method using ABX tests on audio recordings with carefully curated metadata to shed light on the type of information present in the representations.
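The core of an ABX test is a simple triplet comparison: given items A and B from two categories and a probe X from A's category, check whether X lies closer to A than to B in representation space. A minimal sketch, assuming synthetic vectors standing in for the real audio representations (all names, shapes, and data here are illustrative assumptions):

```python
import numpy as np

def abx_accuracy(cat_a, cat_b):
    """Fraction of (A, B, X) triplets, with X drawn from A's category,
    in which X is closer to A than to B (Euclidean distance)."""
    correct = total = 0
    for i, x in enumerate(cat_a):
        for j, a in enumerate(cat_a):
            if i == j:  # A and X must be distinct items
                continue
            for b in cat_b:
                correct += np.linalg.norm(x - a) < np.linalg.norm(x - b)
                total += 1
    return correct / total

# Toy representations for two metadata categories (e.g. two languages).
rng = np.random.default_rng(0)
cat_a = rng.normal(loc=-2.0, size=(10, 8))
cat_b = rng.normal(loc=+2.0, size=(10, 8))
print(abx_accuracy(cat_a, cat_b))  # close to 1.0: the categories are well separated
```

High ABX accuracy for a given metadata contrast suggests the representations encode that kind of information; chance level is 0.5.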
no code implementations • 24 Oct 2023 • Lina Conti, Guillaume Wisniewski
Numerous studies have demonstrated the ability of neural language models to learn various linguistic properties without direct supervision.
no code implementations • 29 May 2023 • Séverine Guillaume, Guillaume Wisniewski, Alexis Michaud
We use max-pooling to aggregate the neural representations from a "snippet-lect" (the speech in a 5-second audio snippet) to a "doculect" (the speech in a given resource), then to dialects and languages.
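The snippet-to-doculect aggregation can be sketched as follows (a minimal NumPy illustration; the dimensions and random embeddings are assumptions standing in for real model outputs):

```python
import numpy as np

# Hypothetical neural representations: one 256-dimensional vector per
# 5-second audio snippet ("snippet-lect") taken from the same resource.
rng = np.random.default_rng(0)
snippet_embeddings = rng.normal(size=(40, 256))  # 40 snippets x 256 dims

# Max-pooling over the snippet axis yields a single vector representing
# the whole resource (a "doculect"); doculect vectors can in turn be
# compared or aggregated at the dialect and language level.
doculect_embedding = snippet_embeddings.max(axis=0)
assert doculect_embedding.shape == (256,)
```

Max-pooling keeps, for each dimension, the strongest activation observed anywhere in the resource, so the doculect vector is invariant to the order and number of snippets.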
no code implementations • 23 Feb 2023 • Maureen de Seyssel, Marvin Lavechin, Hadrien Titeux, Arthur Thomas, Gwendal Virlet, Andrea Santos Revilla, Guillaume Wisniewski, Bogdan Ludusan, Emmanuel Dupoux
In the lexical task, the model needs to correctly distinguish between pauses inserted between words and within words.
1 code implementation • 8 Dec 2022 • Bingzhi Li, Guillaume Wisniewski, Benoît Crabbé
Long-distance agreement, as evidence for syntactic structure, is increasingly used to assess the syntactic generalization of Neural Language Models.
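Such evaluations typically use minimal pairs: a model passes if it scores the grammatical sentence above its ungrammatical twin despite an intervening attractor noun. A sketch of the paradigm, with a lookup table standing in for a real language-model score (the scorer and example sentences are illustrative assumptions):

```python
def agreement_accuracy(pairs, log_prob):
    """Fraction of (grammatical, ungrammatical) pairs where the model
    assigns a higher score to the grammatical sentence."""
    return sum(log_prob(good) > log_prob(bad) for good, bad in pairs) / len(pairs)

# Attractor nouns ("cabinet", "books") intervene between subject and verb.
pairs = [
    ("the keys to the cabinet are on the table",
     "the keys to the cabinet is on the table"),
    ("the author of the books writes well",
     "the author of the books write well"),
]

# Stand-in scorer: a lookup table playing the role of an LM log-probability;
# a real evaluation would query an actual neural language model here.
toy_scores = {good: -10.0 for good, _ in pairs}
toy_scores.update({bad: -12.0 for _, bad in pairs})

print(agreement_accuracy(pairs, toy_scores.get))  # → 1.0
```

Accuracy well above chance on pairs with multiple attractors is taken as evidence that the model tracks hierarchical structure rather than surface proximity.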
no code implementations • 27 Jun 2022 • Maureen de Seyssel, Guillaume Wisniewski, Emmanuel Dupoux
According to the Language Familiarity Effect (LFE), people are better at discriminating between speakers of their native language.
no code implementations • 30 Mar 2022 • Maureen de Seyssel, Marvin Lavechin, Yossi Adi, Emmanuel Dupoux, Guillaume Wisniewski
Language information, however, is very salient in the bilingual model only, suggesting CPC models learn to discriminate languages when trained on multiple languages.
no code implementations • EMNLP (BlackboxNLP) 2021 • Guillaume Wisniewski, Lichao Zhu, Nicolas Ballier, François Yvon
This paper aims at identifying the information flow in state-of-the-art machine translation systems, taking as example the transfer of gender when translating from French into English.
no code implementations • 25 Feb 2022 • Aurélien Max, Guillaume Wisniewski
Naturally-occurring instances of linguistic phenomena are important both for training and for evaluating automatic processes on text.
1 code implementation • WNUT (ACL) 2021 • José Carlos Rosales Núñez, Guillaume Wisniewski, Djamé Seddah
This work explores the capacities of character-based Neural Machine Translation to translate noisy User-Generated Content (UGC), with a strong focus on exploring the limits of such approaches in handling productive UGC phenomena, which, almost by definition, cannot be seen at training time.
no code implementations • WNUT (ACL) 2021 • José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski
This work takes a critical look at the evaluation of user-generated content automatic translation, the well-known specificities of which raise many challenges for MT.
no code implementations • EMNLP 2021 • Bingzhi Li, Guillaume Wisniewski, Benoit Crabbé
Many recent works have demonstrated that unsupervised sentence representations of neural networks encode syntactic information by observing that neural language models are able to predict the agreement between a verb and its subject.
1 code implementation • EACL 2021 • Bingzhi Li, Guillaume Wisniewski
We evaluate the ability of BERT embeddings to represent tense information, taking French and Chinese as a case study.
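The usual way to test whether embeddings "represent" a property is a probing classifier: if a simple classifier trained on the embeddings recovers the labels, the property is linearly encoded. A minimal nearest-centroid sketch on synthetic data (the vectors below are assumptions; the study probes real BERT embeddings):

```python
import numpy as np

# Synthetic stand-ins for sentence embeddings of two tense classes.
rng = np.random.default_rng(1)
past = rng.normal(loc=-1.0, size=(50, 16))   # toy "past tense" vectors
pres = rng.normal(loc=+1.0, size=(50, 16))   # toy "present tense" vectors

# Nearest-centroid probe: one centroid per class.
c_past, c_pres = past.mean(axis=0), pres.mean(axis=0)

def predict(x):
    return "past" if np.linalg.norm(x - c_past) < np.linalg.norm(x - c_pres) else "present"

preds = [predict(x) for x in past] + [predict(x) for x in pres]
gold = ["past"] * len(past) + ["present"] * len(pres)
accuracy = np.mean([p == g for p, g in zip(preds, gold)])
print(accuracy)  # near 1.0 when the property is linearly recoverable
```

A real probe would evaluate on held-out data and compare against control baselines; this only illustrates the idea.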
no code implementations • ComputEL 2021 • Oliver Adams, Benjamin Galliot, Guillaume Wisniewski, Nicholas Lambourne, Ben Foley, Rahasya Sanders-Dwyer, Janet Wiles, Alexis Michaud, Séverine Guillaume, Laurent Besacier, Christopher Cox, Katya Aplonova, Guillaume Jacques, Nathan Hill
This paper reports on progress integrating the speech recognition toolkit ESPnet into Elpis, a web front-end originally designed to provide access to the Kaldi automatic speech recognition toolkit.
Automatic Speech Recognition (ASR) +1
no code implementations • JEP/TALN/RECITAL 2020 • Alexis Michaud, Oliver Adams, Séverine Guillaume, Guillaume Wisniewski
Automatic speech recognition systems now achieve high degrees of accuracy from a training corpus limited to two or three hours of transcribed recordings (for a single-speaker system).
no code implementations • LREC 2020 • Guillaume Wisniewski, Séverine Guillaume, Alexis Michaud
What is at stake is the accessibility of language archive data for a range of NLP tasks and beyond.
Automatic Speech Recognition (ASR) +1
no code implementations • WS 2019 • José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski
We present an approach to correcting noisy User-Generated Content (UGC) in French, aiming to produce a preprocessing pipeline that improves Machine Translation for this kind of non-canonical corpus.
no code implementations • JEP/TALN/RECITAL 2019 • Guillaume Wisniewski
The goal of this work is to present several observations on the evaluation of morphosyntactic analyzers for French, aiming to call into question the usual statistical-learning framework in which the test and training sets are fixed arbitrarily and independently of the model under consideration.
no code implementations • NAACL 2019 • Guillaume Wisniewski, François Yvon
The performance of Part-of-Speech tagging varies significantly across the treebanks of the Universal Dependencies project.
no code implementations • COLING 2018 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Not all dependencies are equal when training a dependency parser: some are straightforward enough to be learned with only a sample of data, others embed more complexity.
no code implementations • NAACL 2018 • Marianna Apidianaki, Guillaume Wisniewski, Anne Cocos, Chris Callison-Burch
We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning equivalent expressions.
no code implementations • NAACL 2018 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Because the most common transition systems are projective, training a transition-based dependency parser often implies either ignoring or rewriting the non-projective training examples, which has an adverse impact on accuracy.
no code implementations • NAACL 2018 • Guillaume Wisniewski, Ophélie Lacroix, François Yvon
This work introduces a new strategy to compare the numerous conventions that have been proposed over the years for expressing dependency structures and discover the one for which a parser will achieve the highest parsing performance.
no code implementations • JEP/TALN/RECITAL 2018 • José Carlos Rosales Núñez, Guillaume Wisniewski
Code-switching is the phenomenon of alternating between languages within the same conversation or even the same sentence.
no code implementations • JEP/TALN/RECITAL 2018 • Guillaume Wisniewski, François Yvon
This work shows that the performance drop often observed when a morphosyntactic analyzer is applied to out-of-domain data frequently results from inconsistencies between the annotations of the test and training sets.
no code implementations • CONLL 2017 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
This paper describes LIMSI's submission to the CoNLL 2017 UD Shared Task, which focuses on small treebanks and on improving low-resource parsing solely through the ad hoc combination of multiple views and resources.
no code implementations • JEP/TALN/RECITAL 2017 • Éléonor Bartenlian, Margot Lacour, Matthieu Labeau, Alexandre Allauzen, Guillaume Wisniewski, François Yvon
This work seeks to understand why the performance of a morphosyntactic analyzer drops sharply when it is used on out-of-domain data.
no code implementations • EACL 2017 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
This paper formalizes a sound extension of dynamic oracles to global training, in the frame of transition-based dependency parsers.
no code implementations • COLING 2016 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
This paper studies cross-lingual transfer for dependency parsing, focusing on very low-resource settings where delexicalized transfer is the only fully automatic option.
no code implementations • JEP/TALN/RECITAL 2016 • Ophélie Lacroix, Lauriane Aufrant, Guillaume Wisniewski, François Yvon
This article presents a simple method for cross-lingual dependency transfer.
no code implementations • JEP/TALN/RECITAL 2016 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
In this article, we propose three simple improvements for globally training arc-eager transition-based dependency parsers: a non-deterministic oracle, restarting on the same example after an update, and training in sub-optimal configurations.
no code implementations • JEP/TALN/RECITAL 2016 • Rachel Bawden, Guillaume Wisniewski, Hélène Maynard
In this paper we investigate the impact of the integration of context into dialogue translation.
no code implementations • LREC 2016 • Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Because of the small size of Romanian corpora, the performance of a PoS tagger or a dependency parser trained with standard supervised methods falls far short of the performance achieved in most languages.
no code implementations • JEP/TALN/RECITAL 2015 • Nicolas Pécheux, Alexandre Allauzen, Thomas Lavergne, Guillaume Wisniewski, François Yvon
When prior knowledge about the possible outputs of a labeling problem is available, it seems desirable to include this information during learning, to simplify the modeling task and speed up processing.
no code implementations • JEP/TALN/RECITAL 2015 • Elena Knyazeva, Guillaume Wisniewski, François Yvon
Thanks to the link we establish between structured learning and reinforcement learning, we are able to propose a theoretically well-founded method for learning approximate inference. The experiments we carry out on four NLP tasks validate the proposed approach.
no code implementations • JEP/TALN/RECITAL 2014 • Guillaume Wisniewski, Nicolas Pécheux, Elena Knyazeva, Alexandre Allauzen, François Yvon
no code implementations • LREC 2014 • Guillaume Wisniewski, Natalie Kübler, François Yvon
In this paper, we present a freely available corpus of automatic translations accompanied with post-edited versions, annotated with labels identifying the different kinds of errors made by the MT system.