no code implementations • JEP/TALN/RECITAL 2022 • Merieme Bouhandi, Emmanuel Morin, Thierry Hamon
Moreover, these models often neglect global vocabulary information in favor of a heavier reliance on attention.
no code implementations • NAACL (DLG4NLP) 2022 • Merieme Bouhandi, Emmanuel Morin, Thierry Hamon
Language models encode linguistic properties and are used as input for more specific models.
no code implementations • LREC (BUCC) 2022 • Martin Laville, Emmanuel Morin, Philippe Langlais
With numerous new methods proposed recently, the evaluation of Bilingual Lexicon Induction has been quite hazardous and inconsistent across works.
no code implementations • LREC 2022 • Kévin Espasa, Emmanuel Morin, Olivier Hamon
The second objective is to rank these substitutes using the context of the sentence.
no code implementations • LREC 2022 • Omar Adjali, Emmanuel Morin, Pierre Zweigenbaum
To that end, we exploit parallel corpora to perform automatic bilingual MWT extraction and comparable corpus construction.
1 code implementation • 26 Feb 2024 • Adrien Bazoge, Emmanuel Morin, Beatrice Daille, Pierre-Antoine Gourraud
Recently, pretrained language models based on BERT have been introduced for the French biomedical domain.
1 code implementation • 20 Feb 2024 • Yanis Labrak, Adrien Bazoge, Oumaima El Khettari, Mickael Rouvier, Pacome Constant dit Beaufils, Natalia Grabar, Beatrice Daille, Solen Quiniou, Emmanuel Morin, Pierre-Antoine Gourraud, Richard Dufour
This limitation hampers the evaluation of the latest French biomedical models, as they are either assessed on a minimal number of tasks with non-standardized protocols or evaluated using general downstream tasks.
no code implementations • 15 Feb 2024 • Yanis Labrak, Adrien Bazoge, Emmanuel Morin, Pierre-Antoine Gourraud, Mickael Rouvier, Richard Dufour
This marks the first large-scale multilingual evaluation of LLMs in the medical domain.
1 code implementation • LOUHI 2022 • Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for the medical domain.
no code implementations • 3 Apr 2023 • Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks.
no code implementations • COLING 2020 • Martin Laville, Amir Hazem, Emmanuel Morin, Philippe Langlais
In this paper, we contrast several data selection techniques to improve bilingual lexicon induction from specialized comparable corpora.
no code implementations • JEPTALNRECITAL 2020 • Antoine Caubrière, Sophie Rosset, Yannick Estève, Antoine Laurent, Emmanuel Morin
The most recent data available for structured named entity recognition (NER) from French speech come from the 2012 ETAPE evaluation campaign.
no code implementations • LREC 2020 • Martin Laville, Amir Hazem, Emmanuel Morin
This paper describes the participation of the TALN/LS2N system in the Building and Using Comparable Corpora (BUCC) shared task.
no code implementations • LREC 2020 • Antoine Caubrière, Sophie Rosset, Yannick Estève, Antoine Laurent, Emmanuel Morin
For this type of system, we propose an original 3-pass approach.
Automatic Speech Recognition (ASR)
no code implementations • 29 Sep 2019 • Natalia Tomashenko, Antoine Caubriere, Yannick Esteve, Antoine Laurent, Emmanuel Morin
This work investigates spoken language understanding (SLU) systems in the scenario where semantic information is extracted directly from the speech signal by a single end-to-end neural network model.
no code implementations • 30 Jul 2019 • Basma El Amel Boussaha, Nicolas Hernandez, Christine Jacquin, Emmanuel Morin
Building dialogue systems that converse naturally with humans is an attractive and active research domain.
no code implementations • JEPTALNRECITAL 2019 • Antoine Caubrière, Natalia Tomashenko, Yannick Estève, Antoine Laurent, Emmanuel Morin
The results show the benefit of using named entity data, which yields a relative gain of up to 6.5%.
no code implementations • 18 Jun 2019 • Antoine Caubrière, Natalia Tomashenko, Antoine Laurent, Emmanuel Morin, Nathalie Camelin, Yannick Estève
We present an end-to-end approach to extract semantic concepts directly from the speech audio signal.
1 code implementation • COLING 2018 • Jingshu Liu, Emmanuel Morin, Sebastián Peña Saldarriaga
Extracting a bilingual terminology for multi-word terms from comparable corpora has not been widely researched.
no code implementations • COLING 2018 • Adeline Granet, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Christian Viard-Gaudin
We obtain a 97.42% Character Recognition Rate and an 86.57% Word Recognition Rate on our Italian Comedy data, despite a lexical coverage of 67% between the Italian Comedy data and the training data.
no code implementations • COLING 2018 • Amir Hazem, Emmanuel Morin
For that purpose, we propose the first systematic evaluation of different word embedding models for bilingual terminology extraction from specialized comparable corpora.
no code implementations • 30 May 2018 • Sahar Ghannay, Antoine Caubrière, Yannick Estève, Antoine Laurent, Emmanuel Morin
Until now, NER from speech has been performed through a pipeline that first runs automatic speech recognition (ASR) on the audio and then applies NER to the ASR output.
Automatic Speech Recognition (ASR)
no code implementations • LREC 2018 • Adeline Granet, Benjamin Hervy, Geoffrey Roman-Jimenez, Marouane Hachicha, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Guillaume Raschia, Françoise Rubellin, Christian Viard-Gaudin
no code implementations • JEPTALNRECITAL 2018 • Jingshu Liu, Emmanuel Morin, Sebastián Peña Saldarriaga
In this paper, we propose an adaptation of the extended compositional approach that can align terms of variable length from comparable corpora by modifying the representation of complex terms.
no code implementations • JEPTALNRECITAL 2018 • Adeline Granet, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Christian Viard-Gaudin
We obtain 97.27% correctly recognized characters on the Comédie-Italienne data and 86.57% correctly generated words, despite a coverage of only 67.58% between the Comédie-Italienne data and the training set.
no code implementations • JEPTALNRECITAL 2018 • Basma El Amel Boussaha, Nicolas Hernandez, Christine Jacquin, Emmanuel Morin
By relying on the semantic similarity between the context and the response, our approach learns to better distinguish good responses from bad ones.
no code implementations • IJCNLP 2017 • Amir Hazem, Emmanuel Morin
Bilingual lexicon extraction from comparable corpora is constrained by the small amount of available data when dealing with specialized domains.
no code implementations • WS 2017 • Rémi Bois, Guillaume Gravier, Eric Jamet, Emmanuel Morin, Pascale Sébillot, Maxime Robert
Faced with ever-growing news archives, media professionals are in need of advanced tools to explore the information surrounding specific events.
no code implementations • COLING 2016 • Amir Hazem, Emmanuel Morin
Comparable corpora are the main alternative to the use of parallel corpora to extract bilingual lexicons.
no code implementations • JEPTALNRECITAL 2016 • Soufian Salim, Nicolas Hernandez, Emmanuel Morin
Further experiments are detailed, and we report the results obtained with different approaches and different features on the different parts of our multimodal corpus.
no code implementations • JEPTALNRECITAL 2016 • Alexis Linard, Emmanuel Morin, Béatrice Daille
Previous work on bilingual lexicon extraction from parallel corpora has shown that using more than two languages can help improve the quality of the extracted alignments.
no code implementations • JEPTALNRECITAL 2016 • Joseph Lark, Emmanuel Morin, Sebastián Peña Saldarriaga
In corpora of French customer reviews, we detect opinion expressions that contain no explicitly positive or negative opinion marker.
no code implementations • LREC 2016 • Amir Hazem, Emmanuel Morin
There is a rich flora of word space models that have proven their efficiency in many different applications including information retrieval (Dumais, 1988), word sense disambiguation (Schutze, 1992), various semantic knowledge tests (Lund et al., 1995; Karlgren, 2001), and text categorization (Sahlgren, 2005).
no code implementations • JEPTALNRECITAL 2015 • Firas Hmida, Emmanuel Morin, Béatrice Daille
Terminology banks and dictionaries are valuable resources that facilitate access to knowledge in specialized domains. These resources are often rather sparse and do not always offer, for a given term, examples that convey its meaning and usage.
no code implementations • JEPTALNRECITAL 2015 • Remi Bois, Guillaume Gravier, Emmanuel Morin, Pascale Sébillot
We present a typology of links for a multimedia corpus anchored in the journalistic domain.
no code implementations • JEPTALNRECITAL 2015 • Joseph Lark, Emmanuel Morin, Sebastián Peña Saldarriaga
Aspect-based sentiment analysis has attracted particular interest in recent years, as reflected in the topics of recent evaluation campaigns such as SemEval 2014 and 2015 or DEFT 2015.
Aspect-Based Sentiment Analysis (ABSA)
no code implementations • LREC 2012 • Amir Hazem, Emmanuel Morin
One of the main resources used for bilingual lexicon extraction from comparable corpora is the bilingual dictionary, which is considered a bridge between two languages.