Search Results for author: Cyril Goutte

Found 23 papers, 1 paper with code

N-gram and Neural Models for Uralic Language Identification: NRC at VarDial 2021

no code implementations • EACL (VarDial) 2021 • Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2021 VarDial evaluation campaign.

Language Identification

Challenges in Neural Language Identification: NRC at VarDial 2020

no code implementations • VarDial (COLING) 2020 • Gabriel Bernier-Colborne, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2020 VarDial evaluation campaign.

Language Identification

Transfer Learning Improves French Cross-Domain Dialect Identification: NRC @ VarDial 2022

no code implementations • VarDial (COLING) 2022 • Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the French Cross-Domain Dialect Identification shared task at the 2022 VarDial evaluation campaign.

Dialect Identification Language Modelling +1

Refining an Almost Clean Translation Memory Helps Machine Translation

no code implementations • AMTA 2022 • Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.

Machine Translation Translation

Human or Neural Translation?

no code implementations • COLING 2020 • Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

Deep neural models tremendously improved machine translation.

Machine Translation Translation

Improving Cuneiform Language Identification with BERT

no code implementations • WS 2019 • Gabriel Bernier-Colborne, Cyril Goutte, Serge Léger

We describe the systems developed by the National Research Council Canada for the Cuneiform Language Identification (CLI) shared task at the 2019 VarDial evaluation campaign.

Language Identification

Accurate semantic textual similarity for cleaning noisy parallel corpora using semantic machine translation evaluation metric: The NRC supervised submissions to the Parallel Corpus Filtering task

no code implementations • WS 2018 • Chi-kiu Lo, Michel Simard, Darlene Stewart, Samuel Larkin, Cyril Goutte, Patrick Littell

We present our semantic textual similarity approach in filtering a noisy web crawled parallel corpus using YiSi, a novel semantic machine translation evaluation metric.

Machine Translation Semantic Textual Similarity +1

Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task

no code implementations • WS 2018 • Patrick Littell, Samuel Larkin, Darlene Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo

The WMT18 shared task on parallel corpus filtering (Koehn et al., 2018b) challenged teams to score sentence pairs from a large high-recall, low-precision web-scraped parallel corpus (Koehn et al., 2018a).

Anomaly Detection Machine Translation +1

Real-time Change Point Detection using On-line Topic Models

no code implementations • COLING 2018 • Yunli Wang, Cyril Goutte

Detecting changes within an unfolding event in real time from news articles or social media makes it possible to react promptly to serious issues in public safety, public health or natural disasters.

Change Point Detection Time Series Analysis +1

EuroGames16: Evaluating Change Detection in Online Conversation

1 code implementation • LREC 2018 • Cyril Goutte, Yunli Wang, Fangming Liao, Zachary Zanussi, Samuel Larkin, Yuri Grinberg

Change Detection Time Series Analysis

Exploring Optimal Voting in Native Language Identification

no code implementations • WS 2017 • Cyril Goutte, Serge Léger

We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems.

Native Language Identification

Detecting Changes in Twitter Streams using Temporal Clusters of Hashtags

no code implementations • WS 2017 • Yunli Wang, Cyril Goutte

Detecting events from social media data has important applications in public security, political issues, and public health.

Change Detection Change Point Detection +2

Extracting Discriminative Keyphrases with Learned Semantic Hierarchies

no code implementations • COLING 2016 • Yunli Wang, Yong Jin, Xiaodan Zhu, Cyril Goutte

We show that such knowledge can be used to construct better discriminative keyphrase extraction systems that do not assume a static, fixed set of keyphrases for a document.

Keyphrase Extraction Specificity