Search Results for author: Cyril Goutte

Found 23 papers, 1 paper with code

N-gram and Neural Models for Uralic Language Identification: NRC at VarDial 2021

no code implementations EACL (VarDial) 2021 Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2021 VarDial evaluation campaign.

Language Identification

Challenges in Neural Language Identification: NRC at VarDial 2020

no code implementations VarDial (COLING) 2020 Gabriel Bernier-Colborne, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2020 VarDial evaluation campaign.

Language Identification

Transfer Learning Improves French Cross-Domain Dialect Identification: NRC @ VarDial 2022

no code implementations VarDial (COLING) 2022 Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte

We describe the systems developed by the National Research Council Canada for the French Cross-Domain Dialect Identification shared task at the 2022 VarDial evaluation campaign.

Dialect Identification Language Modelling +1

Refining an Almost Clean Translation Memory Helps Machine Translation

no code implementations AMTA 2022 Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.

Machine Translation Translation

Improving Cuneiform Language Identification with BERT

no code implementations WS 2019 Gabriel Bernier-Colborne, Cyril Goutte, Serge Léger

We describe the systems developed by the National Research Council Canada for the Cuneiform Language Identification (CLI) shared task at the 2019 VarDial evaluation campaign.

Language Identification

Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task

no code implementations WS 2018 Patrick Littell, Samuel Larkin, Darlene Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo

The WMT18 shared task on parallel corpus filtering (Koehn et al., 2018b) challenged teams to score sentence pairs from a large high-recall, low-precision web-scraped parallel corpus (Koehn et al., 2018a).

Anomaly Detection Machine Translation +1

Real-time Change Point Detection using On-line Topic Models

no code implementations COLING 2018 Yunli Wang, Cyril Goutte

Detecting changes within an unfolding event in real time from news articles or social media makes it possible to react promptly to serious issues in public safety, public health or natural disasters.

Change Point Detection Time Series Analysis +1

Exploring Optimal Voting in Native Language Identification

no code implementations WS 2017 Cyril Goutte, Serge Léger

We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems.

Native Language Identification

Detecting Changes in Twitter Streams using Temporal Clusters of Hashtags

no code implementations WS 2017 Yunli Wang, Cyril Goutte

Detecting events from social media data has important applications in public security, political issues, and public health.

Change Detection Change Point Detection +2

Extracting Discriminative Keyphrases with Learned Semantic Hierarchies

no code implementations COLING 2016 Yunli Wang, Yong Jin, Xiaodan Zhu, Cyril Goutte

We show that such knowledge can be used to construct better discriminative keyphrase extraction systems that do not assume a static, fixed set of keyphrases for a document.

Keyphrase Extraction Specificity

Advances in Ngram-based Discrimination of Similar Languages

no code implementations WS 2016 Cyril Goutte, Serge Léger

We describe the systems entered by the National Research Council in the 2016 shared task on discriminating similar languages.

Discriminating Similar Languages: Evaluations and Explorations

no code implementations LREC 2016 Cyril Goutte, Serge Léger, Shervin Malmasi, Marcos Zampieri

We present an analysis of the performance of machine learning classifiers on discriminating between similar languages and language varieties.

BIG-bench Machine Learning
