Search Results for author: Jelmer Van der Linde

Found 5 papers, 2 papers with code

The EuroPat Corpus: A Parallel Corpus of European Patent Data

no code implementations • LREC 2022 • Kenneth Heafield, Elaine Farrow, Jelmer Van der Linde, Gema Ramírez-Sánchez, Dion Wiggins

We present the EuroPat corpus of patent-specific parallel data for 6 official European languages paired with English: German, Spanish, French, Croatian, Norwegian, and Polish.

Machine Translation • Translation

Efficient Machine Translation with Model Pruning and Quantization

1 code implementation • WMT (EMNLP) 2021 • Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer Van der Linde, Pinzhen Chen, Sidharth Kashyap, Roman Grundkiewicz

We participated in all tracks of the WMT 2021 efficient machine translation task: single-core CPU, multi-core CPU, and GPU hardware with throughput and latency conditions.

Knowledge Distillation • Machine Translation • +2

A New Massive Multilingual Dataset for High-Performance Language Technologies

no code implementations • 20 Mar 2024 • Ona de Gibert, Graeme Nail, Nikolay Arefyev, Marta Bañón, Jelmer Van der Linde, Shaoxiong Ji, Jaume Zaragoza-Bernabeu, Mikko Aulamo, Gema Ramírez-Sánchez, Andrey Kutuzov, Sampo Pyysalo, Stephan Oepen, Jörg Tiedemann

We present the HPLT (High Performance Language Technologies) language resources, a new massive multilingual dataset including both monolingual and bilingual corpora extracted from CommonCrawl and previously unused web crawls from the Internet Archive.

Language Modelling • Machine Translation • +2

OpusCleaner and OpusTrainer, open source toolkits for training Machine Translation and Large language models

2 code implementations • 24 Nov 2023 • Nikolay Bogoychev, Jelmer Van der Linde, Graeme Nail, Barry Haddow, Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Lukas Weymann, Tudor Nicolae Mateiu, Jindřich Helcl, Mikko Aulamo

Developing high-quality machine translation systems is a labour-intensive, challenging and confusing process for newcomers to the field.

Data Augmentation • Machine Translation • +2

TranslateLocally: Blazing-fast translation running on the local CPU

no code implementations • EMNLP (ACL) 2021 • Nikolay Bogoychev, Jelmer Van der Linde, Kenneth Heafield

Every day, millions of people sacrifice their privacy and browsing habits in exchange for online machine translation.

Machine Translation • Translation
