WMT 2016 Biomedical Dataset | Papers With Code

The Biomedical Translation Shared Task was first introduced at the First Conference on Machine Translation (WMT16). The task evaluates systems that translate biomedical titles and abstracts from scientific publications. The data covers three language pairs (English ↔ Portuguese, English ↔ Spanish, English ↔ French) and two sub-domains: biological sciences and health sciences.

The training data consists mainly of the Scielo corpus, a parallel collection of scientific publications retrieved from the Scielo database, in which each document comprises a title, an abstract, or both. Parallel Scielo documents are provided for all language pairs in both sub-domains, except English ↔ French, where only health was covered because too few parallel biology documents were available for that pair. The training data was aligned using the GMA alignment tool. Additionally, a corpus of parallel titles from MEDLINE® for all three language pairs was provided, along with monolingual documents for the four languages, retrieved from the Scielo database. The monolingual documents are those in the Scielo database that have no corresponding document in another language.
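The coverage of the Scielo training corpus described above can be enumerated in a few lines. This is only an illustrative sketch: the pair codes (`en-pt`, `en-es`, `en-fr`) and sub-domain labels (`biological`, `health`) are hypothetical identifiers chosen here, not file or directory names from the official release.

```python
from itertools import product

# Hypothetical labels for illustration; not the official corpus file names.
LANGUAGE_PAIRS = ["en-pt", "en-es", "en-fr"]
SUB_DOMAINS = ["biological", "health"]

# Per the description, no parallel Scielo biology data exists for English-French.
EXCLUDED = {("en-fr", "biological")}


def scielo_training_subsets():
    """Return the (language pair, sub-domain) combinations with parallel Scielo data."""
    return [
        (pair, domain)
        for pair, domain in product(LANGUAGE_PAIRS, SUB_DOMAINS)
        if (pair, domain) not in EXCLUDED
    ]


if __name__ == "__main__":
    for pair, domain in scielo_training_subsets():
        print(pair, domain)
```

Under these assumptions the function yields five combinations: both sub-domains for English ↔ Portuguese and English ↔ Spanish, and health only for English ↔ French.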

The test set consisted of 500 documents (title and abstract) for each of the two directions of each language pair. None of the test documents was included in the training data, and no document appears in more than one test set across language pairs, translation directions, and sub-domains.

Source: [http://www.statmt.org/wmt16/index.html](http://www.statmt.org/wmt16/index.html)
Image Source: [https://www.aclweb.org/anthology/W16-2301.pdf](https://www.aclweb.org/anthology/W16-2301.pdf)

---

WMT 2016 Biomedical (WMT 2016 Biomedical Translation Task)

Similar Datasets

WMT 2016 News

WMT 2016 IT
