RobBERT: a Dutch RoBERTa-based Language Model

Findings of the Association for Computational Linguistics 2020 · Pieter Delobelle, Thomas Winters, Bettina Berendt

Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. One of the most prominent pre-trained language models is BERT, which was released as an English as well as a multilingual version. Although multilingual BERT performs well on many tasks, recent studies show that BERT models trained on a single language significantly outperform the multilingual version. Training a Dutch BERT model thus has a lot of potential for a wide range of Dutch NLP tasks. While previous approaches have used earlier implementations of BERT to train a Dutch version of BERT, we used RoBERTa, a robustly optimized BERT approach, to train a Dutch language model called RobBERT. We measured its performance on various tasks as well as the importance of the fine-tuning dataset size. We also evaluated the importance of language-specific tokenizers and the model's fairness. We found that RobBERT improves state-of-the-art results for various tasks, and outperforms other models by an especially large margin when dealing with smaller datasets. These results indicate that it is a powerful pre-trained model for a large variety of Dutch language tasks. The pre-trained and fine-tuned models are publicly available to support further downstream Dutch NLP applications.

Code

iPieter/RobBERT (official) · 191 stars

Tasks

Fairness

Language Modelling

Sentiment Analysis

Datasets

CoNLL 2002 • DBRD

Results from the Paper

Ranked #1 on Sentiment Analysis on DBRD

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Sentiment Analysis	DBRD	RobBERT v2	Accuracy	95.144%	#1
Sentiment Analysis	DBRD	RobBERT v2	F1	95.144%	#1
Sentiment Analysis	DBRD	RobBERT	Accuracy	94.422%	#2
Sentiment Analysis	DBRD	RobBERT	F1	94.422%	#2
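
The table lists the same value for Accuracy and F1 for each model. One possible explanation, which is an assumption here and is not stated on this page, is that the F1 score is micro-averaged: for single-label classification, micro-averaged F1 is always equal to accuracy. The minimal Python sketch below illustrates that identity on hypothetical toy labels (not DBRD data); the function names are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # correctly predicted this class
            elif p == label and t != label:
                fp += 1  # predicted this class, gold label was another one
            elif p != label and t == label:
                fn += 1  # gold label was this class, prediction was another one
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for a binary sentiment task (not taken from DBRD).
y_true = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "pos", "pos", "neg"]

acc = accuracy(y_true, y_pred)
f1 = micro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} micro_f1={f1:.3f}")
```

Every misclassified example counts once as a false positive (for the predicted class) and once as a false negative (for the gold class), so pooled precision and recall both reduce to the share of correct predictions. A macro-averaged or positive-class-only F1 would generally differ from accuracy.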

Methods

Adam • Attention Dropout • BERT • Dense Connections • Dropout • GELU • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Residual Connection • RoBERTa • Scaled Dot-Product Attention • Softmax • Weight Decay • WordPiece
