GePaDe

GePaDe contains 265 speeches (over 200,000 tokens) from the German Bundestag, primarily from the 19th legislative term (2017-2021), delivered by 195 distinct speakers from 6 political parties.

The data is annotated for speaker attribution, a semantic role labeling task that identifies who said what to whom. Annotators marked the cues (triggers) that signal events of speech, writing, or thought. For each trigger they also marked its arguments (roles): the SOURCE, ADDRESSEE, MESSAGE, MEDIUM, TOPIC, and EVIDENCE of the speech event.
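The annotation scheme described above can be pictured as one record per speech event: a cue plus a mapping from role labels to token spans. The sketch below is only an illustration. The class name `SpeechEvent`, its fields, and the English example sentence are hypothetical and do not reflect the dataset's actual file format; only the six role labels come from the description.

```python
from dataclasses import dataclass, field

# The six role labels named in the dataset description.
ROLE_LABELS = ("SOURCE", "ADDRESSEE", "MESSAGE", "MEDIUM", "TOPIC", "EVIDENCE")


@dataclass
class SpeechEvent:
    """One speech, writing, or thought event (hypothetical in-memory layout)."""

    cue: list[int]  # token indices of the trigger
    roles: dict[str, list[int]] = field(default_factory=dict)

    def add_role(self, label: str, token_indices: list[int]) -> None:
        # Reject labels that are not part of the annotation scheme.
        if label not in ROLE_LABELS:
            raise ValueError(f"unknown role label: {label}")
        self.roles[label] = list(token_indices)

    def role_text(self, label: str, tokens: list[str]) -> str:
        # Join the tokens of a role; an unannotated role gives an empty string.
        return " ".join(tokens[i] for i in self.roles.get(label, []))


# Invented example sentence, not taken from the corpus.
tokens = ["The", "minister", "said", "that", "taxes", "will", "fall", "."]
event = SpeechEvent(cue=[2])                 # "said" is the cue
event.add_role("SOURCE", [0, 1])             # who: "The minister"
event.add_role("MESSAGE", [3, 4, 5, 6])      # what: "that taxes will fall"

print(event.role_text("SOURCE", tokens))
print(event.role_text("MESSAGE", tokens))
```

Storing token indices rather than strings keeps discontinuous spans representable and lets the same token list back several events.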

The dataset was introduced in the international GermEval 2023 Shared Task on Speaker Attribution in Newswire and Parliamentary Debates (SpkAtt-2023) to evaluate the quality of systems for automated identification of cues and associated roles.
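As a rough illustration of how cue identification might be scored, the sketch below computes precision, recall, and F1 over exact matches between gold and predicted cues. This is an assumption made for illustration, not the shared task's official scorer. The function name and the `(sentence_id, token_index)` cue encoding are hypothetical, and the example values are invented.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall, and F1 between two collections of items."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical cues encoded as (sentence_id, token_index) pairs.
gold_cues = {(0, 2), (0, 9), (1, 4)}
predicted_cues = {(0, 2), (1, 4), (1, 7)}

p, r, f = precision_recall_f1(gold_cues, predicted_cues)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The same function could be applied to roles by encoding each item as a (cue, role label, span) tuple, again under the exact-match assumption.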

Reference

Rehbein, I. et al. Overview of the GermEval 2023 Shared Task on Speaker Attribution in Newswire and Parliamentary Debates. https://github.com/umanlp/SpkAtt-2023/blob/master/doc/SpkAtt2023-proceedings.pdf

License


  • Unknown
