Search Results for author: Samia Touileb

Found 24 papers, 13 papers with code

LTG-ST at NADI Shared Task 1: Arabic Dialect Identification using a Stacking Classifier

no code implementations COLING (WANLP) 2020 Samia Touileb

This paper presents our results for the Nuanced Arabic Dialect Identification (NADI) shared task of the Fifth Workshop for Arabic Natural Language Processing (WANLP 2020).

Dialect Identification regression

Using Gender- and Polarity-Informed Models to Investigate Bias

no code implementations ACL (GeBNLP) 2021 Samia Touileb, Lilja Øvrelid, Erik Velldal

More specifically, we add information about the gender of critics and book authors when classifying the polarity of book reviews, and the polarity of the reviews when classifying the genders of authors and critics.

Language Modelling

Occupational Biases in Norwegian and Multilingual Language Models

1 code implementation NAACL (GeBNLP) 2022 Samia Touileb, Lilja Øvrelid, Erik Velldal

In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models.

Descriptive

JSEEGraph: Joint Structured Event Extraction as Graph Parsing

1 code implementation 26 Jun 2023 Huiling You, Samia Touileb, Lilja Øvrelid

We propose a graph-based event extraction framework JSEEGraph that approaches the task of event extraction as general graph parsing in the tradition of Meaning Representation Parsing.

Event Argument Extraction Event Extraction

Learning Horn Envelopes via Queries from Large Language Models

no code implementations 20 May 2023 Sophie Blum, Raoul Koudijs, Ana Ozaki, Samia Touileb

We propose a new algorithm that aims at extracting the "tightest Horn approximation" of the target theory and that is guaranteed to terminate in exponential time (in the worst case) and in polynomial time if the target has polynomially many non-Horn examples.

NorBench -- A Benchmark for Norwegian Language Models

1 code implementation 6 May 2023 David Samuel, Andrey Kutuzov, Samia Touileb, Erik Velldal, Lilja Øvrelid, Egil Rønningstad, Elina Sigdel, Anna Palatkina

We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics.

Measuring Normative and Descriptive Biases in Language Models Using Census Data

no code implementations 12 Apr 2023 Samia Touileb, Lilja Øvrelid, Erik Velldal

We investigate in this paper how distributions of occupations with respect to gender are reflected in pre-trained language models.

Descriptive

Measuring Harmful Representations in Scandinavian Language Models

1 code implementation 21 Nov 2022 Samia Touileb, Debora Nozza

Scandinavian countries are perceived as role models when it comes to gender equality.

EventGraph: Event Extraction as Semantic Graph Parsing

1 code implementation 16 Oct 2022 Huiling You, David Samuel, Samia Touileb, Lilja Øvrelid

Event extraction therefore becomes a graph parsing problem, which provides the following advantages: 1) performing event detection and argument extraction jointly; 2) detecting and extracting multiple events from a piece of text; and 3) capturing the complicated interaction between event arguments and triggers.

Event Detection Event Extraction

Annotating Norwegian Language Varieties on Twitter for Part-of-Speech

no code implementations VarDial (COLING) 2022 Petter Mæhlum, Andre Kåsen, Samia Touileb, Jeremy Barnes

We show that models trained on Universal Dependency (UD) data perform worse when evaluated against this dataset, and that models trained on Bokmål generally perform better than those trained on Nynorsk.

POS

NorDial: A Preliminary Corpus of Written Norwegian Dialect Use

1 code implementation NoDaLiDa 2021 Jeremy Barnes, Petter Mæhlum, Samia Touileb

Norway has a large amount of dialectal variation, as well as a general tolerance to its use in the public sphere.

Identifying Sentiments in Algerian Code-switched User-generated Comments

no code implementations LREC 2020 Wafia Adouane, Samia Touileb, Jean-Philippe Bernardy

We present in this paper our work on Algerian language, an under-resourced North African colloquial Arabic variety, for which we built a comparably large corpus of more than 36,000 code-switched user-generated comments annotated for sentiments.

Sentiment Analysis

Automatic identification of unknown names with specific roles

no code implementations COLING 2018 Samia Touileb, Truls Pedersen, Helle Sjøvaag

Automatically identifying persons in a particular role within a large corpus can be a difficult task, especially if you don't know who you are actually looking for.

NoReC: The Norwegian Review Corpus

1 code implementation LREC 2018 Erik Velldal, Lilja Øvrelid, Eivind Alexander Bergem, Cathrine Stadsnes, Samia Touileb, Fredrik Jørgensen

As resources for sentiment analysis have so far been unavailable for Norwegian, NoReC represents a highly valuable and sought-after addition to Norwegian language technology.

Opinion Mining Sentiment Analysis
