Search Results for author: Thiago Castro Ferreira

Found 32 papers, 10 papers with code

DaMata: A Robot-Journalist Covering the Brazilian Amazon Deforestation

1 code implementation • INLG (ACL) 2020 • André Luiz Rosa Teixeira, João Campos, Rossana Cunha, Thiago Castro Ferreira, Adriana Pagano, Fabio Cozman

This demo paper introduces DaMata, a robot-journalist covering deforestation in the Brazilian Amazon.

Text Generation

aiXplain at Arabic Hate Speech 2022: An Ensemble Based Approach to Detecting Offensive Tweets

no code implementations • OSACT (LREC) 2022 • Salaheddin Alzubi, Thiago Castro Ferreira, Lucas Pavanelli, Mohamed Al-Badrashiny

Abusive speech on online platforms has a detrimental effect on users’ mental health.

Generating Questions from Wikidata Triples

no code implementations • LREC 2022 • Kelvin Han, Thiago Castro Ferreira, Claire Gardent

Question generation from knowledge bases (or knowledge base question generation, KBQG) is the task of generating questions from structured database information, typically in the form of triples representing facts.

Knowledge Base Question Answering Question Generation +1

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results

1 code implementation • MSR (COLING) 2020 • Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner

As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.

The 2020 Bilingual, Bi-Directional WebNLG+ Shared Task: Overview and Evaluation Results (WebNLG+ 2020)

no code implementations • ACL (WebNLG, INLG) 2020 • Thiago Castro Ferreira, Claire Gardent, Nikolai Ilinykh, Chris van der Lee, Simon Mille, Diego Moussallem, Anastasia Shimorina

WebNLG+ offers two challenges: (i) mapping sets of RDF triples to English or Russian text (generation) and (ii) converting English or Russian text to sets of RDF triples (semantic parsing).

Semantic Parsing Text Generation

Anotação de textos não canônicos: um estudo exploratorio de Grande sertão: veredas pelas dependências universais

no code implementations • udfestbr 2022 • Andre V. L. Coneglian, Ana Luisa A. R. Guimarães, Thiago Castro Ferreira, Adriana S. Pagano

Another PASS: A Reproduction Study of the Human Evaluation of a Football Report Generation System

no code implementations • INLG (ACL) 2021 • Simon Mille, Thiago Castro Ferreira, Anya Belz, Brian Davis

Clarity had a higher degree of reproducibility than Fluency, as measured by the coefficient of variation.

MTLens: Machine Translation Output Debugging

no code implementations • LREC 2022 • Shreyas Sharma, Kareem Darwish, Lucas Pavanelli, Thiago Castro Ferreira, Mohamed Al-Badrashiny, Kamer Ali Yuksel, Hassan Sawaf

The performance of Machine Translation (MT) systems varies significantly with inputs of diverging features such as topics, genres, and surface properties.

Benchmarking Machine Translation +2

Enriching the E2E dataset

1 code implementation • INLG (ACL) 2021 • Thiago Castro Ferreira, Helena Vaz, Brian Davis, Adriana Pagano

This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG.

Referring Expression Referring expression generation

Evaluating Recognizing Question Entailment Methods for a Portuguese Community Question-Answering System about Diabetes Mellitus

no code implementations • RANLP 2021 • Thiago Castro Ferreira, João Victor de Pinho Costa, Isabela Rigotto, Vitoria Portella, Gabriel Frota, Ana Luisa A. R. Guimarães, Adalberto Penna, Isabela Lee, Tayane A. Soares, Sophia Rolim, Rossana Cunha, Celso França, Ariel Santos, Rivaney F. Oliveira, Abisague Langbehn, Daniel Hasan Dalip, Marcos André Gonçalves, Rodrigo Bastos Fóscolo, Adriana Pagano

This study describes the development of a Portuguese Community-Question Answering benchmark in the domain of Diabetes Mellitus using a Recognizing Question Entailment (RQE) approach.

Community Question Answering Information Retrieval +1

Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles

no code implementations • RANLP 2021 • Felipe Araújo de Britto, Thiago Castro Ferreira, Leonardo Pereira Nunes, Fernando Silva Parreiras

Written communication is of utmost importance to the progress of scientific research.

BIG-bench Machine Learning regression

A Systematic Review of Data-to-Text NLG

no code implementations • 13 Feb 2024 • Chinonso Cynthia Osuji, Thiago Castro Ferreira, Brian Davis

Relevant literature in this field on datasets, evaluation metrics, application areas, multilingualism, language models, and hallucination mitigation methods is reviewed.

Data-to-Text Generation Hallucination +1

Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric

1 code implementation • 20 Jan 2024 • Golara Javadi, Kamer Ali Yuksel, Yunsu Kim, Thiago Castro Ferreira, Mohamed Al-Badrashiny

The findings suggest that NoRefER is not merely a tool for error detection but also a comprehensive framework for enhancing ASR systems' transparency, efficiency, and effectiveness.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

no code implementations • 14 Jul 2022 • Chris van der Lee, Thiago Castro Ferreira, Chris Emmery, Travis Wiltshire, Emiel Krahmer

In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension.

Data Augmentation Data-to-Text Generation +2

Building The First English-Brazilian Portuguese Corpus for Automatic Post-Editing

no code implementations • COLING 2020 • Felipe Almeida Costa, Thiago Castro Ferreira, Adriana Pagano, Wagner Meira

This paper introduces the first corpus for Automatic Post-Editing of English and a low-resource language, Brazilian Portuguese.

Automatic Post-Editing Translation

Referring to what you know and do not know: Making Referring Expression Generation Models Generalize To Unseen Entities

no code implementations • COLING 2020 • Rossana Cunha, Thiago Castro Ferreira, Adriana Pagano, Fabio Alves

Data-to-text Natural Language Generation (NLG) is the computational process of generating natural language in the form of text or voice from non-linguistic data.

Referring Expression Referring expression generation +1

NABU - Multilingual Graph-based Neural RDF Verbalizer

no code implementations • 16 Sep 2020 • Diego Moussallem, Dwaraknath Gnaneshwar, Thiago Castro Ferreira, Axel-Cyrille Ngonga Ngomo

The RDF-to-text task has recently gained substantial attention due to continuous growth of Linked Data.

Benchmarking Graph Attention +1

Surface Realization Shared Task 2019 (MSR19): The Team 6 Approach

no code implementations • WS 2019 • Thiago Castro Ferreira, Emiel Krahmer

This study describes the approach developed by the Tilburg University team to the shallow track of the Multilingual Surface Realization Shared Task 2019 (SR'19) (Mille et al., 2019).

Machine Translation Translation

Question Similarity in Community Question Answering: A Systematic Exploration of Preprocessing Methods and Models

1 code implementation • RANLP 2019 • Florian Kunneman, Thiago Castro Ferreira, Emiel Krahmer, Antal Van den Bosch

Community Question Answering forums are popular among Internet users, and a basic problem they encounter is trying to find out if their question has already been posed before.

Community Question Answering Question Similarity +2

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures

1 code implementation • IJCNLP 2019 • Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, Emiel Krahmer

In contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with much less explicit intermediate representations in-between.

Ranked #8 on Data-to-Text Generation on WebNLG Full

Data-to-Text Generation

Enriching the WebNLG corpus

1 code implementation • WS 2018 • Thiago Castro Ferreira, Diego Moussallem, Emiel Krahmer, Sander Wubben

This paper describes the enrichment of WebNLG corpus (Gardent et al., 2017a, b), with the aim to further extend its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation.

Machine Translation Referring Expression +3

Surface Realization Shared Task 2018 (SR18): The Tilburg University Approach

1 code implementation • WS 2018 • Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer

This study describes the approach developed by the Tilburg University team to the shallow task of the Multilingual Surface Realization Shared Task 2018 (SR18).

Machine Translation Translation

NeuralREG: An end-to-end approach to referring expression generation

1 code implementation • ACL 2018 • Thiago Castro Ferreira, Diego Moussallem, Ákos Kádár, Sander Wubben, Emiel Krahmer

Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function.

Referring Expression Referring expression generation

RDF2PT: Generating Brazilian Portuguese Texts from RDF Data

1 code implementation • LREC 2018 • Diego Moussallem, Thiago Castro Ferreira, Marcos Zampieri, Maria Claudia Cavalcanti, Geraldo Xexéo, Mariana Neves, Axel-Cyrille Ngonga Ngomo

The generation of natural language from Resource Description Framework (RDF) data has recently gained significant attention due to the continuous growth of Linked Data.

Improving the generation of personalised descriptions

no code implementations • WS 2017 • Thiago Castro Ferreira, Ivandré Paraboni

Referring expression generation (REG) models that use speaker-dependent information require a considerable amount of training data produced by every individual speaker, or may otherwise perform poorly.

Referring Expression Referring expression generation +1

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation

no code implementations • WS 2017 • Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).

AMR-to-Text Generation Machine Translation +2

Trainable Referring Expression Generation using Overspecification Preferences

no code implementations • 12 Apr 2017 • Thiago Castro Ferreira, Ivandre Paraboni

Referring Expression Referring expression generation

Generating flexible proper name references in text: Data, models and evaluation

no code implementations • EACL 2017 • Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben

The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts.

Text Generation

Towards proper name generation: a corpus analysis

no code implementations • WS 2016 • Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer

Text Generation

Task demands and individual variation in referring expressions

no code implementations • WS 2016 • Adriana Baltaretu, Thiago Castro Ferreira

Text Generation

Towards more variation in text generation: Developing and evaluating variation models for choice of referential form

no code implementations • ACL 2016 • Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben

Text Generation

Individual Variation in the Choice of Referential Form

no code implementations • NAACL 2016 • Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben

Text Generation
