Text Augmentation

34 papers with code • 0 benchmarks • 0 datasets

Latest papers with no code

Probabilistic Linguistic Knowledge and Token-level Text Augmentation

no code yet • 29 Jun 2023

This paper investigates the effectiveness of token-level text augmentation and the role of probabilistic linguistic knowledge within a linguistically-motivated evaluation context.

Text Generation with Speech Synthesis for ASR Data Augmentation

no code yet • 22 May 2023

In this work, we explore text augmentation for ASR using large-scale pre-trained neural networks, and systematically compare those to traditional text augmentation methods.

Boosting Event Extraction with Denoised Structure-to-Text Augmentation

no code yet • 16 May 2023

Event extraction aims to recognize pre-defined event triggers and arguments from texts, which suffer from the lack of high-quality annotations.

Shuffle & Divide: Contrastive Learning for Long Text

no code yet • 19 Apr 2023

We propose a self-supervised learning method for long text documents based on contrastive learning.

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

no code yet • 15 Dec 2022

Experiments on Librispeech and in-house data show relative WER reductions (WERRs) from 3% to 5% with a slight increase in model size and negligible extra token emission latency compared with fast-slow encoder based transducer.

Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

no code yet • 14 Oct 2022

We introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.

Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding

no code yet • 6 Sep 2022

One of the main challenges is to collect a sufficient amount of annotated data to train a model.

Data Augmentation for Low-Resource Quechua ASR Improvement

no code yet • 14 Jul 2022

In this paper we describe our data augmentation approach to improve the results of ASR models for low-resource and agglutinative languages.

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

no code yet • 7 Jan 2022

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language.

To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP

no code yet • CL (ACL) 2022

Although NLP has recently witnessed a load of textual augmentation techniques, the field still lacks a systematic performance analysis on a diverse set of languages and sequence tagging tasks.
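
Several entries above refer to token-level text augmentation. As a rough illustration only, the sketch below shows two commonly used token-level operations, random deletion and random swap. It is not taken from any of the listed papers; the function names (`random_deletion`, `random_swap`, `augment`), the whitespace tokenization, and the default parameters are all assumptions made for this example.

```python
import random
from typing import List, Optional


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each token independently with probability p, keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Never return an empty sentence: fall back to one randomly chosen token.
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Swap two randomly chosen distinct positions, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(sentence: str, p_delete: float = 0.1, n_swaps: int = 1,
            seed: Optional[int] = None) -> str:
    """Apply random swap, then random deletion, to a whitespace-tokenized sentence.

    Hypothetical helper: the order of operations and the defaults are
    illustrative choices, not a recommendation from any listed paper.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    tokens = random_swap(tokens, n_swaps, rng)
    tokens = random_deletion(tokens, p_delete, rng)
    return " ".join(tokens)


if __name__ == "__main__":
    text = "event extraction suffers from the lack of high-quality annotations"
    # Passing a seed makes the augmented output reproducible across runs.
    print(augment(text, p_delete=0.2, n_swaps=2, seed=0))
```

Passing a fixed `seed` makes the output reproducible, which helps when comparing augmentation settings. Whether such perturbations help a given task is an empirical question, as the comparative studies listed above suggest.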