Text Augmentation

34 papers with code • 0 benchmarks • 0 datasets

You can read these blog posts to get an overview of the approaches.


Most implemented papers

Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning

thunlp/MixADA 31 Dec 2020

In this work, we propose a simple and effective method to cover a much larger proportion of the attack search space, called Adversarial and Mixup Data Augmentation (AMDA).
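The mixup half of this idea can be illustrated generically. The following is a minimal sketch of standard mixup interpolation, not the AMDA implementation from thunlp/MixADA: the function name `mixup`, the `alpha` default, and the toy embeddings and labels are all assumptions made for illustration.

```python
import random


def mixup(x_a, x_b, y_a, y_b, alpha=0.4, rng=None):
    """Linearly interpolate two feature vectors and their one-hot labels.

    Hypothetical helper for illustration; alpha=0.4 is an assumed default.
    """
    rng = rng or random.Random()
    # Mixing coefficient drawn from Beta(alpha, alpha), always within [0, 1].
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Made-up sentence embeddings and one-hot labels for two training examples.
emb_a, label_a = [0.2, 0.8, -0.1], [1.0, 0.0]
emb_b, label_b = [0.6, -0.4, 0.3], [0.0, 1.0]

x_mix, y_mix, lam = mixup(emb_a, emb_b, label_a, label_b, rng=random.Random(0))
print(lam, x_mix, y_mix)
```

The mixed label is a soft label whose entries sum to 1, so it can be used with a cross-entropy loss that accepts probability targets. In an adversarial setting, one of the two inputs would presumably be an adversarial example, but how the paper pairs examples is not stated in the snippet above.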

GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation

naver-ai/hypermix Findings (EMNLP) 2021

Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts.

Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

mifei/st-tod EMNLP 2021

In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems.

Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

jaaack-wang/linguistic-knowledge-in-DA-for-NLP 29 Nov 2021

To investigate the role of linguistic knowledge in data augmentation (DA) for Natural Language Processing (NLP), we designed two adapted DA programs and applied them to LCQMC (a Large-scale Chinese Question Matching Corpus) for a binary Chinese question matching classification task.

UCD-CS at TREC 2021 Incident Streams Track

wangcongcong123/crisis-mtl 7 Dec 2021

In recent years, the task of mining important information from social media posts during crises has become a focus of research for the purposes of assisting emergency response (ES).

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

snap-research/mmvid CVPR 2022

In addition, our model can extract visual information as suggested by the text prompt, e.g., "an object in image one is moving northeast", and generate corresponding videos.

BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset

faiyazkhan11/ban-cap LREC 2022

As computers have become efficient at understanding visual information and transforming it into a written representation, research interest in tasks like automatic image captioning has seen a significant leap over the last few years.

Selective Text Augmentation with Word Roles for Low-Resource Text Classification

beyondguo/STA 4 Sep 2022

Different words may play different roles in text classification, which inspires us to strategically select the proper roles for text augmentation.

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification

declare-lab/doublemix COLING 2022

This paper proposes a simple yet effective interpolation-based data augmentation approach termed DoubleMix, to improve the robustness of models in text classification.