Text Augmentation
33 papers with code • 0 benchmarks • 0 datasets
Latest papers
PairAug: What Can Augmented Image-Text Pairs Do for Radiology?
Acknowledging this limitation, our objective is to devise a framework capable of concurrently augmenting medical image and text data.
EDDA: A Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection
To address these issues, we propose an encoder-decoder data augmentation (EDDA) framework.
A Survey on Data Augmentation in Large Model Era
Leveraging large models, these data augmentation techniques have outperformed traditional approaches.
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
The latest generative large language models (LLMs) have found their application in data augmentation tasks, where small numbers of text samples are LLM-paraphrased and then used to fine-tune downstream models.
From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis
In the era of artificial intelligence, data is gold but costly to annotate.
Teaching Specific Scientific Knowledge into Large Language Models through Additional Training
Through additional training, we explore embedding specialized scientific knowledge into the Llama 2 Large Language Model (LLM).
COVID-19 Vaccine Misinformation in Middle Income Countries
This paper introduces a multilingual dataset of COVID-19 vaccine misinformation, consisting of annotated tweets from three middle-income countries: Brazil, Indonesia, and Nigeria.
Pretraining Language Models with Text-Attributed Heterogeneous Graphs
In many real-world scenarios (e.g., academic networks, social platforms), different types of entities are not only associated with texts but also connected by various relationships, which can be abstracted as Text-Attributed Heterogeneous Graphs (TAHGs).
Distributional Data Augmentation Methods for Low Resource Language
One of the current state-of-the-art text augmentation techniques is easy data augmentation (EDA), which augments the training data by injecting and replacing synonyms and randomly permuting sentences.
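The EDA-style operations named above (replacing words with synonyms, injecting synonyms, and randomly permuting sentences) can be illustrated with a minimal sketch. It is not the paper's implementation: the synonym table `SYNONYMS` is a hypothetical toy stand-in for a real thesaurus, and the function names, the parameter `n`, and the naive split on periods are all assumptions made for illustration.

```python
import random

# Hypothetical toy synonym table (an assumption for this sketch);
# a real setup would draw on a thesaurus resource instead.
SYNONYMS = {
    "good": ["great", "fine"],
    "movie": ["film"],
    "quick": ["fast", "rapid"],
}


def replace_synonyms(words, n, rng):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def insert_synonyms(words, n, rng):
    """Insert n synonyms of randomly chosen known words at random positions."""
    out = list(words)
    known = [w for w in words if w in SYNONYMS]
    if not known:
        return out  # nothing to draw synonyms from
    for _ in range(n):
        source = rng.choice(known)
        position = rng.randint(0, len(out))  # len(out) means "append"
        out.insert(position, rng.choice(SYNONYMS[source]))
    return out


def permute_sentences(sentences, rng):
    """Return the sentences in a random order."""
    out = list(sentences)
    rng.shuffle(out)
    return out


def augment(text, n=1, seed=0):
    """Apply replacement and insertion per sentence, then shuffle sentences."""
    rng = random.Random(seed)
    # Naive sentence split on periods; enough for a sketch.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    augmented = []
    for sentence in sentences:
        words = sentence.split()
        words = replace_synonyms(words, n, rng)
        words = insert_synonyms(words, n, rng)
        augmented.append(" ".join(words))
    return ". ".join(permute_sentences(augmented, rng)) + "."
```

Calling `augment("a good movie. a quick plot.", n=1, seed=0)` returns one perturbed variant of the input. Varying `seed` yields different variants, which can be added to the training set alongside the original sample.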
Story Visualization by Online Text Augmentation with Context Memory
Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not only rendering visual details from the text descriptions but also encoding a long-term context across multiple sentences.