Data Augmentation
2525 papers with code • 2 benchmarks • 63 datasets
Data augmentation refers to techniques that expand a dataset by applying modifications to existing examples. It not only grows the dataset but also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps avoid overfitting.
Data augmentation techniques have proven useful in domains such as computer vision and NLP. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include word swapping, random deletion, and random insertion, among others.
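The transformations above can be sketched in a few lines of plain NumPy and the standard library. This is a minimal illustration, not a production pipeline (libraries like Albumentations cover the image side far more thoroughly); the function names and default parameters here are assumptions chosen for the example.

```python
import random
import numpy as np

def augment_image(img: np.ndarray, crop: int = 24, rng=None) -> np.ndarray:
    """Random crop followed by a random horizontal flip (HWC layout assumed)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - crop + 1))
    left = int(rng.integers(0, w - crop + 1))
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # flip along the width axis
    return patch

def augment_text(tokens, p_delete=0.1, n_swaps=1, rng=None):
    """Random deletion and random swap over a token list."""
    rng = rng or random.Random()
    # Delete each token with probability p_delete; keep at least one token.
    kept = [t for t in tokens if rng.random() > p_delete] or tokens[:1]
    # Swap n_swaps random pairs of positions.
    for _ in range(n_swaps):
        if len(kept) > 1:
            i, j = rng.sample(range(len(kept)), 2)
            kept[i], kept[j] = kept[j], kept[i]
    return kept
```

In practice such transforms are applied on the fly during training, so each epoch sees a slightly different version of every example.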
Further readings:
- A Survey of Data Augmentation Approaches for NLP
- A survey on Image Data Augmentation for Deep Learning
(Image credit: Albumentations)
Libraries
Use these libraries to find Data Augmentation models and implementations.

Latest papers with no code
Hide and Seek: How Does Watermarking Impact Face Recognition?
The recent progress in generative models has revolutionized the synthesis of highly realistic images, including face images.
Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin
We test the effect of this data augmentation on two critical NLP tasks: machine translation and sentiment analysis.
Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance
Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data?
CSCO: Connectivity Search of Convolutional Operators
In this paper, we propose CSCO, a novel paradigm that fabricates effective connectivity of convolutional operators with minimal utilization of existing design motifs and further utilizes the discovered wiring to construct high-performing ConvNets.
Empowering Large Language Models for Textual Data Augmentation
With the capabilities of understanding and executing natural language instructions, Large language models (LLMs) can potentially act as a powerful tool for textual data augmentation.
Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution
Moreover, our experiments, ranging from 2-way to 5-way classifications with up to 10 examples, showed a growing success rate for traditional transfer learning methods as the number of examples increased.
SynCellFactory: Generative Data Augmentation for Cell Tracking
Cell tracking remains a pivotal yet challenging task in biomedical research.
Boosting Model Resilience via Implicit Adversarial Data Augmentation
This insight leads us to develop a meta-learning-based framework for optimizing classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process.
One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns
Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit "natural" random variation.
DE-CGAN: Boosting rTMS Treatment Prediction with Diversity Enhancing Conditional Generative Adversarial Networks
This work shows that increasing the diversity of a training dataset can improve classification model performance.