Data Augmentation

2525 papers with code • 2 benchmarks • 63 datasets

Data augmentation refers to techniques that expand a training set by adding modified copies of existing examples (or newly synthesized ones) to the original data. Beyond simply growing the dataset, augmentation increases its diversity; during training it acts as a regularizer and helps prevent overfitting.

Data augmentation has proven useful in domains such as computer vision and NLP. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, typical techniques include random token swapping, deletion, and insertion, among others.
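
As a rough illustration of both kinds of transformation, the sketch below builds the image augmentations named above with the Albumentations library (credited below for the task image) and implements the token-level text operations in plain Python. The crop size, probabilities, and helper names are illustrative choices, not taken from any specific paper.

```python
import random

import albumentations as A

# Image-space augmentations mentioned above: cropping, flipping, rotation.
# The crop size and probabilities are illustrative, not prescribed values.
image_transform = A.Compose([
    A.RandomCrop(width=224, height=224),
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=30, p=0.5),
])
# augmented = image_transform(image=image)["image"]  # image: H x W x C numpy array


# Token-level text augmentations: random swap, deletion, and insertion.
def random_swap(tokens, n_swaps=1):
    tokens = tokens[:]
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p_drop=0.1):
    kept = [t for t in tokens if random.random() > p_drop]
    return kept if kept else [random.choice(tokens)]  # never return an empty sentence


def random_insertion(tokens, n_inserts=1):
    tokens = tokens[:]
    if not tokens:
        return tokens
    for _ in range(n_inserts):
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(tokens))
    return tokens
```

In practice the image transform is usually applied on the fly inside the data loader, so each epoch sees a differently perturbed copy of every training image.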

(Image credit: Albumentations)

Latest papers with no code

Hide and Seek: How Does Watermarking Impact Face Recognition?

no code yet • 29 Apr 2024

The recent progress in generative models has revolutionized the synthesis of highly realistic images, including face images.

Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin

no code yet • 28 Apr 2024

We test the effect of this data augmentation on two critical NLP tasks: machine translation and sentiment analysis.

Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance

no code yet • 27 Apr 2024

Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data?

CSCO: Connectivity Search of Convolutional Operators

no code yet • 26 Apr 2024

In this paper, we propose CSCO, a novel paradigm that fabricates effective connectivity of convolutional operators with minimal utilization of existing design motifs and further utilizes the discovered wiring to construct high-performing ConvNets.

Empowering Large Language Models for Textual Data Augmentation

no code yet • 26 Apr 2024

With their ability to understand and execute natural language instructions, large language models (LLMs) can potentially act as a powerful tool for textual data augmentation.
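
As a generic illustration of this idea (not the method proposed in the paper above), the sketch below asks a chat model for label-preserving paraphrases of a training sentence. It assumes the `openai` Python client (v1+), an API key in the environment, and an arbitrarily chosen model name; the prompt and helper are hypothetical.

```python
from openai import OpenAI  # assumes the openai v1+ client with OPENAI_API_KEY set

client = OpenAI()


def paraphrase(text: str, n: int = 3, model: str = "gpt-4o-mini") -> list[str]:
    """Ask the model for n label-preserving rewrites of a training example."""
    prompt = (
        f"Rewrite the following sentence {n} times, preserving its meaning and "
        f"sentiment. Return one rewrite per line.\n\nSentence: {text}"
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.9,  # higher temperature -> more varied paraphrases
    )
    lines = resp.choices[0].message.content.strip().splitlines()
    return [line.strip() for line in lines if line.strip()][:n]


# Each paraphrase keeps the original label, growing the labelled training set:
# augmented = [(p, label) for p in paraphrase(sentence)] + [(sentence, label)]
```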

Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution

no code yet • 25 Apr 2024

Moreover, our experiments, ranging from 2-way to 5-way classifications with up to 10 examples, showed a growing success rate for traditional transfer learning methods as the number of examples increased.

SynCellFactory: Generative Data Augmentation for Cell Tracking

no code yet • 25 Apr 2024

Cell tracking remains a pivotal yet challenging task in biomedical research.

Boosting Model Resilience via Implicit Adversarial Data Augmentation

no code yet • 25 Apr 2024

This insight leads us to develop a meta-learning-based framework for optimizing classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process.

One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns

no code yet • 25 Apr 2024

Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit "natural" random variation.

DE-CGAN: Boosting rTMS Treatment Prediction with Diversity Enhancing Conditional Generative Adversarial Networks

no code yet • 25 Apr 2024

This work shows that increasing the diversity of a training dataset can improve classification model performance.