Data Augmentation
2525 papers with code • 2 benchmarks • 63 datasets
Data augmentation refers to techniques that expand a dataset by applying modifications to existing examples. It not only grows the dataset but also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps avoid overfitting.
Data augmentation techniques have proven useful in domains such as computer vision and NLP. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include word swapping, random deletion, and random insertion, among others.
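The transformations above can be sketched in a few lines of plain NumPy and the standard library. This is a minimal illustration, not a production pipeline (libraries like Albumentations cover the image side far more thoroughly); the function names and default parameters here are assumptions chosen for the example.

```python
import random
import numpy as np

def augment_image(img: np.ndarray, crop: int = 24, rng=None) -> np.ndarray:
    """Random crop followed by a random horizontal flip (HWC layout assumed)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - crop + 1))
    left = int(rng.integers(0, w - crop + 1))
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # flip along the width axis
    return patch

def augment_text(tokens, p_delete=0.1, n_swaps=1, rng=None):
    """Random deletion and random swap over a token list."""
    rng = rng or random.Random()
    # Delete each token with probability p_delete; keep at least one token.
    kept = [t for t in tokens if rng.random() > p_delete] or tokens[:1]
    # Swap n_swaps random pairs of positions.
    for _ in range(n_swaps):
        if len(kept) > 1:
            i, j = rng.sample(range(len(kept)), 2)
            kept[i], kept[j] = kept[j], kept[i]
    return kept
```

In practice such transforms are applied on the fly during training, so each epoch sees a slightly different version of every example.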
Further readings:
- A Survey of Data Augmentation Approaches for NLP
- A survey on Image Data Augmentation for Deep Learning
(Image credit: Albumentations)
Libraries
Use these libraries to find Data Augmentation models and implementations.

Latest papers with no code
Hide and Seek: How Does Watermarking Impact Face Recognition?
The recent progress in generative models has revolutionized the synthesis of highly realistic images, including face images.
Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin
We test the effect of this data augmentation on two critical NLP tasks: machine translation and sentiment analysis.
Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance
Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data?
CSCO: Connectivity Search of Convolutional Operators
In this paper, we propose CSCO, a novel paradigm that fabricates effective connectivity of convolutional operators with minimal utilization of existing design motifs and further utilizes the discovered wiring to construct high-performing ConvNets.
Empowering Large Language Models for Textual Data Augmentation
With the capabilities of understanding and executing natural language instructions, Large language models (LLMs) can potentially act as a powerful tool for textual data augmentation.
Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution
Moreover, our experiments, ranging from 2-way to 5-way classifications with up to 10 examples, showed a growing success rate for traditional transfer learning methods as the number of examples increased.
SynCellFactory: Generative Data Augmentation for Cell Tracking
Cell tracking remains a pivotal yet challenging task in biomedical research.
Boosting Model Resilience via Implicit Adversarial Data Augmentation
This insight leads us to develop a meta-learning-based framework for optimizing classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process.
One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns
Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit "natural" random variation.
DE-CGAN: Boosting rTMS Treatment Prediction with Diversity Enhancing Conditional Generative Adversarial Networks
This work shows that increasing the diversity of a training dataset can improve classification model performance.