Synthetic Data Generation
180 papers with code • 1 benchmark • 5 datasets
The generation of tabular data by any means possible.
Libraries
Use these libraries to find Synthetic Data Generation models and implementations
Datasets
Latest papers
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
The rapid rise to prominence of these models and these unique challenges has had immediate adverse impacts on open science and on the reproducibility of work that uses them.
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes
This technical report details our work towards building an enhanced audio-visual sound event localization and detection (SELD) network.
Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials
Identifying such heterogeneous treatment effects is key for precision medicine and many post-hoc analysis methods have been developed for that purpose.
Scaling While Privacy Preserving: A Comprehensive Synthetic Tabular Data Generation and Evaluation in Learning Analytics
To address these gaps, we propose a comprehensive evaluation of synthetic data, which encompasses three dimensions of synthetic data quality, namely resemblance, utility, and privacy.
RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation
Significant research effort has been devoted in recent years to developing personalized pricing, promotions, and product recommendation algorithms that can leverage rich customer data to learn and earn.
A Comprehensive End-to-End Computer Vision Framework for Restoration and Recognition of Low-Quality Engineering Drawings
Existing computer vision approaches for digitizing engineering drawings typically assume the input drawings have high quality.
View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for Procedural Synthetic Data
Procedural synthetic data generation has received increasing attention in computer vision.
Combining propensity score methods with variational autoencoders for generating synthetic data in presence of latent sub-groups
The sources of such heterogeneity might be known, e.g., as indicated by sub-group labels, or might be unknown and thus reflected only in properties of distributions, such as bimodality or skewness.
D3A-TS: Denoising-Driven Data Augmentation in Time Series
While in fields such as Computer Vision or Natural Language Processing synthetic data generation has been extensively explored with promising results, in other domains such as time series it has received less attention.
Privacy-Preserving Data Sharing in Agriculture: Enforcing Policy Rules for Secure and Confidential Data Synthesis
While Big Data can provide the farming community with valuable insights and improve efficiency, there is significant concern regarding the security of this data as well as the privacy of the participants.