Synthetic Data Generation
180 papers with code • 1 benchmark • 5 datasets
The generation of tabular data by any means possible.
Most implemented papers
NViSII: A Scriptable Tool for Photorealistic Image Generation
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning.
Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method
First, unlike traditional studies that treat wind power curve (WPC) modeling as a curve-fitting problem, this paper reformulates WPC modeling from a machine vision perspective.
TimeVAE: A Variational Auto-Encoder for Multivariate Time Series Generation
Such interpretability can be highly advantageous in applications requiring transparency of model outputs or where users desire to inject prior knowledge of time-series patterns into the generative model.
Noise-Aware Statistical Inference with Differentially Private Synthetic Data
For example, confidence intervals become too narrow, which we demonstrate with a simple experiment.
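The narrowing effect can be illustrated with a small simulation. The sketch below is not the paper's experiment. It assumes a plain Gaussian model with no differential privacy noise at all, and the function names (`ci_covers`, `coverage_experiment`) are hypothetical. Real data is drawn from N(0, 1), a Gaussian is fitted to it, synthetic data is sampled from the fit, and a naive 95% confidence interval for the mean is computed on each dataset as if it were real. Because the synthetic sample carries both the fitting error and its own sampling error, the naive interval built from it covers the true mean noticeably less often than 95% of the time.

```python
import random
import statistics


def ci_covers(sample, true_mean, z=1.96):
    """Check whether the naive 95% normal-approximation CI for the mean covers true_mean."""
    mean = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / len(sample) ** 0.5
    return mean - half_width <= true_mean <= mean + half_width


def coverage_experiment(n=100, trials=2000, seed=0):
    """Estimate CI coverage on real data and on synthetic data sampled from a fitted Gaussian."""
    rng = random.Random(seed)
    real_hits = 0
    synth_hits = 0
    for _ in range(trials):
        # "Real" data: n draws from N(0, 1), so the true mean is 0.
        real = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Fit a Gaussian to the real data (assumption: no privacy noise is added).
        mu_hat = statistics.fmean(real)
        sigma_hat = statistics.stdev(real)
        # Synthetic data: n draws from the fitted model.
        synth = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        real_hits += ci_covers(real, 0.0)
        synth_hits += ci_covers(synth, 0.0)
    return real_hits / trials, synth_hits / trials


real_cov, synth_cov = coverage_experiment()
print(f"coverage on real data:      {real_cov:.3f}")
print(f"coverage on synthetic data: {synth_cov:.3f}")
```

In this simplified setting the synthetic-data estimate of the mean has roughly twice the variance the naive interval assumes, so coverage falls well short of the nominal level. Adding differential privacy noise to the fitting step would be expected to widen the gap further.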
dpart: Differentially Private Autoregressive Tabular, a General Framework for Synthetic Data Generation
We propose a general, flexible, and scalable framework dpart, an open source Python library for differentially private synthetic data generation.
BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion
BEDLAM is useful for a variety of tasks, and all images, ground truth bodies, 3D clothing, support code, and more are available for research purposes.
AnthroNet: Conditional Generation of Humans via Anthropometrics
We present a novel human body model formulated by an extensive set of anthropometric measurements, which is capable of generating a wide range of human body shapes and poses.
Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark
In an empirical study, we evaluate the performance of five state-of-the-art models for tabular data generation on eleven distinct tabular datasets.
Relation Extraction in underexplored biomedical domains: A diversity-optimised sampling and synthetic data generation approach
In addition to their evaluation in few-shot settings, we explore the potential of open Large Language Models (Vicuna-13B) as synthetic data generators and propose a new workflow for this purpose.
A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation
The models are placed in physically realistic poses with respect to their environment to generate a labeled synthetic dataset.