Dataset Condensation

26 papers with code • 0 benchmarks • 0 datasets

Condense the full dataset into a tiny set of synthetic data.
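To make the task concrete, here is a minimal toy sketch of one common condensation objective, gradient matching: a tiny synthetic set is optimized so that the model gradient it induces matches the gradient computed on the full real dataset. Everything here (a linear model, MSE loss, random weight draws) is a simplifying assumption for illustration, not any specific paper's method.

```python
import numpy as np

def grad_linear_mse(X, y, w):
    """Gradient of the MSE loss 0.5/n * ||Xw - y||^2 with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def condense(X_real, y_real, n_syn=4, steps=2000, lr=0.01, seed=0):
    """Toy dataset condensation by gradient matching: learn a small
    synthetic set (X_syn, y_syn) whose induced model gradient matches
    the real-data gradient, averaged over random model weights."""
    rng = np.random.default_rng(seed)
    d = X_real.shape[1]
    X_syn = rng.normal(size=(n_syn, d))
    y_syn = rng.normal(size=n_syn)
    for _ in range(steps):
        w = rng.normal(size=d)                  # random model draw
        diff = grad_linear_mse(X_syn, y_syn, w) \
             - grad_linear_mse(X_real, y_real, w)
        r = X_syn @ w - y_syn                   # synthetic residuals
        # analytic gradients of 0.5*||diff||^2 w.r.t. the synthetic data
        g_X = (np.outer(r, diff) + np.outer(X_syn @ diff, w)) / n_syn
        g_y = -(X_syn @ diff) / n_syn
        X_syn -= lr * g_X
        y_syn -= lr * g_y
    return X_syn, y_syn
```

Real methods apply the same idea to deep networks with autodiff; the analytic gradients above exist only because the toy model is linear.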

Most implemented papers

You Only Condense Once: Two Rules for Pruning Condensed Datasets

he-y/you-only-condense-once NeurIPS 2023

Deploying condensed datasets on devices poses two significant challenges: 1) the varying computational resources available on devices call for a dataset size different from that of the pre-defined condensed dataset, and 2) the limited computational resources often preclude running additional condensation processes.
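The point of pruning a condensed dataset is to get a smaller set without re-running condensation. A minimal sketch of such a pruning step, loosely in the spirit of YOCO's two rules (rank examples by a per-example score and keep classes balanced) — the scoring function itself and the exact selection rule here are assumptions, not the paper's implementation:

```python
import numpy as np

def prune_condensed(scores, labels, keep):
    """Prune a condensed dataset to `keep` examples without re-condensing:
    within each class, keep the lowest-scoring ('easiest', by some assumed
    per-example difficulty score) examples, preserving class balance.
    Returns the sorted indices of the retained examples."""
    classes = np.unique(labels)
    per_class = keep // len(classes)
    kept = []
    for c in classes:
        idx = np.where(labels == c)[0]
        # order this class's examples by ascending score, keep the first few
        kept.extend(idx[np.argsort(scores[idx])][:per_class])
    return np.array(sorted(kept))
```

For example, with six condensed examples in two classes and `keep=4`, the function retains the two lowest-scoring examples of each class.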

Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching

shaoshitong/G_VBSM_Dataset_Condensation 29 Nov 2023

We call this perspective "generalized matching" and propose Generalized Various Backbone and Statistical Matching (G-VBSM) in this work, which aims to create a dense synthetic dataset that remains consistent with the complete dataset across various backbones, layers, and statistics.
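To illustrate the statistical-matching idea in miniature, the sketch below optimizes synthetic data so its mean feature statistics agree with the real data's under several random one-layer ReLU "backbones". This is a heavy simplification: G-VBSM matches richer statistics across layers of multiple trained backbones, whereas the backbones, the single statistic, and all dimensions here are illustrative assumptions.

```python
import numpy as np

def relu_feats(X, W):
    """Features of a toy one-layer ReLU 'backbone' with weights W."""
    return np.maximum(X @ W, 0.0)

def statistical_match(X_real, n_syn=6, n_backbones=5, steps=1500,
                      lr=0.05, seed=0):
    """Optimize synthetic data so its mean features match the real
    data's under several random ReLU backbones (a simplified stand-in
    for matching statistics across various backbones and layers)."""
    rng = np.random.default_rng(seed)
    d = X_real.shape[1]
    Ws = [rng.normal(size=(d, 8)) for _ in range(n_backbones)]
    targets = [relu_feats(X_real, W).mean(axis=0) for W in Ws]
    X_syn = rng.normal(size=(n_syn, d))
    for _ in range(steps):
        g = np.zeros_like(X_syn)
        for W, m_real in zip(Ws, targets):
            H = X_syn @ W
            diff = np.maximum(H, 0.0).mean(axis=0) - m_real
            # gradient of ||mean_syn - mean_real||^2 through the ReLU
            g += (2.0 / n_syn) * ((H > 0) * diff) @ W.T
        X_syn -= lr * g / n_backbones
    return X_syn
```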

Dataset Condensation Driven Machine Unlearning

algebraicdianuj/DC_U 31 Jan 2024

To achieve this goal, we propose new dataset condensation techniques and an innovative unlearning scheme that strikes a balance among privacy, utility, and efficiency in machine unlearning.
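One way condensation can drive unlearning is to condense only the retained data and retrain cheaply on the small synthetic set, so the forgotten examples never influence the new model. The skeleton below sketches that pipeline under stated assumptions: `condense_fn` and `train_fn` are hypothetical callables standing in for any condensation method and trainer, and this is not a claim about the paper's exact scheme.

```python
import numpy as np

def unlearn_by_condensation(condense_fn, X, y, forget_mask, train_fn):
    """Sketch: unlearn by (1) dropping the examples to forget,
    (2) condensing the retained data into a small synthetic set, and
    (3) retraining only on that synthetic set, which is cheap because
    the condensed set is tiny."""
    X_keep, y_keep = X[~forget_mask], y[~forget_mask]
    X_syn, y_syn = condense_fn(X_keep, y_keep)   # any condensation method
    return train_fn(X_syn, y_syn)                # cheap retraining
```

The efficiency argument is that step (3) scales with the condensed size rather than the full retained set, while privacy rests on the forgotten rows being excluded before condensation.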

Is Adversarial Training with Compressed Datasets Effective?

saintslab/pytoch 8 Feb 2024

This synthetic dataset retains the essential information of the original dataset, enabling models trained on it to achieve performance levels comparable to those trained on the full dataset.

Multisize Dataset Condensation

he-y/multisize-dataset-condensation 10 Mar 2024

These two challenges connect to the "subset degradation problem" in traditional dataset condensation: a subset from a larger condensed dataset is often unrepresentative compared to directly condensing the whole dataset to that smaller size.

Distilling Datasets Into Less Than One Image

AsafShul/PoDD 18 Mar 2024

Current methods frame this as maximizing the distilled classification accuracy for a budget of K distilled images per class, where K is a positive integer.
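The way a budget below one image per class becomes possible is by letting all classes share one distilled canvas ("poster") and reading each class off as an overlapping crop of it. The sketch below only illustrates that shared-canvas accounting; the horizontal crop layout and the fixed poster are assumptions, whereas the paper learns the poster and uses its own patch scheme.

```python
import numpy as np

def class_views(poster, n_classes, patch):
    """Slice one shared 'poster' into overlapping per-class crops.
    Because crops overlap, the total number of distilled pixels can
    be smaller than n_classes full images of size patch x patch."""
    H, W = poster.shape
    # evenly spaced horizontal offsets; crops overlap whenever the
    # poster is narrower than n_classes * patch
    offsets = np.linspace(0, W - patch, n_classes).astype(int)
    return [poster[:, o:o + patch] for o in offsets]
```

For instance, a 32x48 poster shared by 5 classes yields five 32x32 views from only 32*48 = 1536 pixels, versus 5 * 32*32 = 5120 pixels for one full image per class.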