Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data Summarization is a central problem in the area of machine learning, where we want to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

Benchmarks

These leaderboards are used to track progress in Data Summarization.

No evaluation results yet.

Libraries

Use these libraries to find Data Summarization models and implementations.

MikeJaredS/hermiter

2 papers

Datasets

Latest papers


Streaming Submodular Maximization under a $k$-Set System Constraint

ehsankazemi/streamingkextendible • 9 Feb 2020

In this paper, we propose a novel framework that converts streaming algorithms for monotone submodular maximization into streaming algorithms for non-monotone submodular maximization.


Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification?

easeml/datascope • CVPR 2021

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaption.

17 Nov 2019


Soft-Label Dataset Distillation and Text Dataset Distillation

Guang000/Awesome-Dataset-Distillation • 6 Oct 2019

We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a 'soft' label (a distribution of labels).

1,185


Fast and Accurate Least-Mean-Squares Solvers

ibramjub/Fast-and-Accurate-Least-Mean-Squares-Solvers • NeurIPS 2019

Least-mean squares (LMS) solvers such as Linear / Ridge / Lasso-Regression, SVD and Elastic-Net not only solve fundamental machine learning problems, but are also the building blocks in a variety of other methods, such as decision trees and matrix factorizations.

11 Jun 2019


apricot: Submodular selection for data summarization in Python

jmschrei/apricot • 8 Jun 2019

This paper presents an explanation of submodular selection, an overview of the features in apricot, and an application to several data sets.
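As a rough illustration of what submodular selection means in practice, the sketch below runs the standard greedy algorithm on a facility-location objective in plain Python. It does not use apricot's API; the function name `facility_location_greedy`, the similarity matrix, and the toy points are all assumptions made for this example.

```python
def facility_location_greedy(similarity, k):
    """Greedily pick k items that maximize sum_i max_{j in S} similarity[i][j].

    Assumes `similarity` is a square matrix of non-negative numbers.
    """
    n = len(similarity)
    selected = []
    best = [0] * n  # best[i]: similarity of item i to its closest selected item

    for _ in range(min(k, n)):
        def gain(j):
            # Marginal gain of adding item j to the current selection.
            return sum(max(similarity[i][j] - best[i], 0) for i in range(n))

        candidates = [j for j in range(n) if j not in selected]
        j_star = max(candidates, key=gain)
        selected.append(j_star)
        best = [max(best[i], similarity[i][j_star]) for i in range(n)]

    return selected, sum(best)


# Hypothetical toy data: two groups of points on a line.
xs = [0, 1, 2, 10, 11, 12, 13]
# Similarity = largest possible distance minus actual distance (non-negative).
similarity = [[13 - abs(a - b) for b in xs] for a in xs]

selected, value = facility_location_greedy(similarity, k=2)
print(selected, value)  # [3, 1] 83
```

On this toy input the greedy pass first picks the point 10 (index 3), then the point 1 (index 1), so each group ends up with one representative. For monotone submodular objectives under a cardinality constraint, this greedy scheme is known to achieve a (1 - 1/e) approximation.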

490


Fair k-Center Clustering for Data Summarization

matthklein/fair_k_center_clustering • 24 Jan 2019

In data summarization we want to choose $k$ prototypes in order to summarize a data set.
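To make the idea of choosing $k$ prototypes concrete, here is a minimal sketch of the classic greedy farthest-first traversal for the unconstrained k-center problem. It is not the fair variant studied in the paper, which additionally constrains how many prototypes come from each group; the function name `greedy_k_center` and the sample points are assumptions for illustration.

```python
import math


def greedy_k_center(points, k, first=0):
    """Farthest-first traversal: return k prototype indices and the covering radius."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [first]
    # dist_to_center[i]: distance from point i to its nearest chosen prototype
    dist_to_center = [math.dist(p, points[first]) for p in points]

    while len(centers) < k:
        # The next prototype is the point farthest from all current prototypes.
        nxt = max(range(len(points)), key=dist_to_center.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            dist_to_center[i] = min(dist_to_center[i], math.dist(p, points[nxt]))

    return centers, max(dist_to_center)


# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
centers, radius = greedy_k_center(points, k=3)
print(centers, radius)
```

With these points the traversal selects indices 0, 4, and 2, leaving every point within roughly 0.1 of a prototype. Farthest-first traversal is a well-known 2-approximation for unconstrained k-center, which is why it is a common baseline before fairness constraints are added.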


Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision

zaeemzadeh/IPM • CVPR 2019

In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in the next iterations by projection on the null-space of previously selected samples.
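The loop described above can be sketched as follows. This is one plausible reading of the description, not the authors' implementation: it assumes that "maximum information" means the leading singular direction of the data (approximated here by power iteration) and that the matching step picks the sample with the largest absolute cosine to that direction. The names `leading_direction` and `ipm_select` and the toy data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leading_direction(rows, iters=200):
    """Approximate the top right singular vector of `rows` by power iteration."""
    d = len(rows[0])
    # Fixed start vector; assumed not to be orthogonal to the leading direction.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        proj = [dot(r, v) for r in rows]  # A v
        w = [sum(p * r[c] for p, r in zip(proj, rows)) for c in range(d)]  # A^T (A v)
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def ipm_select(data, k):
    """Select up to k row indices by alternating matching and projection."""
    rows = [[float(x) for x in r] for r in data]
    selected = []
    for _ in range(min(k, len(rows))):
        v = leading_direction(rows)

        def match(i):
            # Absolute cosine between sample i and the leading direction.
            n = math.sqrt(dot(rows[i], rows[i]))
            return abs(dot(rows[i], v)) / n if n > 1e-12 else -1.0

        best = max((i for i in range(len(rows)) if i not in selected), key=match)
        norm = math.sqrt(dot(rows[best], rows[best]))
        if norm <= 1e-12:
            break  # remaining data carries no unexplained structure
        selected.append(best)

        # Project every sample onto the null space of the selected sample.
        u = [x / norm for x in rows[best]]
        rows = [[x - dot(r, u) * ux for x, ux in zip(r, u)] for r in rows]
    return selected


# Hypothetical toy data: one group spread around the x-axis, one around the y-axis.
data = [(4, 0, 0), (3, 0, 1), (3, 0, -1), (0, 3, 0), (0, 2, 1), (0, 2, -1)]
print(ipm_select(data, k=2))  # [0, 3]
```

On this input the first pass selects the sample lying on the dominant x-direction; after projection, the remaining structure lies mostly along y, so the second pass selects the sample on that axis.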

29 Nov 2018


Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization

gowthamasu/Coverage_based_sample_design • 5 Sep 2018

Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution.
