Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data Summarization is a central problem in the area of machine learning, where we want to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

Streaming Submodular Maximization under a $k$-Set System Constraint

ehsankazemi/streamingkextendible 9 Feb 2020

In this paper, we propose a novel framework that converts streaming algorithms for monotone submodular maximization into streaming algorithms for non-monotone submodular maximization.

3

Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification?

easeml/datascope CVPR 2021

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning, and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaptation.

34
17 Nov 2019

Soft-Label Dataset Distillation and Text Dataset Distillation

Guang000/Awesome-Dataset-Distillation 6 Oct 2019

We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a 'soft' label (a distribution of labels).

1,185

Fast and Accurate Least-Mean-Squares Solvers

ibramjub/Fast-and-Accurate-Least-Mean-Squares-Solvers NeurIPS 2019

Least-mean-squares (LMS) solvers such as linear, ridge, and lasso regression, SVD, and elastic net not only solve fundamental machine learning problems, but are also the building blocks in a variety of other methods, such as decision trees and matrix factorizations.

76
11 Jun 2019

apricot: Submodular selection for data summarization in Python

jmschrei/apricot 8 Jun 2019

This paper presents an explanation of submodular selection, an overview of the features in apricot, and an application to several data sets.

490

Fair k-Center Clustering for Data Summarization

matthklein/fair_k_center_clustering 24 Jan 2019

In data summarization we want to choose $k$ prototypes in order to summarize a data set.

12
24 Jan 2019
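
To make "choosing $k$ prototypes" concrete, here is a minimal sketch of the classic greedy farthest-first heuristic for the plain (unconstrained) k-center problem. It is not the fair variant from the paper above and does not come from the linked repository; the function name `greedy_k_center` and the toy points are illustrative assumptions.

```python
import math


def greedy_k_center(points, k, first=0):
    """Farthest-first traversal for plain k-center (no fairness constraint).

    Returns the indices of the chosen prototypes and the covering radius,
    i.e. the largest distance from any point to its nearest prototype.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # dist_to_center[i] = distance from point i to its nearest chosen center
    dist_to_center = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next prototype is the point that is currently covered worst.
        nxt = max(range(len(points)), key=dist_to_center.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            dist_to_center[i] = min(dist_to_center[i], math.dist(p, points[nxt]))
    return centers, max(dist_to_center)


# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
centers, radius = greedy_k_center(points, k=3)
```

With three prototypes, each tight pair and the isolated point get one representative, so the covering radius shrinks to the within-pair distance. A fair variant would additionally restrict how many prototypes may come from each demographic group.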

Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision

zaeemzadeh/IPM CVPR 2019

In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in the next iterations by projection on the null-space of previously selected samples.

6
29 Nov 2018
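
The select-then-project loop described in the abstract above can be sketched as follows. This is a simplified stand-in, not the paper's method: the selection rule here (pick the sample with the largest remaining norm, as in pivoted Gram-Schmidt) is an assumption chosen for brevity, and the name `select_by_projection` is hypothetical. Only the projection step, which removes the direction of each selected sample from all remaining data, mirrors the description.

```python
import math


def select_by_projection(samples, k, tol=1e-12):
    """Greedy representative selection with null-space projection.

    Assumed selection rule: largest residual norm (a stand-in criterion).
    After each pick, every residual is projected onto the orthogonal
    complement of the picked direction, so information already captured
    is ignored in later iterations.
    """
    residuals = [[float(x) for x in s] for s in samples]
    selected = []
    for _ in range(k):
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        best = max(range(len(residuals)), key=norms.__getitem__)
        if norms[best] <= tol:
            break  # remaining samples lie in the span of those already chosen
        selected.append(best)
        # Unit vector along the residual of the chosen sample.
        u = [x / norms[best] for x in residuals[best]]
        for r in residuals:
            coeff = sum(a * b for a, b in zip(r, u))
            for j in range(len(r)):
                r[j] -= coeff * u[j]
    return selected


# Hypothetical toy data in three dimensions.
samples = [(3, 0, 0), (2.9, 0.1, 0), (0, 2, 0), (0, 0, 1), (1, 1, 0)]
chosen = select_by_projection(samples, k=3)
```

Sample 1 is nearly parallel to sample 0, so once sample 0 is selected its residual is almost zero and it is never picked. Asking for more representatives than the data's rank returns fewer indices, because the loop stops when nothing is left to capture.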

Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization

gowthamasu/Coverage_based_sample_design 5 Sep 2018

Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution.

0

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization

parajain/StructuredData_To_Descriptions NAACL 2018

Structured data summarization involves generation of natural language summaries from structured input data.

1
20 Apr 2018

Fair and Diverse DPP-based Data Summarization

DamianStraszak/FairDiverseDPPSampling ICML 2018

Sampling methods that choose a subset of the data proportional to its diversity in the feature space are popular for data summarization.

9
12 Feb 2018
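
As an illustration of sampling a subset "proportional to its diversity", the sketch below weights every size-k subset by the determinant of its similarity-kernel submatrix, which is the defining property of a fixed-size determinantal point process (DPP). It enumerates all subsets, so it only suits tiny inputs, and it includes no fairness constraint. The RBF kernel, the function names, and the toy points are assumptions made for this example, not taken from the paper or its repository.

```python
import itertools
import math
import random


def rbf_kernel(points, gamma=1.0):
    """Similarity matrix: close to 1 for nearby points, close to 0 for distant ones."""
    return [[math.exp(-gamma * math.dist(p, q) ** 2) for q in points] for p in points]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return det


def subset_weights(kernel, k):
    """Unnormalized probability of each size-k subset: det of its kernel submatrix."""
    weights = {}
    for subset in itertools.combinations(range(len(kernel)), k):
        sub = [[kernel[i][j] for j in subset] for i in subset]
        weights[subset] = max(determinant(sub), 0.0)  # clip tiny negative round-off
    return weights


def sample_subset(kernel, k, rng):
    """Draw one size-k subset with probability proportional to its determinant."""
    weights = subset_weights(kernel, k)
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]


# Hypothetical toy data: points 0 and 1 are near-duplicates.
points = [(0.0, 0.0), (0.05, 0.0), (3.0, 0.0), (0.0, 3.0)]
kernel = rbf_kernel(points)
weights = subset_weights(kernel, k=2)
summary = sample_subset(kernel, k=2, rng=random.Random(0))
```

The pair of near-duplicates (0, 1) receives a weight close to zero, while well-separated pairs such as (0, 2) receive a weight close to one, so diverse subsets are drawn far more often.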