Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data summarization is a central problem in machine learning: the goal is to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

An Online Algorithm for Nonparametric Correlations

wxiao0421/onlineNPCORR 5 Dec 2017

This paper investigates the problem of computing nonparametric correlations on the fly for streaming data.

Scalable k-Means Clustering via Lightweight Coresets

webis-de/small-text 27 Feb 2017

Coresets have been successfully used to scale up clustering models to massive data sets.
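To make the idea of a coreset concrete, here is a minimal sketch of one way to build a small weighted sample for k-means. It assumes the sampling distribution commonly associated with lightweight coresets: half uniform, half proportional to each point's squared distance from the data mean. The listing above does not spell this out, so treat it as an assumption. The function name `lightweight_coreset` and its parameters are hypothetical and are not taken from the linked repository.

```python
import random


def lightweight_coreset(points, m, seed=0):
    """Return m weighted samples from `points` (a list of equal-length tuples).

    Assumed sampling distribution (not taken from the listing above):
    q(x) = 0.5 / n + 0.5 * d(x, mean)^2 / sum of all squared distances.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])
    # Mean of the full data set, coordinate by coordinate.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    # Squared Euclidean distance of every point to the mean.
    sq_dists = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dists)
    if total == 0:
        # All points coincide, so fall back to uniform sampling.
        q = [1.0 / n] * n
    else:
        q = [0.5 / n + 0.5 * d / total for d in sq_dists]
    # Draw m indices with replacement according to q.
    idx = rng.choices(range(n), weights=q, k=m)
    # The weight 1 / (m * q) makes weighted sums unbiased estimates of full-data sums.
    return [(points[i], 1.0 / (m * q[i])) for i in idx]


# Usage: summarize 1000 synthetic 2-D points with 50 weighted points.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(1000)]
coreset = lightweight_coreset(data, m=50)
```

A weighted k-means routine can then be run on the 50 `(point, weight)` pairs instead of the full data. Because every point is sampled with probability at least `0.5 / n`, no weight can exceed `2 * n / m`.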

Sequential Quantiles via Hermite Series Density Estimation

MikeJaredS/hermiter 17 Jul 2015

The proposed algorithms go beyond existing sequential quantile estimation algorithms in that they allow arbitrary quantiles (rather than only pre-specified ones) to be estimated at any point in time.
