Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data summarization is a central problem in machine learning: the goal is to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

Latest papers with no code

Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms

no code yet • 5 Jan 2023

Diversity maximization aims to select a diverse and representative subset of items from a large dataset.
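As a rough illustration of what selecting a diverse subset can mean, the sketch below uses the classic farthest-point greedy heuristic for unconstrained max-min diversification. It is not the algorithm from the paper above and it ignores fairness constraints entirely. The function name `greedy_max_min`, the Euclidean distance, and the choice of the first point as the seed are all assumptions made for this example.

```python
import math


def greedy_max_min(points, k):
    """Return indices of up to k points chosen by farthest-point greedy.

    Hypothetical helper for illustration: unconstrained max-min
    diversification with Euclidean distance, seeded at index 0.
    """
    if k <= 0 or not points:
        return []
    selected = [0]  # arbitrary seed: the first point
    # min_dist[i] = distance from point i to its nearest selected point
    min_dist = [math.dist(points[0], p) for p in points]
    min_dist[0] = -1.0  # mark as already selected
    while len(selected) < min(k, len(points)):
        # Take the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(points[nxt], p))
        min_dist[nxt] = -1.0  # never pick the same index twice
    return selected


points = [(0, 0), (1, 0), (10, 0), (0, 10)]
print(greedy_max_min(points, 3))
```

Each step adds the point whose nearest selected neighbour is as far away as possible, so the near-duplicate `(1, 0)` is the last candidate to be picked in this toy input.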

Guided Exploration of Data Summaries

no code yet • 27 May 2022

We examine the applicability of Exploratory Data Analysis (EDA) to data summarization and formalize Eda4Sum, the problem of guided exploration of data summaries that seeks to sequentially produce connected summaries with the goal of maximizing their cumulative utility.

Operations for Autonomous Spacecraft

no code yet • 22 Nov 2021

Onboard autonomy technologies such as planning and scheduling, identification of scientific targets, and content-based data summarization will lead to exciting new space science missions.

NNK-Means: Data summarization using dictionary learning with non-negative kernel regression

no code yet • 15 Oct 2021

An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data.

Towards General Robustness to Bad Training Data

no code yet • 29 Sep 2021

In this paper, we focus on the problem of identifying bad training data when the underlying cause is unknown in advance.

Data Summarization via Bilevel Optimization

no code yet • 26 Sep 2021

We show the effectiveness of our framework for a wide range of models in various settings, including training non-convex models online and batch active learning.

A Unified Framework for Task-Driven Data Quality Management

no code yet • 10 Jun 2021

High-quality data is critical to train performant Machine Learning (ML) models, highlighting the importance of Data Quality Management (DQM).

Adaptive Sampling for Fast Constrained Maximization of Submodular Function

no code yet • 12 Feb 2021

Our algorithm is suitable for maximizing a non-monotone submodular function under a $p$-system side constraint, and it achieves a $(p + O(\sqrt{p}))$-approximation for this problem after only poly-logarithmically many adaptive rounds and polynomially many queries to the valuation oracle.
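For readers new to submodular maximization, the sketch below shows the much simpler classic setting, not the adaptive-sampling algorithm described above. It runs sequential greedy selection on a monotone coverage function under a cardinality constraint, where greedy is known to give a $(1 - 1/e)$-approximation. The function name `greedy_max_coverage` and the toy input are assumptions made for this example. Non-monotone objectives and general $p$-systems are not handled.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets greedily by marginal coverage gain.

    Hypothetical helper for illustration: `sets` is a list of Python
    sets, and the objective is the size of the union of the chosen sets.
    """
    covered = set()
    chosen = []
    for _ in range(min(k, len(sets))):
        best, best_gain = None, 0
        for i, s in enumerate(sets):
            if i in chosen:
                continue
            gain = len(s - covered)  # marginal gain of adding set i
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining set adds anything new
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
chosen, covered = greedy_max_coverage(sets, 2)
print(chosen, sorted(covered))
```

This loop needs one sequential round per selected element, which is the cost that low-adaptivity algorithms aim to reduce.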

Introduction to Core-sets: an Updated Survey

no code yet • 18 Nov 2020

In optimization or machine learning problems we are given a set of items, usually points in some metric space, and the goal is to minimize or maximize an objective function over some space of candidate solutions.

Deep Submodular Networks for Extractive Data Summarization

no code yet • 16 Oct 2020

Unfortunately, these models only learn the relative importance of the different submodular functions (such as diversity, representation or importance), but cannot learn more complex feature representations, which are often required for state-of-the-art performance.