Data Summarization
33 papers with code • 0 benchmarks • 2 datasets
Data Summarization is a central problem in machine learning: given a large dataset, the goal is to compute a small summary that still represents it.
Latest papers with no code
Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms
Diversity maximization aims to select a diverse and representative subset of items from a large dataset.
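To make the max-min objective concrete, here is a minimal sketch of the classic greedy farthest-first heuristic for selecting k points whose smallest pairwise distance is large. It is not the algorithm from the paper above and it ignores fairness constraints entirely; the function names, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
import math


def farthest_first(points, k, start=0):
    """Greedily pick k point indices, each time adding the point that is
    farthest from the points selected so far (hypothetical helper)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # dist_to_sel[i] is the distance from point i to its nearest selected point.
    dist_to_sel = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        nxt = max(range(len(points)), key=lambda i: dist_to_sel[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            dist_to_sel[i] = min(dist_to_sel[i], math.dist(p, points[nxt]))
    return selected


def min_pairwise_distance(points, idx):
    """Max-min diversity objective: the smallest distance between any two
    selected points (requires at least two indices)."""
    return min(
        math.dist(points[a], points[b])
        for i, a in enumerate(idx)
        for b in idx[i + 1:]
    )


# Illustrative data: two tight clusters and one point in between.
points = [(0, 0), (0.1, 0), (10, 0), (5, 0), (10.1, 0)]
summary = farthest_first(points, k=3)
diversity = min_pairwise_distance(points, summary)
```

The greedy choice avoids picking two points from the same tight cluster, which is the behaviour the max-min objective rewards. Handling group quotas, as a fairness-constrained variant would need, requires a different selection rule that this sketch does not attempt.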
Guided Exploration of Data Summaries
We examine the applicability of Exploratory Data Analysis (EDA) to data summarization and formalize Eda4Sum, the problem of guided exploration of data summaries that seeks to sequentially produce connected summaries with the goal of maximizing their cumulative utility.
Operations for Autonomous Spacecraft
Onboard autonomy technologies such as planning and scheduling, identification of scientific targets, and content-based data summarization, will lead to exciting new space science missions.
NNK-Means: Data summarization using dictionary learning with non-negative kernel regression
An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data.
Towards General Robustness to Bad Training Data
In this paper, we focus on the problem of identifying bad training data when the underlying cause is unknown in advance.
Data Summarization via Bilevel Optimization
We show the effectiveness of our framework for a wide range of models in various settings, including training non-convex models online and batch active learning.
A Unified Framework for Task-Driven Data Quality Management
High-quality data is critical to train performant Machine Learning (ML) models, highlighting the importance of Data Quality Management (DQM).
Adaptive Sampling for Fast Constrained Maximization of Submodular Function
Our algorithm is suitable to maximize a non-monotone submodular function under a $p$-system side constraint, and it achieves a $(p + O(\sqrt{p}))$-approximation for this problem, after only poly-logarithmic adaptive rounds and polynomial queries to the valuation oracle function.
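For readers unfamiliar with the terms in this abstract, the standard definitions can be written out as follows. The symbols $V$ (ground set), $\mathcal{I}$ (family of feasible sets) and $S$ (returned solution) are notation assumed here, not taken from the paper, and the guarantee may hold only in expectation for a randomized algorithm; the abstract does not say.

```latex
% Submodularity (diminishing returns) of f : 2^V \to \mathbb{R}_{\ge 0}
f(A \cup \{e\}) - f(A) \;\ge\; f(B \cup \{e\}) - f(B)
\qquad \text{for all } A \subseteq B \subseteq V,\; e \in V \setminus B .

% p-system (V, \mathcal{I}): for every Y \subseteq V, the maximal
% independent subsets of Y differ in size by at most a factor p
\frac{\max \{\, |S| : S \text{ maximal independent subset of } Y \,\}}
     {\min \{\, |S| : S \text{ maximal independent subset of } Y \,\}}
\;\le\; p .

% Reading of an \alpha-approximation with \alpha = p + O(\sqrt{p}) \ge 1
f(S) \;\ge\; \frac{1}{p + O(\sqrt{p})} \, \max_{T \in \mathcal{I}} f(T) .
```

Non-monotone means that adding an element to a set can decrease $f$, so plain greedy selection loses its usual guarantee.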
Introduction to Core-sets: an Updated Survey
In optimization or machine learning problems we are given a set of items, usually points in some metric space, and the goal is to minimize or maximize an objective function over some space of candidate solutions.
Deep Submodular Networks for Extractive Data Summarization
Unfortunately, these models only learn the relative importance of the different submodular functions (such as diversity, representation or importance), but cannot learn more complex feature representations, which are often required for state-of-the-art performance.
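The kind of model this abstract describes as limited, a weighted mixture of fixed submodular components, can be sketched as below. This is only an illustration of the baseline idea, not the deep submodular network proposed in the paper: the two components (facility location for representation, concept coverage for diversity), the hand-set weights, and the toy data are all assumptions, and in a learned mixture the weights would be fit from data rather than fixed.

```python
def facility_location(sim, subset):
    """Representation term: every item is credited with its best
    similarity to any item in the summary."""
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in sim)


def coverage(concepts, subset):
    """Diversity term: number of distinct concepts touched by the summary."""
    covered = set()
    for j in subset:
        covered |= concepts[j]
    return float(len(covered))


def greedy(n, k, score):
    """Add, k times, the item with the largest positive marginal gain."""
    chosen = []
    for _ in range(k):
        base = score(chosen)
        best, best_gain = None, 0.0
        for j in range(n):
            if j in chosen:
                continue
            gain = score(chosen + [j]) - base
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no item improves the score
            break
        chosen.append(best)
    return chosen


# Toy data (hypothetical): pairwise similarities and per-item concept sets.
sim = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
]
concepts = [{"a", "b"}, {"a"}, {"c"}, {"c", "d"}]
weights = (1.0, 0.5)  # relative importance of each component


def mixture(subset):
    """Weighted sum of the two submodular components."""
    return (weights[0] * facility_location(sim, subset)
            + weights[1] * coverage(concepts, subset))


summary = greedy(len(sim), k=2, score=mixture)
```

Because a non-negative weighted sum of submodular functions is itself submodular, greedy selection remains a reasonable maximizer here. What the mixture cannot do is change the features that `sim` and `concepts` are built from, which is the limitation the abstract points to.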