Data Summarization
33 papers with code • 0 benchmarks • 2 datasets
Data Summarization is a central problem in machine learning: given a large dataset, compute a small subset that faithfully represents it.
Benchmarks
These leaderboards are used to track progress in Data Summarization
Libraries
Use these libraries to find Data Summarization models and implementations
Latest papers with no code
LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces
We design an effective prompting framework for LLMs on high-level reasoning tasks, which can handle traces of raw sensor data as well as low-level perception results.
GreedyML: A Parallel Algorithm for Maximizing Submodular Functions
The results show that the GreedyML algorithm can solve problems where the sequential Greedy and distributed RandGreedI algorithms fail due to memory constraints.
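The sequential greedy baseline referenced above is the standard algorithm for monotone submodular maximization under a cardinality constraint: repeatedly add the element with the largest marginal gain. A minimal sketch, using an illustrative set-coverage objective (not the paper's benchmark or the GreedyML implementation):

```python
def coverage(selected, sets):
    """f(S) = number of universe elements covered by the chosen sets.
    Coverage is monotone and submodular."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_max(sets, k):
    """Sequential greedy: k rounds, each adding the set with the
    largest marginal coverage gain. Achieves a (1 - 1/e) guarantee
    for monotone submodular objectives."""
    chosen = []
    for _ in range(k):
        gains = {i: coverage(chosen + [i], sets) - coverage(chosen, sets)
                 for i in range(len(sets)) if i not in chosen}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:  # no set adds new coverage; stop early
            break
        chosen.append(best)
    return chosen

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1}]
summary = greedy_max(sets, k=2)  # picks sets 0 and 2, covering 6 elements
```

Each round scans all remaining candidates, which is exactly the sequential bottleneck that parallel and distributed variants aim to remove.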
Analysis of Persian News Agencies on Instagram, A Words Co-occurrence Graph-based Approach
To the author's knowledge, this method has not previously been applied to Persian-language content on the Instagram social network.
Dynamic Non-monotone Submodular Maximization
Through this reduction, we obtain the first dynamic algorithms to solve the non-monotone submodular maximization problem under the cardinality constraint $k$.
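For non-monotone objectives, plain greedy loses its guarantee; a standard remedy (the RandomGreedy algorithm of Buchbinder et al., shown here as a hedged sketch rather than the paper's dynamic algorithm) picks uniformly at random among the top-$k$ marginal gains each round. The cut function below is a classic submodular-but-not-monotone objective:

```python
import random

def cut_value(S, edges):
    """f(S) = number of edges crossing (S, V \\ S): submodular, NOT monotone."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

def random_greedy(ground, f, k, rng):
    """RandomGreedy sketch: in each of k rounds, choose uniformly among
    the k best marginal gains, padded with zero-gain dummies so the
    algorithm may decline to add anything."""
    S = []
    for _ in range(k):
        gains = sorted(((f(S + [e]) - f(S), e) for e in ground if e not in S),
                       reverse=True)[:k]
        gains += [(0, None)] * (k - len(gains))  # dummy = "add nothing"
        g, e = rng.choice(gains)
        if e is not None and g > 0:
            S.append(e)
    return S

edges = [(0, 1), (1, 2), (0, 2)]  # a triangle
S = random_greedy([0, 1, 2], lambda S: cut_value(S, edges), k=2,
                  rng=random.Random(0))
```

The randomization yields a $1/e$ approximation in expectation for non-monotone submodular functions under a cardinality constraint; the dynamic setting additionally maintains such a solution as elements arrive and depart.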
Dynamic Spatio-Temporal Summarization using Information Based Fusion
In the era of burgeoning data generation, managing and storing large-scale time-varying datasets poses significant challenges.
Robust Approximation Algorithms for Non-monotone $k$-Submodular Maximization under a Knapsack Constraint
Our first algorithm, $\mathsf{LAA}$, achieves an approximation ratio of $1/19$ within $O(nk)$ query complexity.
Data Summarization beyond Monotonicity: Non-monotone Two-Stage Submodular Maximization
In two-stage submodular maximization, the ground set is first reduced using provided submodular training functions. The aim is that optimizing new objective functions over the reduced ground set yields results comparable to optimizing them over the original ground set.
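A minimal sketch of the two-stage idea, assuming coverage-style training objectives (the literature's "replacement greedy" reduction is more refined; here stage one simply runs greedy on the average of the training functions):

```python
def greedy(ground, f, k):
    """Plain greedy under a cardinality constraint k."""
    S = []
    for _ in range(k):
        cand = [(f(S + [e]) - f(S), e) for e in ground if e not in S]
        g, e = max(cand)
        if g <= 0:
            break
        S.append(e)
    return S

def make_coverage(sets):
    """Build a coverage objective from an element -> set-of-items map."""
    return lambda S: len(set().union(*(sets[e] for e in S)))

# Hypothetical example data: five ground-set elements covering small universes.
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 2, 3, 4}, 4: {5}}
train = [make_coverage(sets)]            # training functions (just one here)
avg = lambda S: sum(f(S) for f in train) / len(train)

reduced = greedy(list(sets), avg, 3)     # stage 1: shrink the ground set
new_f = make_coverage(sets)              # stage 2: a new objective arrives
solution = greedy(reduced, new_f, 2)     # optimize over the reduced set only
```

The payoff is that stage two only ever touches the reduced ground set, which is what makes repeated optimization over fresh objectives cheap.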
On the Usefulness of Synthetic Tabular Data Generation
Despite recent advances in synthetic data generation, the scientific community still lacks a unified consensus on its usefulness.
Achieving Long-term Fairness in Submodular Maximization through Randomization
Unlike previous studies in this area, we allow for randomized solutions: the objective is to compute a distribution over feasible sets such that the expected number of items selected from each group lies between given upper and lower thresholds, keeping the representation of each group balanced in the long term.
Group Fairness in Non-monotone Submodular Maximization
Our goal is to select a set of items that maximizes a non-monotone submodular function, while ensuring that the number of selected items from each group is proportionate to its size, to the extent specified by the decision maker.
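The group-proportionality constraint can be sketched as a greedy that respects a per-group cap on selected items. This is a simplified illustration only, with a coverage stand-in objective; the paper's algorithms also handle non-monotone objectives and lower bounds:

```python
def fair_greedy(ground, groups, f, caps):
    """Greedy selection where group g may contribute at most caps[g] items
    (e.g., caps proportional to group sizes)."""
    S, used = [], {g: 0 for g in caps}
    while True:
        cand = [(f(S + [e]) - f(S), e) for e in ground
                if e not in S and used[groups[e]] < caps[groups[e]]]
        if not cand:
            break
        g, e = max(cand)
        if g <= 0:
            break
        S.append(e)
        used[groups[e]] += 1
    return S

# Hypothetical data: four items in two groups, coverage objective.
items = {0: {1, 2, 3}, 1: {1, 2}, 2: {4}, 3: {5, 6}}
groups = {0: 'A', 1: 'A', 2: 'B', 3: 'B'}
f = lambda S: len(set().union(*(items[e] for e in S)))

picked = fair_greedy(list(items), groups, f, caps={'A': 1, 'B': 1})
```

With caps of one item per group, the greedy takes the best item from group A, then is forced to balance by taking from group B even if another A item has higher marginal gain.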