Search Results for author: Geppino Pucci

Found 5 papers, 3 papers with code

Distributed k-Means with Outliers in General Metrics

no code implementations • 16 Feb 2022 • Enrico Dandolo, Andrea Pietracaprina, Geppino Pucci

A more general formulation, known as k-means with $z$ outliers, introduced to deal with noisy datasets, features a further parameter $z$ and allows up to $z$ points of $P$ (outliers) to be disregarded when computing the aforementioned sum.

k-Center Clustering with Outliers in Sliding Windows

1 code implementation • 7 Jan 2022 • Paolo Pellizzoni, Andrea Pietracaprina, Geppino Pucci

We provide efficient algorithms for this important variant in the streaming model under the sliding window setting, where, at each time step, the dataset to be clustered is the window $W$ of the most recent data items.

Clustering

Scalable Distributed Approximation of Internal Measures for Clustering Evaluation

1 code implementation • 3 Mar 2020 • Federico Altieri, Andrea Pietracaprina, Geppino Pucci, Fabio Vandin

The experiments provide evidence that, unlike other heuristics, our estimation strategy not only provides tight theoretical guarantees but is also able to return highly accurate estimations while running in a fraction of the time required by the exact computation, and that its distributed implementation is highly scalable, thus enabling the computation of internal measures for very large datasets for which the exact computation is prohibitive.

Clustering

Coreset-based Strategies for Robust Center-type Problems

no code implementations • 18 Feb 2020 • Andrea Pietracaprina, Geppino Pucci, Federico Soldà

Given a dataset $V$ of points from some metric space, the popular $k$-center problem requires identifying a subset of $k$ points (centers) in $V$ that minimizes the maximum distance of any point of $V$ from its closest center.
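As a rough illustration of the objective described in this abstract, the sketch below evaluates the $k$-center cost of a candidate set of centers. It assumes points in the Euclidean plane and a hypothetical helper name `k_center_cost`; it only scores a given choice of centers and is not the coreset-based method of the paper.

```python
from math import dist  # Euclidean distance, used here as an assumed metric


def k_center_cost(points, centers):
    """Return the maximum distance of any point from its closest center."""
    return max(min(dist(p, c) for c in centers) for p in points)


# Toy dataset V (illustrative values only).
V = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]

spread_centers = [(0.0, 0.0), (5.0, 0.0)]
packed_centers = [(0.0, 0.0), (1.0, 0.0)]

print(k_center_cost(V, spread_centers))
print(k_center_cost(V, packed_centers))
```

On this toy input, the spread-out centers give a smaller cost than the two adjacent ones, which is the behavior the objective rewards.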

Vocal Bursts Type Prediction

MapReduce and Streaming Algorithms for Diversity Maximization in Metric Spaces of Bounded Doubling Dimension

1 code implementation • 18 May 2016 • Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, Eli Upfal

Given a dataset of points in a metric space and an integer $k$, a diversity maximization problem requires determining a subset of $k$ points maximizing some diversity objective measure, e.g., the minimum or the average distance between two points in the subset.
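The two diversity objectives named in this abstract can be sketched as follows. The function names are hypothetical, the metric is assumed to be Euclidean, and the exhaustive search is only a baseline for tiny inputs; it is not the MapReduce or streaming algorithm the paper proposes.

```python
from itertools import combinations
from math import dist  # Euclidean distance, used here as an assumed metric


def min_distance_diversity(subset):
    """Minimum distance between any two points of the subset."""
    return min(dist(p, q) for p, q in combinations(subset, 2))


def avg_distance_diversity(subset):
    """Average distance over all unordered pairs of points in the subset."""
    pairs = list(combinations(subset, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def best_subset(points, k, objective):
    """Exhaustively pick the k-subset maximizing the objective (tiny inputs only)."""
    return max(combinations(points, k), key=objective)


# Toy dataset (illustrative values only).
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
chosen = best_subset(points, 2, min_distance_diversity)
print(chosen, min_distance_diversity(chosen))
```

The brute-force search examines every $k$-subset, so its cost grows combinatorially with the dataset size.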

Distributed, Parallel, and Cluster Computing
