Online Clustering

25 papers with code • 0 benchmarks • 0 datasets

Online clustering models learn to label each image (i.e., cluster the dataset into its ground-truth classes) without seeing the ground-truth labels. In the online scenario, data arrives as a stream: the whole dataset cannot be accessed at once, and the model must make cluster assignments for new data without revisiting earlier data.
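To make the streaming constraint concrete, here is a minimal sketch of sequential (MacQueen-style) k-means, one simple way to cluster online. It is not the method of any paper listed on this page. The class name `OnlineKMeans`, the seeding rule (the first `n_clusters` points become the initial centroids), and the toy stream are assumptions chosen for illustration.

```python
import math


class OnlineKMeans:
    """Sequential k-means: processes one point at a time and stores no past data."""

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters
        self.centroids = []  # running mean of each cluster
        self.counts = []     # number of points absorbed by each cluster

    def predict(self, x):
        """Return the index of the centroid nearest to x."""
        return min(
            range(len(self.centroids)),
            key=lambda k: math.dist(self.centroids[k], x),
        )

    def partial_fit(self, x):
        """Assign one incoming point to a cluster, then update that cluster's mean."""
        x = [float(v) for v in x]
        # Assumption: the first n_clusters points seed the centroids.
        if len(self.centroids) < self.n_clusters:
            self.centroids.append(x)
            self.counts.append(1)
            return len(self.centroids) - 1
        k = self.predict(x)
        self.counts[k] += 1
        rate = 1.0 / self.counts[k]  # incremental-mean step size
        self.centroids[k] = [c + rate * (v - c) for c, v in zip(self.centroids[k], x)]
        return k


# Toy stream (hypothetical data): each point is seen once and then discarded.
stream = [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0), (1.0, 1.0)]
model = OnlineKMeans(n_clusters=2)
labels = [model.partial_fit(p) for p in stream]
print(labels)
```

Only the centroids and per-cluster counts are kept in memory, so a new point can be assigned without access to earlier points. Real systems typically need a better initialization and some handling of drift, neither of which this sketch attempts.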

Image Credit: Online Clustering by Penalized Weighted GMM

Latest papers with no code

Classification and Online Clustering of Zero-Day Malware

no code yet • 1 May 2023

Based on the classification score of the multilayer perceptron, we determined which samples would be classified and which would be clustered into new malware families.
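The snippet above does not say how the score is used to split the samples. One plausible reading, sketched here purely as an assumption, is a confidence threshold: samples whose top class score is high enough are classified into a known family, and the rest are passed to a clustering step. The function name `route_samples` and the threshold value 0.9 are hypothetical and do not come from the paper.

```python
def route_samples(scores, threshold=0.9):
    """Split sample indices by classifier confidence.

    scores: top class probability per sample (e.g. from a multilayer perceptron).
    threshold: hypothetical cut-off; not a value taken from the paper.
    Returns (indices to classify, indices to send to clustering).
    """
    to_classify, to_cluster = [], []
    for idx, score in enumerate(scores):
        if score >= threshold:
            to_classify.append(idx)  # confident: treat as a known family
        else:
            to_cluster.append(idx)   # uncertain: candidate for a new family
    return to_classify, to_cluster


known, unknown = route_samples([0.99, 0.42, 0.91, 0.60])
print(known, unknown)
```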

ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning

no code yet • CVPR 2023

Finally, ProtoCon addresses the poor training signal in the initial phase of training (due to fewer confident predictions) by introducing an auxiliary self-supervised loss.

Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network

no code yet • 13 Mar 2023

Binaural speech separation in real-world scenarios often involves moving speakers.

Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management

no code yet • 16 Dec 2022

In addition, to prevent overfitting of the TTA model, we devise a novel regularization that modulates the adaptation rates using the domain similarity between the source and the current target domain.

Multi-scale Digital Twin: Developing a fast and physics-informed surrogate model for groundwater contamination with uncertain climate models

no code yet • 20 Nov 2022

To quickly assess the spatiotemporal variations of groundwater contamination under uncertain climate disturbances, we developed a physics-informed machine learning surrogate model using U-Net enhanced Fourier Neural Operator (U-FNO) to solve Partial Differential Equations (PDEs) of groundwater flow and transport simulations at the site scale. We develop a combined loss function that includes both data-driven factors and physical boundary constraints at multiple spatiotemporal scales.

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval

no code yet • 21 Aug 2022

We introduce dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and diversity-sensitiveness is achieved by adaptive negative pair weighting.

Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding

no code yet • 18 Jul 2022

To bridge the gap between supervised semantic segmentation and real-world applications that require one model to recognize arbitrary new concepts, zero-shot segmentation has recently attracted a lot of attention by exploring the relationships between unseen and seen object categories, yet it still requires large amounts of densely annotated data with diverse base classes.

Early Discovery of Emerging Entities in Persian Twitter with Semantic Similarity

no code yet • 6 Jul 2022

As in any machine learning problem, data availability is one of the major challenges here.

Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization

no code yet • 28 Jun 2022

For online speaker diarization, samples arrive incrementally, and the overall distribution of the samples is invisible.

ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification

no code yet • 23 May 2022

Motivated by this observation, we propose a principled GCL framework on Imbalanced node classification (ImGCL), which automatically and adaptively balances the representations learned from GCL without labels.