Online Clustering
25 papers with code • 0 benchmarks • 0 datasets
Models that learn to label each image (i.e. cluster the dataset into its ground truth classes) without seeing the ground truth labels. In the online scenario, data arrives as a stream: the whole dataset cannot be accessed at once, and the model must assign clusters to new data without revisiting earlier data.
Image Credit: Online Clustering by Penalized Weighted GMM
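As a rough illustration of the streaming constraint described above, the sketch below implements sequential (MacQueen-style) k-means in plain Python. It is not the method of any paper listed on this page. The class name OnlineKMeans, the method names partial_fit and predict, the toy stream, and the choice to seed centroids from the first k samples are all assumptions made for this example. Each sample is seen once, used to update a running-mean centroid, and then discarded.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Sequential k-means sketch: one pass over the stream, no stored samples."""

    def __init__(self, n_clusters: int) -> None:
        self.n_clusters = n_clusters
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def _nearest(self, x: Sequence[float]) -> int:
        # Index of the centroid with the smallest squared Euclidean distance.
        dists = [sum((c - v) ** 2 for c, v in zip(cen, x)) for cen in self.centroids]
        return dists.index(min(dists))

    def partial_fit(self, x: Sequence[float]) -> int:
        """Consume one sample, update the model, and return its cluster id."""
        if len(self.centroids) < self.n_clusters:
            # Assumption: the first k samples seed the k centroids.
            self.centroids.append([float(v) for v in x])
            self.counts.append(1)
            return len(self.centroids) - 1
        j = self._nearest(x)
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean step size
        self.centroids[j] = [c + step * (v - c) for c, v in zip(self.centroids[j], x)]
        return j

    def predict(self, x: Sequence[float]) -> int:
        """Assign a cluster to x without updating the model."""
        return self._nearest(x)


# Hypothetical toy stream with two well-separated groups.
stream = [(0.0, 0.0), (10.0, 10.0), (0.2, -0.1), (9.8, 10.1), (0.1, 0.3)]
model = OnlineKMeans(n_clusters=2)
labels = [model.partial_fit(x) for x in stream]
print(labels, model.centroids)
```

Seeding from the first k samples keeps the sketch short but makes the result depend on arrival order; if the first samples all come from one group, the centroids start poorly. Only the centroids and per-cluster counts are kept in memory, which is what allows assignments for new data without access to earlier data.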
Latest papers with no code
Classification and Online Clustering of Zero-Day Malware
Based on the classification score of the multilayer perceptron, we determined which samples would be classified and which would be clustered into new malware families.
ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning
Finally, ProtoCon addresses the poor training signal in the initial phase of training (due to fewer confident predictions) by introducing an auxiliary self-supervised loss.
Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network
Binaural speech separation in real-world scenarios often involves moving speakers.
Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management
In addition, to prevent overfitting of the TTA model, we devise a novel regularization that modulates the adaptation rates using the domain similarity between the source and the current target domain.
Multi-scale Digital Twin: Developing a fast and physics-informed surrogate model for groundwater contamination with uncertain climate models
To quickly assess the spatiotemporal variations of groundwater contamination under uncertain climate disturbances, we developed a physics-informed machine learning surrogate model using U-Net enhanced Fourier Neural Operator (U-FNO) to solve Partial Differential Equations (PDEs) of groundwater flow and transport simulations at the site scale. We develop a combined loss function that includes both data-driven factors and physical boundary constraints at multiple spatiotemporal scales.
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
We introduce dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and diversity-sensitiveness is achieved by adaptive negative pair weighting.
Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
To bridge the gap between supervised semantic segmentation and real-world applications that require one model to recognize arbitrary new concepts, zero-shot segmentation has recently attracted a lot of attention by exploring the relationships between unseen and seen object categories, yet it still requires large amounts of densely annotated data with diverse base classes.
Early Discovery of Emerging Entities in Persian Twitter with Semantic Similarity
Similar to any machine learning problem, data availability is one of the major challenges in this problem.
Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization
For online speaker diarization, samples arrive incrementally, and the overall distribution of the samples is invisible.
ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification
Motivated by this observation, we propose a principled GCL framework on Imbalanced node classification (ImGCL), which automatically and adaptively balances the representations learned from GCL without labels.