Unsupervised Pre-training

103 papers with code • 2 benchmarks • 7 datasets

Pre-training a neural network using unsupervised (self-supervised) auxiliary tasks on unlabeled data.
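The workflow can be illustrated with a toy pretext task (a count-based sketch, not a neural network): "pre-train" on unlabeled text by collecting bigram statistics, then use them for the auxiliary task of filling in a masked token. All names here are illustrative.

```python
from collections import Counter, defaultdict

def pretrain_bigrams(corpus):
    """'Pre-train' on unlabeled text: collect bigram counts; no labels needed."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_masked(counts, prev_token):
    """Auxiliary task: predict the token that follows `prev_token`."""
    if prev_token not in counts:
        return None
    return counts[prev_token].most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = pretrain_bigrams(corpus)
print(predict_masked(model, "the"))  # -> 'cat' (seen twice after 'the')
```

A real system would replace the count table with a neural encoder and the masked-token task with objectives like masked language modeling or contrastive learning, but the supervision signal comes from the data itself in exactly the same way.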

VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain

leduckhai/multimed 8 Apr 2024

VietMed is also by far the largest public Vietnamese speech dataset in terms of total duration.

A Survey on Data Selection for Language Models

alon-albalak/data-selection-survey 26 Feb 2024

A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training.

Foundation Policies with Hilbert Representations

seohongpark/hilp 23 Feb 2024

While a number of methods have been proposed to enable generic self-supervised RL, based on principles such as goal-conditioned RL, behavioral cloning, and unsupervised skill learning, such methods remain limited in terms of either the diversity of the discovered behaviors, the need for high-quality demonstration data, or the lack of a clear prompting or adaptation mechanism for downstream tasks.

Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval

ma787639046/bowdpr 20 Jan 2024

In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints.
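A minimal sketch of a bag-of-words prediction objective like the one described above (illustrative names only, not the bowdpr code): from a single passage representation, a head produces one logit per vocabulary entry and is trained with binary cross-entropy against a multi-hot bag of the passage's input tokens.

```python
import math

def bow_targets(token_ids, vocab_size):
    """Multi-hot bag-of-words target: 1 for every token present in the passage."""
    t = [0.0] * vocab_size
    for i in token_ids:
        t[i] = 1.0
    return t

def bow_loss(logits, targets):
    """Binary cross-entropy over the vocabulary, as a BoW prediction head would use."""
    eps = 1e-9
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(logits)

# Toy passage over a vocabulary of 6 token ids.
targets = bow_targets([0, 2, 2, 5], vocab_size=6)
good = [4.0 if y else -4.0 for y in targets]   # logits that agree with the bag
bad = [-4.0 if y else 4.0 for y in targets]    # logits that disagree
print(bow_loss(good, targets) < bow_loss(bad, targets))  # True
```

In the actual method the logits would come from a linear head over the encoder's passage embedding; the sketch only shows the loss that pushes term coverage into that single dense vector.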

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding

huiguanlab/umurl 6 Nov 2023

In this manner, our framework is able to learn the unified representations of uni-modal or multi-modal skeleton input, which is flexible to different kinds of modality input for robust action understanding in practical cases.

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

seohongpark/metra 13 Oct 2023

Through our experiments in five locomotion and manipulation environments, we demonstrate that METRA can discover a variety of useful behaviors even in complex, pixel-based environments, being the first unsupervised RL method that discovers diverse locomotion behaviors in pixel-based Quadruped and Humanoid.

Self-Supervised Pre-Training Boosts Semantic Scene Segmentation on LiDAR Data

marionacaros/barlow-twins-for-sem-seg 5 Sep 2023

Airborne LiDAR systems have the capability to capture the Earth's surface by generating extensive point cloud data comprised of points mainly defined by 3D coordinates.

HIQL: Offline Goal-Conditioned RL with Latent States as Actions

seohongpark/hiql NeurIPS 2023

This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.

22 Jul 2023

Disentangling Node Attributes from Graph Topology for Improved Generalizability in Link Prediction

chatterjeeayan/upna 17 Jul 2023

Our proposed method, UPNA (Unsupervised Pre-training of Node Attributes), solves the inductive link prediction problem by learning a function that takes a pair of node attributes and predicts the probability of an edge, as opposed to Graph Neural Networks (GNN), which can be prone to topological shortcuts in graphs with power-law degree distribution.
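The core idea — scoring an edge from node attributes alone, with no topology — can be sketched as a symmetric pairwise scorer (a hypothetical stand-in, not the UPNA model itself; the weights would be learned from observed edges):

```python
import math

def edge_prob(x_u, x_v, w, b=0.0):
    """Probability of an edge between nodes u and v from their attributes alone.
    The elementwise product keeps the score symmetric in (u, v)."""
    z = b + sum(wi * xu * xv for wi, xu, xv in zip(w, x_u, x_v))
    return 1.0 / (1.0 + math.exp(-z))

# Toy 3-d node attributes; w is illustrative, not a trained model.
w = [1.5, 1.5, 1.5]
similar = edge_prob([1.0, 0.0, 1.0], [1.0, 0.1, 0.9], w)
dissimilar = edge_prob([1.0, 0.0, 1.0], [0.0, 1.0, 0.0], w)
print(similar > dissimilar)  # True
```

Because the scorer never looks at the graph's edges at inference time, it transfers to entirely new nodes and graphs, which is what makes the link prediction inductive.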

Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning

thuml/ContextWM NeurIPS 2023

To tackle this issue, we introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling to overcome the complexity and diversity of in-the-wild videos and facilitate knowledge transfer between distinct scenes.
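The factorization can be illustrated with a scalar toy model (this is not the ContextWM architecture, which uses a latent world model over images; it only shows shared dynamics parameterized separately from a per-scene context):

```python
def step(s, c, A=0.9, B=0.5):
    """Toy factored world model: dynamics weights (A, B) are shared across
    scenes, while the context c is encoded once per video/scene."""
    return A * s + B * c

def rollout(s0, c, horizon):
    """Predict a trajectory under fixed context c."""
    s, traj = s0, [s0]
    for _ in range(horizon):
        s = step(s, c)
        traj.append(s)
    return traj

# Same dynamics parameters, two different scene contexts.
print(rollout(1.0, c=0.0, horizon=3))
print(rollout(1.0, c=1.0, horizon=3))
```

The same transition parameters produce different rollouts depending on the context, so knowledge in the dynamics module can transfer across scenes while the context module absorbs scene-specific appearance.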

29 May 2023