Zero-Shot Learning

565 papers with code • 18 benchmarks • 29 datasets

Zero-shot learning (ZSL) is a model's ability to recognize classes it has never seen during training: no labeled examples of these classes are available during supervised learning.

Earlier work in zero-shot learning uses attributes in a two-step approach to infer unseen classes. In the computer vision context, more recent advances learn mappings from image feature space to semantic space. Other approaches learn non-linear multimodal embeddings. In the modern NLP context, language models can be evaluated on downstream tasks without fine-tuning.
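
The mapping-based approach above can be sketched as follows. This is a minimal toy illustration, not any specific paper's method: the projection matrix `W` is a stand-in for a map learned on seen classes, and the attribute vectors for `zebra` and `dolphin` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a projection from image feature space (8-d) to a shared
# attribute/semantic space (4-d), assumed already learned on seen classes.
n_attr, n_feat = 4, 8
W = rng.normal(size=(n_attr, n_feat))

# Attribute vectors for classes never seen during training (hypothetical).
unseen_classes = {
    "zebra":   np.array([1.0, 1.0, 0.0, 0.0]),  # e.g. striped, four-legged
    "dolphin": np.array([0.0, 0.0, 1.0, 1.0]),  # e.g. aquatic, smooth-skinned
}

def zero_shot_predict(x, W, class_attrs):
    """Project an image feature x into attribute space and return the
    unseen class whose attribute vector is most cosine-similar."""
    z = W @ x
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return max(class_attrs, key=lambda c: cos(z, class_attrs[c]))

# Contrive an image feature whose projection looks "zebra-like".
x = np.linalg.pinv(W) @ np.array([1.0, 1.0, 0.0, 0.0])
print(zero_shot_predict(x, W, unseen_classes))  # zebra
```

Because classification happens in the shared attribute space, the model can name a class it never saw an image of, as long as that class has a known attribute description.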

Benchmark datasets for zero-shot learning include aPY, AwA, and CUB, among others.

(Image credit: Prototypical Networks for Few-shot Learning in PyTorch)


X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization

annusha/xmic 28 Mar 2024

Lately, there has been growing interest in adapting vision-language models (VLMs) to image and third-person video classification due to their success in zero-shot recognition.

VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification

lanfz2000/vlm-cpl 23 Mar 2024

To address this issue, we introduce VLM-CPL, a novel approach based on consensus pseudo labels that integrates two noisy label filtering techniques with a semi-supervised learning strategy.

Long-CLIP: Unlocking the Long-Text Capability of CLIP

beichenzbc/long-clip 22 Mar 2024

Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities.
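
The zero-shot classification scheme CLIP enables can be sketched with toy numbers. This is not the real CLIP model: the vectors below are made-up stand-ins for the outputs of CLIP's image and text encoders, used only to show how aligned embeddings turn label prompts into a classifier.

```python
import numpy as np

def clip_style_zero_shot(image_emb, text_embs, labels, temperature=0.01):
    """Score each label prompt by cosine similarity to the image
    embedding, then softmax the scaled scores into probabilities."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = (txt @ img) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return dict(zip(labels, probs))

labels = ["a photo of a cat", "a photo of a dog"]
text_embs = np.array([[0.9, 0.1, 0.0],    # made-up prompt embeddings
                      [0.1, 0.9, 0.0]])
image_emb = np.array([0.8, 0.2, 0.1])     # made-up "cat-like" image embedding
scores = clip_style_zero_shot(image_emb, text_embs, labels)
print(max(scores, key=scores.get))  # a photo of a cat
```

No cat/dog classifier is ever trained here: the label set is defined at inference time purely by writing new text prompts, which is what makes the approach zero-shot.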

Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation

karpurapus/bddgpt-automate-tests 22 Mar 2024

Behavior-driven development (BDD) is an Agile testing methodology fostering collaboration among developers, QA analysts, and stakeholders.

Less but Better: Enabling Generalized Zero-shot Learning Towards Unseen Domains by Intrinsic Learning from Redundant LLM Semantics

chunhuiz/semantics-of-officehome-and-minidomainnet 21 Mar 2024

Different from existing GZSL methods which alleviate DSP by generating features of unseen classes with semantics, CDGZSL needs to construct a common feature space across domains and acquire the corresponding intrinsic semantics shared among domains to transfer from seen to unseen domains.

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

liuziyu77/rar 20 Mar 2024

Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and the 2 object detection datasets under the zero-shot recognition setting.

CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation

zwq456/clip-vis 19 Mar 2024

Given a set of initial queries, class-agnostic mask generation employs a transformer decoder to predict query masks and corresponding object scores and mask IoU scores.

Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models

elaine-sui/tps 19 Mar 2024

Advancements in vision-language models (VLMs) have propelled the field of computer vision, particularly in the zero-shot learning setting.

Eye-gaze Guided Multi-modal Alignment Framework for Radiology

momarky/egma 19 Mar 2024

Additionally, we explore the impact of varying amounts of eye-gaze data on model performance, highlighting the feasibility and utility of integrating this auxiliary data into multi-modal pre-training.

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

jiazuoyu/moe-adapters4cl 18 Mar 2024

Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.
