Multi-label zero-shot learning

12 papers with code • 3 benchmarks • 2 datasets

The goal of multi-label classification is to predict the set of labels present in an image. Multi-label zero-shot learning (ML-ZSL) extends zero-shot learning (ZSL) to this setting: the model must identify multiple labels per image, including labels from both seen and unseen classes.
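As a rough illustration of the prediction step (a minimal sketch, not taken from any specific paper below): seen and unseen labels are assumed to live in a shared semantic embedding space, the image feature is compared against every label embedding, and all labels above a threshold are returned.

```python
# Minimal ML-ZSL prediction sketch. The image feature, label embeddings,
# label names, and threshold are all placeholders.
import numpy as np

def predict_labels(image_feat, label_embeddings, label_names, threshold=0.3):
    """image_feat: (d,) image feature; label_embeddings: (num_labels, d)."""
    # Cosine similarity between the image feature and each label embedding.
    img = image_feat / np.linalg.norm(image_feat)
    lab = label_embeddings / np.linalg.norm(label_embeddings, axis=1, keepdims=True)
    scores = lab @ img
    # Multi-label reading: keep every label above the threshold, not just the arg-max.
    return [name for name, s in zip(label_names, scores) if s >= threshold]
```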

Most implemented papers

Zero-Shot Learning by Convex Combination of Semantic Embeddings

JudyYe/zero-shot-gcn 19 Dec 2013

In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage.
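The core ConSE idea can be sketched as follows: a test image is embedded as the convex combination of the word embeddings of its top-T predicted seen classes, and unseen classes are then ranked by similarity in that space. The inputs (`seen_probs`, word vectors, `top_t`) are placeholders.

```python
import numpy as np

def conse_embed(seen_probs, seen_word_vecs, top_t=10):
    top = np.argsort(seen_probs)[::-1][:top_t]            # T most confident seen classes
    weights = seen_probs[top] / seen_probs[top].sum()     # convex (normalized) weights
    return weights @ seen_word_vecs[top]                  # weighted-average semantic embedding

def rank_unseen(image_embedding, unseen_word_vecs):
    sims = unseen_word_vecs @ image_embedding / (
        np.linalg.norm(unseen_word_vecs, axis=1) * np.linalg.norm(image_embedding))
    return np.argsort(sims)[::-1]                          # unseen classes, best match first
```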

Label-Embedding for Image Classification

mvp18/Popular-ZSL-Algorithms 30 Mar 2015

Attributes act as intermediate representations that enable parameter sharing between classes, a must when training data is scarce.
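Label-embedding methods of this kind score an image against a class through a bilinear compatibility function F(x, y) = θ(x)ᵀ W φ(y), where φ(y) is the attribute (or word) embedding of class y. A hedged sketch of the scoring step, with the learned matrix W and the ranking loss used for training omitted:

```python
import numpy as np

def compatibility(image_feat, W, class_attributes):
    """image_feat: (d,), W: (d, a) learned matrix, class_attributes: (num_classes, a)."""
    return class_attributes @ (W.T @ image_feat)   # one compatibility score per class

# At test time, unseen classes are scored with the same W via their attribute vectors.
```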

Multi-Label Zero-Shot Learning with Structured Knowledge Graphs

Phoenix1327/ML-ZSL CVPR 2018

In this paper, we propose a novel deep learning architecture for multi-label zero-shot learning (ML-ZSL), which is able to predict multiple unseen class labels for each input instance.
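A simplified sketch of the label-graph propagation behind this approach: each label is a node whose state combines image evidence with its semantic embedding, and states are propagated over knowledge-graph edges before per-label prediction. The paper uses gated (GRU-style) updates; a plain message-passing step is shown here for brevity, with the adjacency and weight matrices as placeholders.

```python
import torch

def propagate(node_states, adjacency, weight, steps=3):
    """node_states: (L, h) per-label states; adjacency: (L, L) label-graph structure."""
    h = node_states
    for _ in range(steps):
        messages = adjacency @ h               # aggregate neighbours on the label graph
        h = torch.tanh(messages @ weight + h)  # simplified (non-gated) state update
    return h                                   # final states feed a per-label classifier
```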

Zero-shot Learning for Audio-based Music Classification and Tagging

kunimi00/ZSL_music_tagging 5 Jul 2019

Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels.

A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning

hbdat/cvpr20_LESA CVPR 2020

Therefore, instead of generating attention maps for unseen labels, which have no training samples and could therefore focus on irrelevant regions, we let the unseen labels select among a set of shared attentions that are trained to be label-agnostic and to focus only on relevant foreground regions through our novel loss.
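A hedged sketch of the shared-attention idea: a small set of label-agnostic attention modules each pool region features into an image-level feature, and every label (seen or unseen) simply takes its best-matching pooled feature. The module count, projection, and training loss below are placeholders.

```python
import torch
import torch.nn.functional as F

def shared_attention_scores(regions, attn_weights, proj, label_embs):
    """regions: (R, d) region features; attn_weights: (M, d) one vector per shared attention;
    proj: (d, e) maps visual features to the label-embedding space; label_embs: (L, e)."""
    attn = F.softmax(regions @ attn_weights.T, dim=0)   # (R, M) attention per shared module
    pooled = attn.T @ regions                           # (M, d) one pooled feature per module
    scores = label_embs @ (pooled @ proj).T             # (L, M) label-vs-module compatibility
    return scores.max(dim=1).values                     # each label selects its best module
```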

Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations

hbdat/iccv21_relational_direction ICCV 2021

We study multi-label zero-shot recognition in which labels are human-object interactions (combinations of actions on objects); each image may contain multiple interactions, and some interactions have no training images.
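Because each interaction label is an (action, object) pair, unseen interactions can be scored compositionally from the embeddings of their parts. A hedged sketch of that idea follows; the paper additionally reasons about spatial relations between the human and object regions, which is omitted here, and the projection `W` is a placeholder.

```python
import numpy as np

def score_interactions(image_feat, action_vecs, object_vecs, pairs, W):
    """pairs: list of (action_idx, object_idx); W: (d, 2e) learned projection."""
    scores = {}
    for a, o in pairs:
        pair_emb = np.concatenate([action_vecs[a], object_vecs[o]])  # compose the interaction label
        scores[(a, o)] = image_feat @ W @ pair_emb                   # compatibility with the image
    return scores
```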

Generative Multi-Label Zero-Shot Learning

akshitac8/Generative_MLZSL 27 Jan 2021

Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge.

Semantic Diversity Learning for Zero-Shot Multi-label Classification

Alibaba-MIIL/ZS_SDL ICCV 2021

We argue that using a single embedding vector to represent an image, as commonly practiced, is not sufficient to rank both relevant seen and unseen labels accurately.
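A hedged sketch of the multi-embedding idea: the image is represented by K embedding vectors instead of one, and each label is ranked by its best match over that set (the paper also imposes a diversity objective over the K vectors, omitted here).

```python
import numpy as np

def rank_labels(image_embeddings, label_embeddings):
    """image_embeddings: (K, e); label_embeddings: (L, e); both assumed L2-normalized."""
    sims = label_embeddings @ image_embeddings.T   # (L, K) label-vs-embedding similarities
    scores = sims.max(axis=1)                      # each label keeps its best match
    return np.argsort(scores)[::-1]                # label indices, highest score first
```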

Contrastive Language-Image Pre-training for the Italian Language

clip-italian/clip-italian 19 Aug 2021

CLIP (Contrastive Language-Image Pre-training) is a multi-modal model that jointly learns representations of images and texts.
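A hedged sketch of zero-shot multi-label tagging with a CLIP-style model through Hugging Face transformers; the checkpoint name, prompts, image path, and threshold are placeholders (the clip-italian repository provides its own Italian checkpoint and would use Italian prompts).

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")        # placeholder checkpoint
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["dog", "beach", "sunset"]                                       # candidate tags (placeholders)
image = Image.open("photo.jpg")                                           # placeholder image path
inputs = processor(text=[f"a photo of a {l}" for l in labels],
                   images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    img = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])

# Cosine similarity between the image and each label prompt; the multi-label
# reading keeps every label above a threshold rather than a softmax arg-max.
img = img / img.norm(dim=-1, keepdim=True)
txt = txt / txt.norm(dim=-1, keepdim=True)
sims = (txt @ img.T).squeeze(-1)
tags = [l for l, s in zip(labels, sims) if s > 0.25]                      # threshold is a placeholder
```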

Discriminative Region-based Multi-Label Zero-Shot Learning

akshitac8/biam ICCV 2021

We note that the best existing multi-label ZSL method attends to region features with a single set of attention maps shared across all classes.
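To contrast with that shared-attention approach, here is a hedged sketch of label-specific region attention, where every label computes its own attention over region features from its semantic embedding, so discriminative regions can differ per class. The projection is a placeholder, and the paper's bi-level attention module is more involved than this.

```python
import torch
import torch.nn.functional as F

def label_specific_scores(regions, label_embs, proj):
    """regions: (R, d) region features; label_embs: (L, e); proj: (e, d) maps labels to visual space."""
    queries = label_embs @ proj                      # (L, d) one query per label
    attn = F.softmax(queries @ regions.T, dim=1)     # (L, R) label-specific region attention
    attended = attn @ regions                        # (L, d) per-label attended feature
    return (attended * queries).sum(dim=1)           # (L,) per-label compatibility score
```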