TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Few-Shot Image Classification	Bongard-HOI	Human (Amateur)	Avg. Accuracy	91.42	# 1
Few-Shot Image Classification	Bongard-HOI	Meta-Baseline (Scratch_R50)	Avg. Accuracy	54.23	# 6
Few-Shot Image Classification	Bongard-HOI	ANIL (ImageNet_R50)	Avg. Accuracy	49.74	# 7
Few-Shot Image Classification	Bongard-HOI	Meta-Baseline (MoCov2_R50)	Avg. Accuracy	54.30	# 5
Few-Shot Image Classification	Bongard-HOI	Meta-Baseline (ImageNet_R50)	Avg. Accuracy	55.82	# 4

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

CVPR 2022 · Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition, especially when it comes to few-shot learning and compositional reasoning about novel concepts. We introduce Bongard-HOI, a new visual reasoning benchmark that focuses on compositional learning of human-object interactions (HOIs) from natural images. It is inspired by two desirable characteristics of the classical Bongard problems (BPs): 1) few-shot concept learning, and 2) context-dependent reasoning. We carefully curate the few-shot instances with hard negatives, where positive and negative images disagree only on action labels, making mere recognition of object categories insufficient to complete our benchmark. We also design multiple test sets to systematically study the generalization of visual learning models, where we vary the overlap of the HOI concepts between the training and test sets of few-shot instances, from partial overlap to none. Bongard-HOI presents a substantial challenge to today's visual recognition models. The state-of-the-art HOI detection model achieves only 62% accuracy on few-shot binary prediction, while even amateur human testers on MTurk achieve 91% accuracy. With the Bongard-HOI benchmark, we hope to further advance research efforts in visual reasoning, especially in holistic perception-reasoning systems and better representation learning.
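The few-shot binary prediction setting described in the abstract can be illustrated with a small sketch. This is not the authors' code or any of the benchmarked models: the `Episode` structure, the toy two-dimensional feature vectors, and the nearest-prototype rule are all assumptions chosen for illustration. Each hypothetical episode holds a few positive support examples, a few hard negatives that share the object but differ in action, and one query to be labeled positive or negative.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = Sequence[float]


@dataclass
class Episode:
    """One hypothetical few-shot instance (structure assumed, not from the paper)."""
    positives: List[Vector]   # support features that depict the target concept
    negatives: List[Vector]   # hard negatives: same object, different action
    query: Vector             # feature of the image to classify
    query_is_positive: bool   # ground-truth label of the query


def mean(vectors: List[Vector]) -> List[float]:
    """Element-wise mean of equally sized vectors (the class prototype)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(episode: Episode) -> bool:
    """Label the query positive if it lies closer to the positive prototype."""
    pos_proto = mean(episode.positives)
    neg_proto = mean(episode.negatives)
    return sq_dist(episode.query, pos_proto) < sq_dist(episode.query, neg_proto)


def accuracy(episodes: List[Episode]) -> float:
    """Percentage of episodes whose query is labeled correctly."""
    correct = sum(predict(e) == e.query_is_positive for e in episodes)
    return 100.0 * correct / len(episodes)
```

In this toy setup, suppose the first feature dimension encodes the object and the second encodes the action. A hard negative then matches the positives on the first dimension and differs only on the second, so a model that attends only to the object cannot separate the two support sets. If positive and negative queries are balanced, chance accuracy on this binary task is 50%.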

Code

nvlabs/bongard-hoi official

Tasks

Benchmarking

Few-Shot Image Classification

Few-Shot Learning

Human-Object Interaction Detection

Novel Concepts

Representation Learning

Visual Reasoning

Datasets

Introduced in the Paper:

Bongard-HOI

Used in the Paper:

ImageNet

MS COCO

mini-Imagenet

V-COCO

Meta-Dataset

AbstractReasoning

HICO

Results from the Paper

Ranked #1 on Few-Shot Image Classification on Bongard-HOI

