Human-Object Interaction Detection
132 papers with code • 6 benchmarks • 22 datasets
Human-Object Interaction (HOI) detection is a task of identifying "a set of interactions" in an image, which involves the i) localization of the subject (i.e., humans) and target (i.e., objects) of interaction, and ii) the classification of the interaction labels.
Benchmarks
These leaderboards are used to track progress in Human-Object Interaction Detection
Libraries
Use these libraries to find Human-Object Interaction Detection models and implementationsLatest papers
DECO: Dense Estimation of 3D Human-Scene Contact In The Wild
In contrast, we focus on inferring dense, 3D contact between the full body surface and objects in arbitrary images.
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
In this paper, we investigate the task of zero-shot human-object interaction (HOI) detection, a novel paradigm for identifying HOIs without the need for task-specific annotations.
Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory
This observation motivates us to design an HOI detector that can be trained even with long-tailed labeled data and can leverage existing knowledge from pre-trained models.
InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs).
RLIPv2: Fast Scaling of Relational Language-Image Pre-training
In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.
Diagnosing Human-object Interaction Detectors
To address this issue, in this paper, we introduce a diagnosis toolbox to provide detailed quantitative break-down analysis of HOI detection models, inspired by the success of object detection diagnosis toolboxes.
Exploring Predicate Visual Context in Detecting Human-Object Interactions
Recently, the DETR framework has emerged as the dominant approach for human--object interaction (HOI) research.
Focusing on what to decode and what to train: Efficient Training with HOI Split Decoders and Specific Target Guided DeNoising
However, the current methods redirect the detection target of the object decoder, and the box target is not explicitly separated from the query embeddings, which leads to long and hard training.
First Place Solution to the CVPR'2023 AQTC Challenge: A Function-Interaction Centric Approach with Spatiotemporal Visual-Language Alignment
Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos to furnish users with comprehensive and systematic instructions.
Exploiting Multimodal Synthetic Data for Egocentric Human-Object Interaction Detection in an Industrial Scenario
In this paper, we tackle the problem of Egocentric Human-Object Interaction (EHOI) detection in an industrial setting.