Human-Object Interaction Detection
132 papers with code • 6 benchmarks • 22 datasets
Human-Object Interaction (HOI) detection is the task of identifying "a set of interactions" in an image. It involves (i) localizing the subject (i.e., a human) and the target (i.e., an object) of each interaction, and (ii) classifying the interaction label.
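As a rough illustration of the definition above, one HOI prediction can be represented as a human box, an object box, and an interaction label. The sketch below is a minimal example, not a reference implementation: the `HOIDetection` class, the `(x1, y1, x2, y2)` box convention, the `object_label` and `score` fields, and the IoU threshold of 0.5 used for matching are all assumptions made for illustration and are not stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box convention: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIDetection:
    """One predicted or annotated interaction (hypothetical structure)."""
    human_box: Box      # localization of the subject
    object_box: Box     # localization of the target
    object_label: str   # category of the target object (assumed field)
    interaction: str    # interaction label, e.g. "ride"
    score: float = 1.0  # confidence; 1.0 for ground-truth annotations


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches(pred: HOIDetection, gt: HOIDetection, iou_threshold: float = 0.5) -> bool:
    """A prediction matches an annotation when the labels agree and both
    boxes overlap enough. The threshold value is an assumption."""
    return (
        pred.interaction == gt.interaction
        and pred.object_label == gt.object_label
        and iou(pred.human_box, gt.human_box) >= iou_threshold
        and iou(pred.object_box, gt.object_box) >= iou_threshold
    )
```

The point of the sketch is that both parts of the task must succeed together: a correct interaction label with a poorly localized human or object box would not count as a match under this criterion.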
Benchmarks
These leaderboards are used to track progress in Human-Object Interaction Detection.
Libraries
Use these libraries to find Human-Object Interaction Detection models and implementations.
Most implemented papers
Temporal Relational Reasoning in Videos
Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species.
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
Our core idea is that the appearance of a person or an object instance contains informative cues on which relevant parts of an image to attend to for facilitating interaction prediction.
HAKE: Human Activity Knowledge Engine
To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on human body part states.
Visual Compositional Learning for Human-Object Interaction Detection
The integration of decomposition and composition enables VCL to share object and verb features among different HOI samples and images, and to generate new interaction samples and new types of HOI, and thus largely alleviates the long-tail distribution problem and benefits low-shot or zero-shot HOI detection.
No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques
We show that for human-object interaction detection a relatively simple factorized model with appearance and layout encodings constructed from pre-trained object detectors outperforms more sophisticated approaches.
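One way to read "factorized" in the excerpt above is that the score of a human-object-interaction triple is a product of the detector confidences and an interaction term, with the geometry of the box pair supplied as a layout feature. The sketch below illustrates only that general idea. The function names, the four layout features, and the product form are assumptions chosen for illustration, not the paper's exact model.

```python
import math
from typing import List, Tuple

# Assumed box convention: (x1, y1, x2, y2) with positive width and height.
Box = Tuple[float, float, float, float]


def layout_encoding(human_box: Box, object_box: Box) -> List[float]:
    """Hypothetical layout feature: object position and size relative to the human box."""
    hx1, hy1, hx2, hy2 = human_box
    ox1, oy1, ox2, oy2 = object_box
    hw, hh = hx2 - hx1, hy2 - hy1
    ow, oh = ox2 - ox1, oy2 - oy1
    # Box centers
    hcx, hcy = hx1 + hw / 2.0, hy1 + hh / 2.0
    ocx, ocy = ox1 + ow / 2.0, oy1 + oh / 2.0
    return [
        (ocx - hcx) / hw,   # horizontal offset in units of human width
        (ocy - hcy) / hh,   # vertical offset in units of human height
        math.log(ow / hw),  # log width ratio
        math.log(oh / hh),  # log height ratio
    ]


def factorized_score(human_det_score: float,
                     object_det_score: float,
                     interaction_prob: float) -> float:
    """Assumed factorization: detector confidences multiplied by an
    interaction probability predicted from appearance and layout features."""
    return human_det_score * object_det_score * interaction_prob
```

For example, `layout_encoding((0, 0, 10, 20), (10, 0, 20, 20))` describes an object of the same size sitting one human-width to the right of the person. In a full model, a learned classifier would consume such features together with appearance features to produce `interaction_prob`.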
Transferable Interactiveness Knowledge for Human-Object Interaction Detection
Because interactiveness generalizes, the interactiveness network acts as a transferable knowledge learner and can be combined with any HOI detection model to achieve desirable results.
Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection
Few works have studied the disambiguating contribution of subsidiary relations made available via graph networks.
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
Our quantitative and qualitative results show that (a) we can predict meaningful forces from videos whose effects lead to accurate imitation of the motions observed, (b) by jointly optimizing for contact point and force prediction, we can improve the performance on both tasks in comparison to independent training, and (c) we can learn a representation from this model that generalizes to novel objects using few-shot examples.
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications like health care and behavior analysis.
RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection
The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.