Visual Relationship Detection
36 papers with code • 5 benchmarks • 5 datasets
Visual relationship detection (VRD) is one newly developed computer vision task aiming to recognize relations or interactions between objects in an image. It is a further learning task after object recognition and is essential for fully understanding images, even the visual world.
Most implemented papers
Graphical Contrastive Losses for Scene Graph Parsing
The first, Entity Instance Confusion, occurs when the model confuses multiple instances of the same type of entity (e. g. multiple cups).
Exploring Long Tail Visual Relationship Recognition with Large Vocabulary
We use these benchmarks to study the performance of several state-of-the-art long-tail models on the LTVRR setup.
Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
This requires the detection of visual relationships: triples (subject, relation, object) describing a semantic relation between a subject and an object.
One Metric to Measure them All: Localisation Recall Precision (LRP) for Evaluating Visual Detection Tasks
Despite being widely used as a performance measure for visual detection tasks, Average Precision (AP) is limited in (i) reflecting localisation quality, (ii) interpretability and (iii) robustness to the design choices regarding its computation, and its applicability to outputs without confidence scores.
Representing Prior Knowledge Using Randomly, Weighted Feature Networks for Visual Relationship Detection
Furthermore, background knowledge represented by RWFNs can be used to alleviate the incompleteness of training sets even though the space complexity of RWFNs is much smaller than LTNs (1:27 ratio).
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues.
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
To capture such global interdependency, we propose a deep Variation-structured Reinforcement Learning (VRL) framework to sequentially discover object relationships and attributes in the whole image.
Towards Context-Aware Interaction Recognition for Visual Relationship Detection
The proposed method still builds one classifier for one interaction (as per type (ii) above), but the classifier built is adaptive to context via weights which are context dependent.
Visual relationship detection with deep structural ranking
In this paper, we propose a novel framework, called Deep Structural Ranking, for visual relationship detection.
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Generating scene graph to describe all the relations inside an image gains increasing interests these years.