Grounded Situation Recognition
11 papers with code • 1 benchmark • 1 dataset
Grounded Situation Recognition aims to produce a structured image summary that describes the primary activity (verb), its relevant entities (nouns), and their bounding-box groundings.
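As a concrete illustration of the task definition above, a GSR prediction can be viewed as a verb plus a role-to-entity frame, where each entity carries an optional bounding box. The structure and values below are a minimal sketch, not a specific dataset schema; the role names and coordinates are hypothetical.

```python
# Hypothetical GSR output for an image of a man shearing a sheep.
# Role names and coordinates are illustrative, not a dataset schema.
prediction = {
    "verb": "shearing",
    "frames": {
        "agent": {"noun": "man",    "bbox": [34, 20, 210, 380]},   # [x1, y1, x2, y2] in pixels
        "tool":  {"noun": "shears", "bbox": [120, 150, 180, 220]},
        "item":  {"noun": "sheep",  "bbox": [90, 200, 400, 420]},
        "place": {"noun": "field",  "bbox": None},  # some roles may be ungrounded
    },
}

def grounded_roles(pred):
    """Return the roles whose entity has a bounding-box grounding."""
    return [role for role, slot in pred["frames"].items() if slot["bbox"] is not None]

print(grounded_roles(prediction))  # the "place" role has no box
```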
Most implemented papers
Collaborative Transformers for Grounded Situation Recognition
To implement this idea, we propose Collaborative Glance-Gaze TransFormer (CoFormer) that consists of two modules: Glance transformer for activity classification and Gaze transformer for entity estimation.
Commonly Uncommon: Semantic Sparsity in Situation Recognition
Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set.
Situation Recognition: Visual Semantic Role Labeling for Image Understanding
This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears, sheep, wool, and field) and most importantly (3) the roles these participants play in the activity (e.g., the man is clipping, the shears are his tool, the wool is being clipped from the sheep, and the clipping is in a field).
Situation Recognition with Graph Neural Networks
We address the problem of recognizing situations in images.
Grounded Situation Recognition
We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with their roles (e.g. agent, tool), and bounding-box groundings of entities.
Attention-Based Context Aware Reasoning for Situation Recognition
However, existing query-based reasoning methods have not considered handling inter-dependent queries, which is a unique requirement of semantic role prediction in situation recognition (SR).
Grounded Situation Recognition with Transformers
Grounded Situation Recognition (GSR) is the task that not only classifies a salient action (verb), but also predicts entities (nouns) associated with semantic roles and their locations in the given image.
Rethinking the Two-Stage Framework for Grounded Situation Recognition
Since each verb is associated with a specific set of semantic roles, all existing GSR methods resort to a two-stage framework: predicting the verb in the first stage and detecting the semantic roles in the second stage.
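The two-stage framework described above can be sketched as follows: stage one classifies the verb, which determines the set of semantic roles to fill, and stage two detects a noun and box for each role. The stubbed classifiers, the `VERB_ROLES` lexicon, and all values are hypothetical placeholders, not any paper's actual model.

```python
# Minimal sketch of the two-stage GSR pipeline; classifiers are stubs,
# and VERB_ROLES is an illustrative one-entry frame lexicon.
VERB_ROLES = {"shearing": ["agent", "tool", "item", "place"]}

def predict_verb(image):
    # Stage 1 stub: activity (verb) classification.
    return "shearing"

def detect_role(image, verb, role):
    # Stage 2 stub: noun + box conditioned on the predicted verb's role.
    return {"noun": f"<{role}>", "bbox": [0, 0, 1, 1]}

def grounded_situation(image):
    verb = predict_verb(image)              # stage 1
    roles = VERB_ROLES[verb]                # the verb fixes the role set
    frames = {r: detect_role(image, verb, r) for r in roles}  # stage 2
    return {"verb": verb, "frames": frames}

result = grounded_situation(None)
```

A consequence of this design, noted by the paper above, is that a stage-one verb error propagates: the wrong role set is handed to stage two.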
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
In the second stage, we exploit transformer layers to unearth the potential semantic relations within both verbs and semantic roles.
ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition
Situation Recognition is the task of generating a structured summary of what is happening in an image using an activity verb and the semantic roles played by actors and objects.