Video-Adverb Retrieval

4 papers with code • 5 benchmarks • 5 datasets

The bidirectional video-adverb retrieval task aims to retrieve the adverbs that match an action shown in a video and, conversely, the videos that match a given adverb.
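As a rough illustration of what "bidirectional" means here, the sketch below ranks adverbs for a video and videos for an adverb. It assumes that both have already been mapped into a shared embedding space and that cosine similarity is the ranking score. The 3-d vectors, the clip names, and the helper functions `cosine` and `rank` are made up for illustration and do not come from any of the listed datasets or papers.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank(query, candidates):
    """Return candidate keys sorted from most to least similar to the query."""
    return sorted(candidates, key=lambda k: cosine(query, candidates[k]), reverse=True)

# Toy, hand-made embeddings (assumption: a real system would obtain these
# from learned video and text encoders).
adverb_emb = {
    "quickly": [1.0, 0.1, 0.0],
    "slowly": [-1.0, 0.1, 0.0],
    "gently": [0.0, 1.0, 0.2],
}
video_emb = {
    "clip_a": [0.9, 0.2, 0.1],
    "clip_b": [-0.8, 0.0, 0.1],
    "clip_c": [0.1, 0.9, 0.3],
}

# Video-to-adverb: which adverbs best describe the action in clip_a?
video_to_adverb = rank(video_emb["clip_a"], adverb_emb)

# Adverb-to-video: which clips best match the adverb "gently"?
adverb_to_video = rank(adverb_emb["gently"], video_emb)

print(video_to_adverb)
print(adverb_to_video)
```

Both directions reuse the same similarity function; only the roles of query and candidate set are swapped.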

Benchmarks

These leaderboards are used to track progress in Video-Adverb Retrieval.

Dataset	Best Model
HowTo100M Adverbs	ReGaDa
AIR	ReGaDa
ActivityNet Adverbs	ReGaDa
MSR-VTT Adverbs	ReGaDa
VATEX Adverbs	ReGaDa

Subtasks

Video-Adverb Retrieval (Unseen Compositions)

Most implemented papers

Action Modifiers: Learning from Adverbs in Instructional Videos

hazeld/action-modifiers • CVPR 2020

We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations.

How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs

hazeld/pseudoadverbs • CVPR 2022

We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'.

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

dmoltisanti/air-cvpr23 • CVPR 2023

The goal of this work is to understand the way actions are performed in videos.

Video-adverb retrieval with compositional adverb-action embeddings

ExplainableML/ReGaDa • 26 Sep 2023

We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
