Video-Adverb Retrieval
4 papers with code • 5 benchmarks • 5 datasets
The bidirectional video-adverb retrieval task aims to retrieve the adverbs that match an action shown in a video and, conversely, the videos that match a given adverb.
Most implemented papers
Action Modifiers: Learning from Adverbs in Instructional Videos
We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations.
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'.
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
The goal of this work is to understand the way actions are performed in videos.
Video-adverb retrieval with compositional adverb-action embeddings
We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
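As a rough illustration of retrieval in a joint embedding space, the sketch below ranks candidates by cosine similarity in both directions (video to adverb, and adverb to video). It is not the implementation from any of the papers listed here: the 3-dimensional vectors are hand-picked toy values rather than learned embeddings, and the names `cosine`, `rank_by_similarity`, `adverb_action_text`, and `videos` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_by_similarity(query, candidates):
    """Return candidate keys sorted from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )


# Toy embeddings (assumption: hand-picked for illustration, not learned).
# Each text entry stands in for a compositional adverb-action embedding.
adverb_action_text = {
    "fold firmly": [0.9, 0.1, 0.0],
    "fold gently": [0.1, 0.9, 0.0],
    "fold quickly": [0.0, 0.2, 0.9],
}
videos = {
    "clip_a": [0.8, 0.2, 0.1],
    "clip_b": [0.2, 0.7, 0.1],
}

# Video-to-adverb: rank all adverb-action texts for one video.
video_to_adverb = rank_by_similarity(videos["clip_a"], adverb_action_text)

# Adverb-to-video: rank all videos for one adverb-action text.
adverb_to_video = rank_by_similarity(adverb_action_text["fold gently"], videos)

print(video_to_adverb)
print(adverb_to_video)
```

In a real system the two dictionaries would be filled by a video encoder and a text encoder trained so that matching pairs land close together. The ranking step itself would stay the same.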