Action Segmentation
72 papers with code • 9 benchmarks • 16 datasets
Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, Action Segmentation aims to temporally segment an untrimmed video and label each segment with one of a set of pre-defined action classes. The results of Action Segmentation can be further used as input to various applications, such as video-to-text and action localization.
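To make the task concrete, here is a minimal sketch (not from the source) of what action segmentation outputs look like: a model predicts one action label per frame, and collapsing runs of identical labels yields the (start, end, action) segments that downstream applications such as action localization consume. The label names are illustrative.

```python
# Minimal sketch: collapse a frame-wise label sequence into labeled segments.
from itertools import groupby

def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (start_frame, end_frame, label) segments."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((start, start + length - 1, label))
        start += length
    return segments

# Example: a 10-frame video labelled frame by frame.
frame_labels = ["pour", "pour", "pour", "stir", "stir", "stir", "stir",
                "background", "background", "background"]
print(frames_to_segments(frame_labels))
# [(0, 2, 'pour'), (3, 6, 'stir'), (7, 9, 'background')]
```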
Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
Libraries
Use these libraries to find Action Segmentation models and implementations
Datasets
Subtasks
Most implemented papers
Temporal Convolutional Networks for Action Segmentation and Detection
The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond.
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Annotating videos is cumbersome, expensive and not scalable.
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
Action quality assessment (AQA) has become an emerging topic since it can be extensively applied in numerous scenarios.
MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation
Temporally locating and classifying action segments in long untrimmed videos is of particular interest to many applications like surveillance and robotics.
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
However, most of the existing multimodal models are pre-trained for understanding tasks, leading to a pretrain-finetune discrepancy for generation tasks.
Alleviating Over-segmentation Errors by Detecting Action Boundaries
Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB).
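The sketch below illustrates the general idea of boundary-aware refinement: detected boundaries split the timeline into candidate segments, and each segment takes the class with the highest mean frame-wise score, which suppresses isolated mis-classified frames. The function name and the mean-score rule are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative refinement: smooth frame-wise predictions between detected boundaries.
import numpy as np

def refine_with_boundaries(frame_probs, boundary_frames):
    """frame_probs: (T, num_classes) scores from a segmentation branch.
    boundary_frames: sorted frame indices where a boundary branch fires."""
    T = frame_probs.shape[0]
    cuts = [0] + [b for b in boundary_frames if 0 < b < T] + [T]
    refined = np.empty(T, dtype=int)
    for start, end in zip(cuts[:-1], cuts[1:]):
        # Assign the whole candidate segment its highest-scoring class on average.
        refined[start:end] = frame_probs[start:end].mean(axis=0).argmax()
    return refined

# Toy example: 6 frames, 2 classes, one detected boundary at frame 3.
probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.8, 0.2],   # noisy first segment
                  [0.2, 0.8], [0.3, 0.7], [0.1, 0.9]])
print(refine_with_boundaries(probs, [3]))  # -> [0 0 0 1 1 1]
```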
Global2Local: Efficient Structure Search for Video Action Segmentation
Our search scheme exploits both global search, to find coarse receptive field combinations, and local search, to further refine the combination patterns.
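The toy sketch below conveys the coarse-to-fine idea only: a global step samples coarse per-layer dilation patterns and a local step perturbs the best one. The `evaluate` proxy and the candidate space are placeholders, not the paper's actual genetic and expectation-guided search.

```python
# Illustrative coarse-to-fine search over per-layer dilation (receptive field) patterns.
import random

def evaluate(pattern):
    # Placeholder proxy score; in practice this would be the validation accuracy
    # of a temporal convolutional network built with these per-layer dilations.
    target = [1, 2, 4, 8]
    return -sum(abs(p - t) for p, t in zip(pattern, target))

def global_search(num_layers=4, candidates=(1, 2, 4, 8, 16), samples=50, seed=0):
    # Coarse stage: sample whole patterns and keep the best-scoring one.
    rng = random.Random(seed)
    pool = [[rng.choice(candidates) for _ in range(num_layers)] for _ in range(samples)]
    return max(pool, key=evaluate)

def local_search(pattern, steps=20, seed=0):
    # Fine stage: perturb one layer's dilation at a time, keep improvements.
    rng = random.Random(seed)
    best = list(pattern)
    for _ in range(steps):
        cand = list(best)
        i = rng.randrange(len(cand))
        cand[i] = max(1, cand[i] + rng.choice([-1, 1]) * max(1, cand[i] // 2))
        if evaluate(cand) > evaluate(best):
            best = cand
    return best

coarse = global_search()
print("coarse pattern:", coarse, "refined pattern:", local_search(coarse))
```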
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks.
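As a rough illustration of the contrastive objective behind this style of pre-training (a generic InfoNCE-style loss, not VideoCLIP's specific clip sampling or hard-negative retrieval), paired video and text embeddings in a batch act as positives and all other pairings as negatives:

```python
# Generic contrastive video-text loss sketch (assumed formulation, not the paper's exact one).
import torch
import torch.nn.functional as F

def contrastive_loss(video_emb, text_emb, temperature=0.07):
    """video_emb, text_emb: (batch, dim) L2-normalised embeddings of paired clips/captions."""
    logits = video_emb @ text_emb.t() / temperature       # (batch, batch) similarities
    targets = torch.arange(video_emb.size(0))             # matching pairs lie on the diagonal
    # Symmetric cross-entropy: video-to-text and text-to-video.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

video = F.normalize(torch.randn(8, 512), dim=-1)
text = F.normalize(torch.randn(8, 512), dim=-1)
print(contrastive_loss(video, text).item())
```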
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Our search scheme exploits both global search, to find coarse receptive field combinations, and local search, to further refine them.
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
This paper introduces a unified framework for video action segmentation via sequence to sequence (seq2seq) translation in a fully and timestamp supervised setup.
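The seq2seq view can be pictured as translating a video into a short target sequence of (action, relative duration) tokens instead of a dense per-frame labelling. The helpers below are hypothetical and only illustrate that representation, not the paper's model.

```python
# Illustrative encoding between dense frame labels and a seq2seq-style target sequence.
def to_action_sequence(frame_labels):
    """Dense frame labels -> [(action, fraction_of_video), ...]."""
    seq, prev, count = [], None, 0
    for lab in frame_labels + [None]:           # sentinel flushes the last run
        if lab != prev and prev is not None:
            seq.append((prev, count / len(frame_labels)))
            count = 0
        prev, count = lab, count + 1
    return seq

def to_frame_labels(action_seq, num_frames):
    """Expand (action, fraction) tokens back to a dense labelling of num_frames."""
    frames = []
    for action, frac in action_seq:
        frames.extend([action] * round(frac * num_frames))
    return frames[:num_frames]

labels = ["crack_egg"] * 30 + ["stir"] * 50 + ["pour"] * 20
seq = to_action_sequence(labels)
print(seq)                                   # [('crack_egg', 0.3), ('stir', 0.5), ('pour', 0.2)]
print(to_frame_labels(seq, 100) == labels)   # True
```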