Supervised Video Summarization

7 papers with code • 2 benchmarks • 3 datasets

Supervised video summarization methods rely on datasets with human-labeled ground-truth annotations, either in the form of video summaries (as in the SumMe dataset) or in the form of frame-level importance scores (as in the TVSum dataset). From these annotations they try to learn the underlying criterion for selecting video frames or fragments and assembling the summary.

Source: Video Summarization Using Deep Neural Networks: A Survey
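
The two annotation styles mentioned above can be related with a small sketch. It is illustrative only and not taken from any listed paper: the function names `select_top_frames` and `f_score`, the `budget` parameter, and the toy numbers are all assumptions. It converts frame-level importance scores (TVSum-style) into a binary keep/drop summary and compares that summary with a binary human summary (SumMe-style) using an overlap-based F-score, which is one common way to evaluate such systems. Real pipelines usually select fragments rather than single frames.

```python
from typing import List


def select_top_frames(scores: List[float], budget: float = 0.15) -> List[int]:
    """Turn frame-level importance scores into a binary keep/drop mask.

    `budget` is the fraction of frames to keep (an assumed, illustrative
    parameter); at least one frame is always kept.
    """
    n_keep = max(1, int(len(scores) * budget))
    # Rank frames by descending score; ties go to the earlier frame.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


def f_score(pred: List[int], truth: List[int]) -> float:
    """Harmonic mean of precision and recall between two binary summaries."""
    overlap = sum(p * t for p, t in zip(pred, truth))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred)
    recall = overlap / sum(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy frame-level importance scores, as a model might predict them.
    predicted_scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3]
    # Toy human-made binary summary over the same six frames.
    human_summary = [0, 1, 1, 1, 0, 0]

    machine_summary = select_top_frames(predicted_scores, budget=0.5)
    print(machine_summary)
    print(round(f_score(machine_summary, human_summary), 3))
```

With the toy data, the machine summary keeps frames 1, 2 and 4, two of which also appear in the human summary, so precision and recall are both 2/3.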

Benchmarks

These leaderboards are used to track progress in Supervised Video Summarization.

Dataset	Best Model
SumMe	PGL-SUM (maximum learning capacity)
TVSum	MAVS [DBLP:conf/mm/FengLKZ18]

Most implemented papers

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

KaiyangZhou/vsumm-reinforce • 29 Dec 2017

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos.

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

yueatsprograms/ttt_cifar_release • 29 Sep 2019

In this paper, we propose Test-Time Training, a general approach for improving the performance of predictive models when training and test data come from different distributions.

Supervised Video Summarization via Multiple Feature Sets with Parallel Attention

TIBHannover/MSVA • 23 Apr 2021

The proposed architecture utilizes an attention mechanism before fusing motion features and features representing the (static) visual content, i.e., derived from an image classification model.

Discriminative Feature Learning for Unsupervised Video Summarization

wildoctopus/SADNet • 24 Nov 2018

The proposed variance loss allows a network to predict output scores for each frame with high discrepancy, which enables effective feature learning and significantly improves model performance.

DSNet: A Flexible Detect-to-Summarize Network for Video Summarization

li-plus/DSNet • 1 Dec 2020

In this paper, we propose a Detect-to-Summarize network (DSNet) framework for supervised video summarization.

Combining Global and Local Attention with Positional Encoding for Video Summarization

e-apostolidis/PGL-SUM • IEEE International Symposium on Multimedia (ISM) 2021

This paper presents a new method for supervised video summarization.

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

boheumd/A2Summ • CVPR 2023

The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
