Video Summarization
68 papers with code • 5 benchmarks • 13 datasets
Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The produced summary is usually composed either of a set of representative video frames (a.k.a. video key-frames) or of video fragments (a.k.a. video key-fragments) stitched together in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.
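The key-frame selection step can be illustrated with a minimal greedy baseline: keep a frame whenever its feature distance to the last kept frame exceeds a threshold. This is an illustrative sketch only (the function name, threshold, and distance metric are assumptions, not a method from any paper listed here).

```python
import numpy as np

def select_keyframes(features, threshold=0.5):
    """Greedy key-frame selection sketch.

    features: (n_frames, d) array of per-frame descriptors
    (e.g. color histograms or CNN embeddings).
    A frame is kept when it differs enough from the last kept frame.
    """
    keep = [0]  # always keep the first frame
    for i in range(1, len(features)):
        if np.linalg.norm(features[i] - features[keep[-1]]) > threshold:
            keep.append(i)
    return keep
```

The kept frames would form a storyboard; concatenating short fragments around each kept frame in chronological order would yield a skim.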
Source: Video Summarization Using Deep Neural Networks: A Survey
Most implemented papers
Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward
Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos.
Summarizing Videos with Attention
In this work we propose a novel method for supervised, keyshots based video summarization by applying a conceptually simple and computationally efficient soft, self-attention mechanism.
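Scoring frames with soft self-attention can be sketched as follows; the projection matrices would normally be learned, and the use of received attention as an importance score is a simplifying assumption, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_scores(frames, Wq, Wk):
    """Soft self-attention over a frame sequence (sketch).

    frames: (n, d) frame features; Wq, Wk: (d, d_k) projections.
    Returns a per-frame importance score: the average attention
    each frame receives from the rest of the sequence.
    """
    Q, K = frames @ Wq, frames @ Wk
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (n, n)
    return attn.mean(axis=0)
```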
Video Summarization using Deep Semantic Features
For this, we design a deep neural network that maps both videos and descriptions to a common semantic space, and jointly train it with associated pairs of videos and descriptions.
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
In our algorithm, at each iteration the sample that captures the maximum information about the structure of the data is selected, and that information is then excluded from subsequent iterations by projecting the data onto the null space of the previously selected samples.
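The projection idea can be sketched in a few lines: repeatedly pick the sample with the largest residual norm, then deflate all samples along that direction. This is a simplified illustration of null-space projection; the paper's actual selection criterion is more involved.

```python
import numpy as np

def iterative_projection_select(X, k):
    """Select k structure-preserving representatives (sketch).

    X: (n_samples, d) data matrix. At each step the sample with the
    largest norm in the residual space is picked, then every sample is
    projected onto the orthogonal complement (null space) of the pick.
    """
    R = X.astype(float).copy()
    selected = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=1)
        i = int(np.argmax(norms))
        selected.append(i)
        d = R[i] / (norms[i] + 1e-12)      # unit direction of the pick
        R = R - np.outer(R @ d, d)         # remove that direction everywhere
    return selected
```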
Rethinking the Evaluation of Video Summaries
Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.
Unsupervised video summarization framework using keyframe extraction and video skimming
Video is one of the richest sources of information, and the consumption of online and offline videos has reached an unprecedented level in the last few years.
Convolutional Hierarchical Attention Network for Query-Focused Video Summarization
This paper addresses the task of query-focused video summarization, which takes a user's query and a long video as inputs and aims to generate a query-focused video summary.
Supervised Video Summarization via Multiple Feature Sets with Parallel Attention
The proposed architecture utilizes an attention mechanism before fusing motion features and features representing the (static) visual content, i.e., derived from an image classification model.
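Gating two modalities with attention weights before fusion can be sketched as below. The gating vectors are stand-ins for learned parameters, and this softmax gate is a hypothetical simplification, not the paper's parallel-attention design.

```python
import numpy as np

def attentive_fusion(static_feats, motion_feats, w_static, w_motion):
    """Attention-weighted fusion of two modalities (sketch).

    static_feats, motion_feats: (n, d) per-frame features.
    w_static, w_motion: (d,) gating vectors (learned in practice).
    A softmax over the two modality scores gates each frame's features.
    """
    scores = np.stack([static_feats @ w_static,
                       motion_feats @ w_motion], axis=0)   # (2, n)
    gates = np.exp(scores) / np.exp(scores).sum(axis=0)    # softmax over modalities
    return (gates[0, :, None] * static_feats
            + gates[1, :, None] * motion_feats)
```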
GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization
Traditional video summarization methods generate fixed video representations regardless of user interest.
Egocentric Video-Language Pretraining
Video-Language Pretraining (VLP), which aims to learn transferable representation to advance a wide range of video-text downstream tasks, has recently received increasing attention.