TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Video Generation	Sky Time-lapse	StyleSV (256x256)	FVD16	49.0	# 1
Video Generation	Taichi	StyleSV (256x256)	FVD16	82.6	# 1
Video Generation	YouTube Driving	StyleSV	FVD16	207.2	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-smooth-video-composition/video-generation-on-sky-time-lapse)](https://paperswithcode.com/sota/video-generation-on-sky-time-lapse?p=towards-smooth-video-composition)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-smooth-video-composition/video-generation-on-taichi)](https://paperswithcode.com/sota/video-generation-on-taichi?p=towards-smooth-video-composition)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-smooth-video-composition/video-generation-on-youtube-driving)](https://paperswithcode.com/sota/video-generation-on-youtube-driving?p=towards-smooth-video-composition)`

Towards Smooth Video Composition

14 Dec 2022 · Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Video generation requires synthesizing consistent and persistent frames with dynamic content over time. This work investigates modeling the temporal relations for composing videos of arbitrary length, from a few frames to infinitely many, using generative adversarial networks (GANs). First, towards composing adjacent frames, we show that the alias-free operation for single image generation, together with adequately pre-learned knowledge, brings a smooth frame transition without compromising the per-frame quality. Second, by incorporating the temporal shift module (TSM), originally designed for video understanding, into the discriminator, we manage to advance the generator in synthesizing more consistent dynamics. Third, we develop a novel B-Spline based motion representation that ensures temporal smoothness and enables infinite-length video generation, going beyond the number of frames used in training. A low-rank temporal modulation is also proposed to alleviate repeating content in long video generation. We evaluate our approach on various datasets and show substantial improvements over video generation baselines. Code and models will be publicly available at https://genforce.github.io/StyleSV.
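To make the B-Spline idea in the abstract more concrete, the following is a minimal sketch of how a uniform cubic B-spline can turn a sequence of discrete anchors into a smooth, continuous-time signal. It is not the authors' implementation: the names `cubic_bspline_basis`, `motion_code`, and `control_points` are hypothetical, and how StyleSV actually samples and consumes its motion codes is not specified here.

```python
def cubic_bspline_basis(u):
    """Return the four uniform cubic B-spline blending weights for u in [0, 1]."""
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    )


def motion_code(control_points, t):
    """Evaluate a smooth code at continuous time t (illustrative assumption).

    control_points: list of equal-length lists of floats (hypothetical anchors).
    t: float with 0 <= t < len(control_points) - 3.
    """
    n_segments = len(control_points) - 3
    if n_segments < 1:
        raise ValueError("need at least 4 control points")
    if not 0 <= t < n_segments:
        raise ValueError("t is outside the range covered by the control points")

    i = int(t)   # index of the spline segment containing t
    u = t - i    # local position inside that segment
    weights = cubic_bspline_basis(u)
    dim = len(control_points[0])

    # Each output coordinate is a weighted blend of four neighbouring anchors.
    return [
        sum(w * control_points[i + k][d] for k, w in enumerate(weights))
        for d in range(dim)
    ]


if __name__ == "__main__":
    anchors = [[0.0], [1.0], [4.0], [9.0], [16.0]]
    for step in range(8):
        t = step * 0.25
        print(t, motion_code(anchors, t))
```

Because each segment depends only on four neighbouring anchors, appending new anchors extends the valid time range without changing earlier values, which is one plausible way a spline-based representation could go beyond the training length.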

Code

genforce/StyleSV official

Tasks

Image Generation

single-image-generation

Video Generation

Video Understanding

Datasets

YouTube Driving

Results from the Paper

Ranked #1 on Video Generation on YouTube Driving

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Video Generation	Sky Time-lapse	StyleSV (256x256)	FVD16	49.0	# 1
Video Generation	Taichi	StyleSV (256x256)	FVD16	82.6	# 1
Video Generation	YouTube Driving	StyleSV	FVD16	207.2	# 1

Methods

No methods listed for this paper.
