Video Generation
241 papers with code • 15 benchmarks • 14 datasets
(Various video generation tasks. GIF credit: MaGViT)
Most implemented papers
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
FlowGAN generates optical flow, which contains only the edges and motion of the videos to be generated.
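A minimal sketch of the hierarchical split described above: stage one produces a flow field (structure and motion only), and stage two fills in texture conditioned on that flow. The two Linear generators and the latent size are placeholders, not FlowGAN's actual architecture:

```python
import torch
import torch.nn as nn

# Stage 1 (illustrative): latent -> optical flow field (dx, dy per pixel).
flow_gen = nn.Linear(100, 2 * 16 * 16)
# Stage 2 (illustrative): latent + flow -> RGB frame with texture.
texture_gen = nn.Linear(100 + 2 * 16 * 16, 3 * 16 * 16)

z = torch.randn(1, 100)
flow = flow_gen(z)                                  # edges and motion only
frame = texture_gen(torch.cat([z, flow], dim=-1))   # texture added on top
print(flow.shape, frame.shape)                      # (1, 512), (1, 768)
```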
Stochastic Video Generation with a Learned Prior
Sample generations are both varied and sharp, even many frames into the future, and compare favorably to those from existing approaches.
Point-to-Point Video Generation
We introduce point-to-point video generation that controls the generation process with two control points: the targeted start- and end-frames.
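A toy illustration of the two-control-point idea: encodings of the targeted start- and end-frames, plus a progress scalar, condition every intermediate frame, so each in-between frame "knows" both endpoints. All modules and shapes here are illustrative stand-ins, not the paper's model:

```python
import torch
import torch.nn as nn

frame_enc = nn.Linear(3 * 32 * 32, 64)         # encodes a control frame
decoder = nn.Linear(64 + 64 + 1, 3 * 32 * 32)  # start-emb + end-emb + progress

start, end = torch.randn(2, 1, 3 * 32 * 32)    # the two control points
s, e = frame_enc(start), frame_enc(end)
T = 8
video = []
for t in range(T):
    alpha = torch.full((1, 1), t / (T - 1))    # progress toward the end-frame
    video.append(decoder(torch.cat([s, e, alpha], dim=-1)))
frames = torch.stack(video, dim=1)             # (1, T, 3*32*32)
print(frames.shape)
```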
Hierarchical Patch VAE-GAN: Generating Diverse Videos from a Single Sample
We consider the task of generating diverse and novel videos from a single video sample.
VideoGPT: Video Generation using VQ-VAE and Transformers
We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
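A minimal sketch of the two-stage recipe named in the title, assuming PyTorch: a VQ-VAE-style quantizer maps video features to discrete tokens, and a causally masked transformer models the token sequence. The encoder, codebook size, and transformer are illustrative, not VideoGPT's actual configuration:

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Nearest-neighbour lookup from continuous latents into a codebook."""
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                 # z: (B, T, dim)
        flat = z.reshape(-1, z.shape[-1])
        dists = torch.cdist(flat, self.codebook.weight)   # (B*T, num_codes)
        idx = dists.argmin(dim=-1).view(z.shape[:-1])     # discrete tokens
        return self.codebook(idx), idx

# Stage 1: compress a tiny video into discrete latent tokens.
B, T, C, H, W = 1, 4, 3, 16, 16
video = torch.randn(B, T, C, H, W)
encoder = nn.Linear(C * H * W, 64)        # stand-in for a 3D conv encoder
z_q, tokens = VectorQuantizer()(encoder(video.flatten(2)))

# Stage 2: an autoregressive (causally masked) transformer prior over tokens.
prior = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
logits = nn.Linear(64, 512)(prior(z_q, mask=causal))      # next-token logits
print(tokens.shape, logits.shape)         # (1, 4), (1, 4, 512)
```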
Video Diffusion Models
Generating temporally coherent, high-fidelity video is an important milestone in generative modeling research.
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
In light of this, we propose the Disentangled Objective Video Quality Evaluator (DOVER) to learn the quality of UGC videos based on the two perspectives.
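A toy two-branch evaluator in the spirit of that disentanglement: one branch scores aesthetics on heavily pooled frames, the other scores technical quality on local patches, and the two scores are fused. The architectures, patch sampling, and fusion rule below are placeholders, not DOVER's:

```python
import torch
import torch.nn as nn

class TwoBranchVQA(nn.Module):
    """Toy disentangled quality evaluator: aesthetic + technical branches."""
    def __init__(self):
        super().__init__()
        self.aesthetic = nn.Sequential(nn.AdaptiveAvgPool3d((4, 8, 8)),
                                       nn.Flatten(), nn.Linear(3 * 4 * 8 * 8, 1))
        self.technical = nn.Sequential(nn.Flatten(),
                                       nn.Linear(3 * 8 * 32 * 32, 1))

    def forward(self, video):                # video: (B, 3, T, H, W)
        a = self.aesthetic(video)            # global look-and-feel score
        patch = video[..., :32, :32]         # crude stand-in for fragment sampling
        t = self.technical(patch[:, :, :8])  # low-level distortion score
        return 0.5 * a + 0.5 * t             # fused overall quality

score = TwoBranchVQA()(torch.randn(2, 3, 8, 64, 64))
print(score.shape)                           # torch.Size([2, 1])
```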
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T2V) generator.
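In contrast to such large-scale training, the one-shot tuning in the title can be sketched as fitting only a small set of weights to a single input video while the pretrained backbone stays frozen. The backbone (a stand-in Linear here), the loss, and the choice of tuned weights are all illustrative, not the paper's setup:

```python
import torch
import torch.nn as nn

backbone = nn.Linear(64, 3 * 16 * 16)              # pretend pretrained T2I model
tuned_head = nn.Linear(3 * 16 * 16, 3 * 16 * 16)   # the only weights we update
for p in backbone.parameters():
    p.requires_grad_(False)                         # freeze pretrained weights

video = torch.randn(8, 3 * 16 * 16)                 # the one training video (8 frames)
text_emb = torch.randn(8, 64)                       # per-frame text embedding (stand-in)
opt = torch.optim.Adam(tuned_head.parameters(), lr=1e-3)
for step in range(100):                             # one-shot tuning loop
    pred = tuned_head(backbone(text_emb))
    loss = nn.functional.mse_loss(pred, video)
    opt.zero_grad()
    loss.backward()
    opt.step()
```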
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
With the availability of large-scale video datasets and the advances of diffusion models, text-driven video generation has achieved substantial progress.
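The tuning-free trick named in the title can be sketched as noise rescheduling: extend a short clip of initial noise by repeating it with local shuffles, so the longer sequence keeps the base noise's long-range correlation while still varying frame to frame. The window size and shuffling scheme here are illustrative, not FreeNoise's exact schedule:

```python
import torch

def reschedule_noise(base_noise, target_len, window=4, seed=0):
    """Extend base_noise (T, C, H, W) to target_len frames by repeating it
    and shuffling frames inside each local window."""
    g = torch.Generator().manual_seed(seed)
    T = base_noise.shape[0]
    frames = []
    while len(frames) < target_len:
        for start in range(0, T, window):
            chunk = base_noise[start:start + window]
            perm = torch.randperm(chunk.shape[0], generator=g)
            frames.extend(chunk[perm])             # locally shuffled copy
    return torch.stack(frames[:target_len])

base = torch.randn(16, 4, 32, 32)     # 16 initial latent noise frames
long_noise = reschedule_noise(base, target_len=64)
print(long_noise.shape)               # torch.Size([64, 4, 32, 32])
```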
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
The I2V model is designed to produce videos that strictly adhere to the content of the provided reference image, preserving its content, structure, and style.
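A sketch of that image conditioning, assuming PyTorch: an embedding of the reference image is injected into the generation of every frame, so the output stays anchored to the image's content. The encoder and decoder are stand-ins, not VideoCrafter1's diffusion U-Net:

```python
import torch
import torch.nn as nn

img_enc = nn.Linear(3 * 32 * 32, 64)       # embeds the reference image
frame_dec = nn.Linear(64 + 16, 3 * 32 * 32)

ref = torch.randn(1, 3 * 32 * 32)          # the provided reference image
cond = img_enc(ref)                        # shared conditioning for all frames
frames = [frame_dec(torch.cat([cond, torch.randn(1, 16)], dim=-1))
          for _ in range(8)]               # per-frame noise, shared image cond
video = torch.stack(frames, dim=1)         # (1, 8, 3*32*32)
print(video.shape)
```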