Text-to-Video Editing
4 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
The method also demonstrates stronger zero-shot shape-aware editing when built on a text-to-video model.
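As a rough illustration of the attention-fusion idea, the toy sketch below blends attention maps stored while inverting the source video into the editing pass so that source structure can survive the edit; the array shapes and the blend_attention helper are hypothetical stand-ins, not FateZero's actual implementation.

```python
# Minimal sketch (not the official FateZero code): fuse attention maps saved
# during inversion of the source video into the denoising pass for the edit
# prompt. A real pipeline would hook a diffusion U-Net's attention layers
# instead of using these toy arrays.
import numpy as np

def blend_attention(src_attn, edit_attn, keep_mask):
    """Keep source attention where keep_mask is 1, edited attention elsewhere."""
    return keep_mask * src_attn + (1.0 - keep_mask) * edit_attn

# Toy shapes: (frames, heads, query_tokens, key_tokens)
rng = np.random.default_rng(0)
src_attn = rng.random((8, 4, 64, 64))   # stored during inversion of the source video
edit_attn = rng.random((8, 4, 64, 64))  # produced while denoising with the edit prompt

# E.g. preserve source attention at background query tokens only.
keep_mask = np.ones((1, 1, 64, 1))      # 1 = keep the source structure at that token
fused = blend_attention(src_attn, edit_attn, keep_mask)
print(fused.shape)
```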
ControlVideo: Conditional Control for One-shot Text-driven Video Editing and Beyond
This paper presents ControlVideo for text-driven video editing: generating a video that aligns with a given text while preserving the structure of the source video.
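A minimal sketch of the structure-preserving conditioning idea, assuming a ControlNet-style setup in which per-frame structure maps extracted from the source video act as the control signal; the edge_map helper and all shapes are illustrative assumptions, not the paper's code.

```python
# Minimal sketch, assuming structure is preserved by conditioning generation on
# per-frame structure maps from the source video (e.g. edges or depth).
import numpy as np

def edge_map(frame):
    """Crude gradient-magnitude 'structure' condition for one grayscale frame."""
    gy, gx = np.gradient(frame.astype(np.float32))
    return np.hypot(gx, gy)

source_video = np.random.rand(8, 64, 64)               # (frames, H, W) stand-in frames
conditions = np.stack([edge_map(f) for f in source_video])

# A real pipeline would feed `conditions` plus the edit prompt to a conditioned
# video diffusion model; only the per-frame condition extraction is shown here.
print(conditions.shape)  # (8, 64, 64)
```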
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
To address this challenge, we introduce Gen-L-Video, a paradigm that extends off-the-shelf short video diffusion models to generate and edit videos comprising hundreds of frames with diverse semantic segments, without any additional training and while preserving content consistency.
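The sketch below illustrates one plausible reading of temporal co-denoising: overlapping short windows of a long latent video are each denoised by a short-clip model, and the results are averaged where windows overlap. The denoise_clip function is a placeholder and the window sizes are assumptions, not Gen-L-Video's actual settings.

```python
# Minimal sketch of temporal co-denoising over a long latent video.
import numpy as np

def denoise_clip(clip, t):
    """Stand-in for one reverse-diffusion step of a short video diffusion model."""
    return clip * 0.99  # toy update; a real model would predict and remove noise

def co_denoise_step(latents, t, clip_len=16, stride=8):
    """Denoise overlapping windows and average frames covered by several windows."""
    acc = np.zeros_like(latents)
    count = np.zeros((latents.shape[0], 1, 1, 1))
    for start in range(0, latents.shape[0] - clip_len + 1, stride):
        window = slice(start, start + clip_len)
        acc[window] += denoise_clip(latents[window], t)
        count[window] += 1
    return acc / np.maximum(count, 1)

latents = np.random.randn(128, 4, 32, 32)  # latents for a long (many-frame) video
latents = co_denoise_step(latents, t=999)
print(latents.shape)
```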
Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing
To address this issue, we propose a general contextualized diffusion model (ContextDiff) that incorporates cross-modal context, i.e. the interactions and alignments between the text condition and the visual sample, into both the forward and reverse diffusion processes.
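As a loose sketch only, the snippet below shows one way a cross-modal context term could shift the mean of a diffusion forward step; the cross_modal_context function and the update rule are assumptions made for illustration, not ContextDiff's published formulation.

```python
# Minimal sketch: a text-visual "context" term shifts the noising mean.
import numpy as np

def cross_modal_context(text_feat, visual_feat):
    """Toy interaction/alignment term between text and visual features."""
    return 0.1 * np.tanh(text_feat @ visual_feat.T).mean() * np.ones_like(visual_feat)

def forward_step(x, text_feat, alpha=0.98):
    """Noising step whose mean is shifted by the cross-modal context."""
    ctx = cross_modal_context(text_feat, x)
    noise = np.random.randn(*x.shape)
    return np.sqrt(alpha) * (x + ctx) + np.sqrt(1 - alpha) * noise

text_feat = np.random.randn(1, 8)
x0 = np.random.randn(1, 8)
x1 = forward_step(x0, text_feat)
print(x1.shape)
```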