Story Visualization
20 papers with code • 3 benchmarks • 1 dataset
Story Visualization is the task of generating a coherent sequence of images that is aligned with a sequence of textual captions describing a story. It mainly consists of two tasks: story generation and story continuation. Story continuation additionally receives ground-truth information in the form of the first frame.
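The difference between the two tasks can be illustrated with a minimal Python sketch. All names here (`StorySample`, `task_type`, `generation_targets`) are hypothetical and not taken from any listed paper. The sketch also assumes that in story continuation the first caption corresponds to the given first frame, so only the remaining captions need images; individual papers may set this up differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Placeholder image type for illustration only (e.g. rows of pixel values).
Image = List[List[int]]


@dataclass
class StorySample:
    """One story: a caption per frame, plus an optional ground-truth first frame."""
    captions: List[str]
    first_frame: Optional[Image] = None  # given only in story continuation


def task_type(sample: StorySample) -> str:
    """Story continuation is the setting in which the first frame is provided."""
    if sample.first_frame is not None:
        return "story continuation"
    return "story generation"


def generation_targets(sample: StorySample) -> List[Tuple[int, str]]:
    """Return (frame index, caption) pairs that a model would have to generate.

    Assumption: in story continuation, caption 0 describes the given first
    frame, so generation starts at index 1.
    """
    start = 1 if sample.first_frame is not None else 0
    return [(i, c) for i, c in enumerate(sample.captions) if i >= start]


captions = ["A character wakes up.", "The character eats.", "The character leaves."]
generation = StorySample(captions=captions)
continuation = StorySample(captions=captions, first_frame=[[0, 0], [0, 0]])

print(task_type(generation), len(generation_targets(generation)))
print(task_type(continuation), len(generation_targets(continuation)))
```

In both settings the output is an image sequence conditioned on the captions. The sketch only makes explicit that story continuation carries one extra conditioning input.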
Most implemented papers
Character-Centric Story Visualization via Visual Planning and Token Alignment
This task requires machines to 1) understand long text inputs and 2) produce a globally consistent image sequence that illustrates the contents of the story.
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
The story visualization and continuation models are trained and inferred independently, which is not user-friendly.
Show Me a Story: Towards Coherent Neural Story Illustration
We propose an end-to-end network for the visual illustration of a sequence of sentences forming a story.
StoryGAN: A Sequential Conditional GAN for Story Visualization
We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.
Improving Generation and Evaluation of Visual Stories via Semantic Consistency
Therefore, we also provide an exploration of evaluation metrics for the model, focused on aspects of the generated frames such as the presence/quality of generated characters, the relevance to captions, and the diversity of the generated images.
Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization
Prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality, consistency and relevance.
Modular StoryGAN with Background and Theme Awareness for Story Visualization
To measure the local and global consistency we introduced background and theme awareness, which are expected attributes of the solutions.
Word-Level Fine-Grained Story Visualization
Story visualization aims to generate a sequence of images to narrate each sentence in a multi-sentence story with a global consistency across dynamic scenes and characters.
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
Hence, we first propose the task of story continuation, where the generated visual story is conditioned on a source image, allowing for better generalization to narratives with new characters.
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.