3D Face Animation
21 papers with code • 3 benchmarks • 6 datasets
Image credit: Cudeiro et al.
Most implemented papers
Learning a model of facial shape and expression from 4D scans
FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model.
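For intuition, FLAME deforms a template mesh with linear shape, expression, and pose-corrective blendshapes, and then applies linear blend skinning over a small jaw/neck/eyeball skeleton. The following is a minimal NumPy sketch of the blendshape part only; the dimensions roughly match the released model, but the zero-filled basis tensors are placeholders rather than the real FLAME assets.

```python
import numpy as np

# Rough dimensions of the released FLAME model: 5023 vertices,
# 300 shape components, 100 expression components.
N_VERTS, N_SHAPE, N_EXPR = 5023, 300, 100

# Placeholder tensors; in practice these are loaded from the FLAME model file.
template = np.zeros((N_VERTS, 3))              # mean face mesh
shape_dirs = np.zeros((N_VERTS, 3, N_SHAPE))   # shape blendshape basis
expr_dirs = np.zeros((N_VERTS, 3, N_EXPR))     # expression blendshape basis

def flame_vertices(betas, psi):
    """Deformed vertices for shape coefficients `betas` and expression
    coefficients `psi`; pose blendshapes and skinning are omitted for brevity."""
    return (template
            + np.einsum('vcs,s->vc', shape_dirs, betas)
            + np.einsum('vce,e->vc', expr_dirs, psi))

# A random identity with a neutral expression:
verts = flame_vertices(np.random.randn(N_SHAPE) * 0.5, np.zeros(N_EXPR))
```

Animating a face then amounts to keeping `betas` fixed and driving `psi` (plus jaw pose) over time, which is what makes such a low-dimensional model convenient for speech-driven animation.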
Learning an Animatable Detailed 3D Face Model from In-The-Wild Images
Some methods produce faces that cannot be realistically animated because they do not model how wrinkles vary with expression.
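The proposed model (DECA) addresses this, roughly, by regressing a person-specific detail code and decoding it together with the expression and jaw-pose parameters into a UV displacement map applied along the coarse mesh normals, so the predicted wrinkles change when the expression changes. The sketch below only illustrates that conditioning; the module, its layer sizes, and the small UV resolution are assumptions (the actual detail decoder is a convolutional generator).

```python
import torch
import torch.nn as nn

class DetailDecoder(nn.Module):
    """Illustrative stand-in for an expression-conditioned detail decoder."""
    def __init__(self, detail_dim=128, expr_dim=50, jaw_dim=3, uv_size=64):
        super().__init__()
        self.uv_size = uv_size
        self.net = nn.Sequential(
            nn.Linear(detail_dim + expr_dim + jaw_dim, 512),
            nn.ReLU(),
            nn.Linear(512, uv_size * uv_size),  # one scalar offset per UV texel
        )

    def forward(self, detail_code, expression, jaw_pose):
        # Expression is an input, so the predicted wrinkles vary with it.
        z = torch.cat([detail_code, expression, jaw_pose], dim=-1)
        disp = self.net(z).view(-1, 1, self.uv_size, self.uv_size)
        return disp  # displacement map applied along the coarse-mesh normals

decoder = DetailDecoder()
disp_map = decoder(torch.zeros(1, 128), torch.zeros(1, 50), torch.zeros(1, 3))
```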
MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement
To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
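A central ingredient, paraphrased loosely here, is a cross-modality loss: the model is fed the audio of one sequence and the expression signal of another, and the reconstruction is penalized so that the mouth region must follow the sequence that supplied the audio while the rest of the face follows the sequence that supplied the expressions. In the sketch below, `model.encode`/`model.decode` and the vertex mask are hypothetical stand-ins, not MeshTalk's actual interfaces.

```python
import torch

def cross_modality_loss(model, audio_a, verts_a, verts_b, mouth_mask):
    """Illustrative cross-modality reconstruction loss.

    audio_a / verts_a: audio and ground-truth vertices of sequence A.
    verts_b: ground-truth vertices of sequence B (the expression input).
    mouth_mask: (num_verts,) bool tensor selecting lower-face vertices.
    """
    # Mix modalities: audio from sequence A, expression signal from sequence B.
    latents = model.encode(audio=audio_a, expressions=verts_b)
    pred = model.decode(latents)                             # (T, num_verts, 3)

    mouth = ((pred - verts_a) ** 2)[:, mouth_mask].mean()    # mouth follows the audio
    upper = ((pred - verts_b) ** 2)[:, ~mouth_mask].mean()   # upper face follows the expressions
    return mouth + upper
```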
Generating Holistic 3D Human Motion from Speech
This work addresses the problem of generating 3D holistic body motions from human speech.
EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation
Specifically, we introduce the emotion disentangling encoder (EDE) to disentangle the emotion and content in the speech by cross-reconstructing speech signals with different emotion labels.
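Sketched concretely, cross-reconstruction takes two clips that are assumed to share spoken content but differ in emotion, swaps their emotion codes, and asks the decoder to reproduce the target matching each recombination. Everything below (module sizes, 80-dimensional audio features, 52 blendshape outputs) is an illustrative assumption, not EmoTalk's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-ins for the content encoder, emotion encoder, and decoder.
content_enc = nn.Linear(80, 32)   # "content" code from per-frame audio features
emotion_enc = nn.Linear(80, 16)   # "emotion" code from per-frame audio features
decoder = nn.Linear(32 + 16, 52)  # e.g. 52 blendshape coefficients per frame

def cross_reconstruction_loss(audio_neutral, audio_emotional,
                              target_neutral, target_emotional):
    """The two clips share content but differ in emotion; the targets are the
    animation parameters matching each (content, emotion) combination."""
    c_n, c_e = content_enc(audio_neutral), content_enc(audio_emotional)
    e_n, e_e = emotion_enc(audio_neutral), emotion_enc(audio_emotional)

    # Swap emotion codes across the pair.
    recon_emotional = decoder(torch.cat([c_n, e_e], dim=-1))  # should match the emotional target
    recon_neutral = decoder(torch.cat([c_e, e_n], dim=-1))    # should match the neutral target

    return (F.mse_loss(recon_emotional, target_emotional)
            + F.mse_loss(recon_neutral, target_neutral))

# Example with random frame-level features for a 10-frame pair:
loss = cross_reconstruction_loss(torch.randn(10, 80), torch.randn(10, 80),
                                 torch.randn(10, 52), torch.randn(10, 52))
```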
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
We propose EMAGE, a framework to generate full-body human gestures from audio and masked gestures, encompassing facial, local body, hands, and global movements.
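The "masked" part works much like masked modeling on motion: random gesture frames are hidden, and the model must reconstruct them from the audio plus the visible frames. A minimal sketch of that masking step follows; the mask ratio, the zero "mask token", and the pose dimensionality are assumptions rather than EMAGE's actual settings.

```python
import torch

def mask_gesture_frames(gestures, mask_ratio=0.5):
    """Hide random frames of a gesture sequence for masked reconstruction.

    gestures: (batch, frames, pose_dim). Returns the masked sequence and the
    boolean mask (True = hidden), so a model can be trained to recover the
    hidden frames from audio features plus the visible frames.
    """
    b, t, _ = gestures.shape
    mask = torch.rand(b, t) < mask_ratio
    masked = gestures.clone()
    masked[mask] = 0.0          # stand-in for a learned mask token
    return masked, mask

# Example: a batch of 2 sequences, 64 frames, 165-D poses (illustrative).
masked, mask = mask_gesture_frames(torch.randn(2, 64, 165))
# Training outline (the model itself is a hypothetical audio+gesture network):
#   pred = model(audio_features, masked)
#   loss = ((pred - gestures)[mask] ** 2).mean()
```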
3D faces in motion: Fully automatic registration and statistical analysis
The resulting statistical analysis is applied to automatically generate realistic facial animations and to recognize dynamic facial expressions.
Capture, Learning, and Synthesis of 3D Speaking Styles
To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers.
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose
In this paper, we address this problem with a deep neural network that takes an audio signal A of a source person and a very short video V of a target person as input. The network outputs a synthesized high-quality talking face video with personalized head pose (making use of the visual information in V) and with expression and lip motion synchronized to the speech (by considering both A and V).
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning
In this paper, we propose a talking face generation method that takes an audio signal as input and a short target video clip as reference, and synthesizes a photo-realistic video of the target face with natural lip motions, head poses, and eye blinks that are in-sync with the input audio signal.