Monocular 3D Human Pose Estimation
65 papers with code • 1 benchmark • 5 datasets
This task targets 3D human pose estimation from a single RGB camera.
Most implemented papers
DensePose: Dense Human Pose Estimation In The Wild
In this work, we establish dense correspondences between RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation.
A simple yet effective baseline for 3d human pose estimation
Following the success of deep convolutional networks, state-of-the-art methods for 3d human pose estimation have focused on deep end-to-end systems that predict 3d joint locations given raw image pixels.
Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image
We propose a unified formulation for the problem of 3D human pose estimation from a single raw RGB image that reasons jointly about 2D joint estimation and 3D pose reconstruction to improve both tasks.
3D human pose estimation in video with temporal convolutions and semi-supervised training
We start with predicted 2D keypoints for unlabeled video, then estimate 3D poses and finally back-project to the input 2D keypoints.
End-to-end Recovery of Human Shape and Pose
The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that only have ground truth 2D annotations.
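A keypoint reprojection loss of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weak-perspective camera (a scale and a 2D translation), the L1 distance, the visibility mask, and all function names are assumptions chosen for the example.

```python
import numpy as np

def weak_perspective_project(joints_3d, scale, trans):
    """Project (J, 3) joints to (J, 2) image points by dropping depth:
    x = scale * X[:, :2] + trans. The camera model is an assumption."""
    return scale * joints_3d[:, :2] + trans

def reprojection_loss(joints_3d, keypoints_2d, visibility, scale, trans):
    """Mean L1 distance between projected and annotated 2D keypoints,
    averaged over the joints marked visible (visibility is 0 or 1)."""
    projected = weak_perspective_project(joints_3d, scale, trans)
    per_joint = np.abs(projected - keypoints_2d).sum(axis=1)
    n_visible = max(visibility.sum(), 1)  # avoid division by zero
    return float((visibility * per_joint).sum() / n_visible)

# Toy data: three joints, the last one unannotated in 2D.
joints_3d = np.array([[0.0, 0.0, 2.0],
                      [0.5, 1.0, 2.5],
                      [-0.5, 1.0, 1.5]])
scale, trans = 100.0, np.array([160.0, 120.0])
keypoints_2d = weak_perspective_project(joints_3d, scale, trans)
visibility = np.array([1.0, 1.0, 0.0])

# Perfectly consistent 2D annotations give zero loss.
loss_exact = reprojection_loss(joints_3d, keypoints_2d, visibility, scale, trans)

# Shifting every annotation by 2 pixels in x gives a loss of 2 per visible joint.
shifted = keypoints_2d + np.array([2.0, 0.0])
loss_shifted = reprojection_loss(joints_3d, shifted, visibility, scale, trans)
```

Because the loss depends only on the projected 2D positions, it needs no 3D ground truth. That is why images with only 2D annotations can supervise a 3D model, although depth stays unconstrained by this term alone.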
Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach
We propose a weakly-supervised transfer learning method that uses mixed 2D and 3D labels in a unified deep neural network with a two-stage cascaded structure.
Semantic Graph Convolutional Networks for 3D Human Pose Regression
In this paper, we study the problem of learning Graph Convolutional Networks (GCNs) for regression.
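For orientation, a graph convolution used for per-joint regression can be sketched as below. This shows a plain (vanilla) graph convolution with symmetric normalization, not the semantic variant the paper proposes. The 4-joint chain skeleton, the random weights, and all names are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    adj = np.eye(num_nodes)  # self-loops
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

def graph_conv(features, adj_norm, weight):
    """One linear graph convolution: mix each joint's features with its
    neighbours' features, then apply a weight matrix shared by all joints."""
    return adj_norm @ features @ weight

# Toy skeleton: a chain of 4 joints (0-1-2-3).
edges = [(0, 1), (1, 2), (2, 3)]
adj_norm = normalized_adjacency(edges, 4)

rng = np.random.default_rng(0)
joints_2d = rng.normal(size=(4, 2))   # per-joint 2D input features
weight = rng.normal(size=(2, 3))      # untrained weights mapping 2 -> 3 dims
joints_3d = graph_conv(joints_2d, adj_norm, weight)  # (4, 3) regression output
```

In a trained model, several such layers with nonlinearities would be stacked and the weights learned from data. The single untrained layer here only shows how the skeleton's adjacency structure enters the computation.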
VIBE: Video Inference for Human Body Pose and Shape Estimation
Human motion is fundamental to understanding behavior.
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow, allowing for a drastically faster network without compromising accuracy.
Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
Although significant improvement has been achieved recently in 3D human pose estimation, most of the previous methods only treat a single-person case.