Video Alignment
21 papers with code • 2 benchmarks • 4 datasets
Most implemented papers
Time-Contrastive Networks: Self-Supervised Learning from Video
While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.
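The underlying objective is multi-view and time-contrastive: frames captured at the same moment from different viewpoints are pulled together, while temporally distant frames are pushed apart. The sketch below is a minimal, illustrative triplet-loss version of that idea, assuming per-frame embeddings from two synchronized views; the function name, margin, and sampling scheme are placeholders, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def time_contrastive_triplet_loss(view1_emb, view2_emb, margin=0.2, min_gap=30):
    """Triplet loss over frame embeddings from two synchronized viewpoints.

    view1_emb, view2_emb: (T, D) tensors of per-frame embeddings.
    Anchor = frame t in view 1, positive = frame t in view 2,
    negative = a cyclically shifted (hence usually distant) frame from view 1.
    """
    T = view1_emb.shape[0]
    t = torch.arange(T)
    # Negative: a frame shifted forward by at least `min_gap` positions (cyclically).
    offset = torch.randint(min_gap, max(T - 1, min_gap + 1), (T,))
    neg_idx = (t + offset) % T

    anchor, positive, negative = view1_emb, view2_emb, view1_emb[neg_idx]
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with random embeddings standing in for CNN features.
emb1, emb2 = torch.randn(100, 32), torch.randn(100, 32)
print(time_contrastive_triplet_loss(emb1, emb2))
```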
Learning from Video and Text via Large-Scale Discriminative Clustering
Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks.
Temporal Cycle-Consistency Learning
We introduce a self-supervised representation learning method based on the task of temporal alignment between videos.
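Temporal cycle-consistency checks that a frame, mapped to its soft nearest neighbour in another video and then cycled back, lands near its original position. The sketch below is a minimal cycle-back regression loss of this kind, assuming per-frame embeddings for two videos; the negative-distance similarity, temperature, and function name are illustrative choices rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(u, v, temperature=0.1):
    """Soft cycle-back regression between two frame-embedding sequences.

    u: (Tu, D), v: (Tv, D). For each frame i of u we compute a soft nearest
    neighbour in v, cycle it back to u, and penalize how far the resulting
    soft match lands from index i.
    """
    Tu = u.shape[0]
    # Soft nearest neighbour of each u-frame in v (rows sum to 1).
    alpha = F.softmax(-torch.cdist(u, v) / temperature, dim=1)        # (Tu, Tv)
    v_tilde = alpha @ v                                               # (Tu, D)
    # Cycle back: distribution over u-frames for each soft neighbour.
    beta = F.softmax(-torch.cdist(v_tilde, u) / temperature, dim=1)   # (Tu, Tu)
    # The expected landing index should match the starting index i.
    idx = torch.arange(Tu, dtype=u.dtype)
    mu = beta @ idx                                                   # (Tu,)
    return ((mu - idx) ** 2).mean()

# Toy usage: two unaligned "videos" represented by random frame embeddings.
loss = cycle_consistency_loss(torch.randn(80, 16), torch.randn(60, 16))
print(loss)
```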
View-Invariant Probabilistic Embedding for Human Pose
Depictions of similar human body configurations can vary with changing viewpoints.
View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose
Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people.
LAMV: Learning to Align and Match Videos With Kernelized Temporal Layers
This paper considers a learnable approach for comparing and aligning videos.
Dynamic Temporal Alignment of Speech to Lips
The alignment of speech to lips is based on deep audio-visual features that map the lip video and the speech signal to a shared representation.
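Once both modalities live in a shared embedding space, a monotonic warp between the two sequences can be recovered with classic dynamic time warping. The sketch below is a generic DTW over cosine distances between two embedding sequences, shown for illustration; the paper's learned alignment layer may differ.

```python
import numpy as np

def dtw_align(a, b):
    """Dynamic time warping between two embedding sequences.

    a: (Ta, D), b: (Tb, D) arrays in a shared embedding space.
    Returns the monotonic alignment path as a list of (i, j) index pairs.
    """
    Ta, Tb = len(a), len(b)
    # Pairwise cosine distances between the two sequences.
    an = a / np.linalg.norm(a, axis=1, keepdims=True)
    bn = b / np.linalg.norm(b, axis=1, keepdims=True)
    cost = 1.0 - an @ bn.T

    # Accumulated-cost table with the usual (match / skip-a / skip-b) moves.
    acc = np.full((Ta + 1, Tb + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j - 1],
                                                 acc[i - 1, j],
                                                 acc[i, j - 1])

    # Backtrack from the end to recover the alignment path.
    path, i, j = [], Ta, Tb
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# Toy usage: align a short "lip" sequence to a slower "speech" sequence.
lips = np.random.randn(40, 8)
speech = np.repeat(lips, 2, axis=0) + 0.01 * np.random.randn(80, 8)
print(dtw_align(lips, speech)[:5])
```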
Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video
Our method learns a general skill embedding independently from the task context by using an adversarial loss.
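One common way to realize such an adversarial loss is a task discriminator trained on the embedding through a gradient-reversal layer, so the encoder is pushed to discard task-identity information. The sketch below shows only this generic gradient-reversal pattern; the layer sizes and names are placeholders and it is not the paper's exact architecture or objective.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, negated gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
task_discriminator = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 4))

features = torch.randn(8, 128)           # stand-in for per-clip video features
task_labels = torch.randint(0, 4, (8,))  # which task each clip came from

emb = encoder(features)
# The discriminator tries to predict the task; the reversed gradient pushes
# the encoder to remove task information from the embedding.
logits = task_discriminator(GradReverse.apply(emb))
loss = nn.functional.cross_entropy(logits, task_labels)
loss.backward()
```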
Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning
In this paper, we introduce a novel contrastive action representation learning (CARL) framework to learn frame-wise action representations, especially for long videos, in a self-supervised manner.
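A frame-wise contrastive objective of this kind can be illustrated by comparing two augmented views of the same video and using a Gaussian over timestamp distance as a soft target. The sketch below is a simplified stand-in for such a sequence contrastive loss; the temperature, sigma, and function name are assumptions, not CARL's exact objective.

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(z1, z2, sigma=2.0, temperature=0.1):
    """Frame-wise contrastive loss between two augmented views of one video.

    z1, z2: (T, D) frame embeddings of the two views, assumed to share
    timestamps. Each frame of view 1 is pulled towards the frames of view 2
    whose timestamps are close, using a Gaussian-weighted soft target.
    """
    T = z1.shape[0]
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = (z1 @ z2.t()) / temperature                     # (T, T) similarities

    # Soft targets: Gaussian over timestamp distance, normalized per row.
    t = torch.arange(T, dtype=z1.dtype)
    target = torch.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * sigma ** 2))
    target = target / target.sum(dim=1, keepdim=True)

    return F.kl_div(F.log_softmax(logits, dim=1), target, reduction="batchmean")

# Toy usage with random embeddings standing in for transformer features.
print(sequence_contrastive_loss(torch.randn(64, 32), torch.randn(64, 32)))
```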
Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space
We propose a 3D Token Representation Layer (3DTRL) that estimates the 3D positional information of the visual tokens and leverages it for learning viewpoint-agnostic representations.