Optical Flow Estimation
652 papers with code • 10 benchmarks • 33 datasets
Optical Flow Estimation is a computer vision task that involves computing the motion of objects in an image or a video sequence. The goal of optical flow estimation is to determine the movement of pixels or features in the image, which can be used for various applications such as object tracking, motion analysis, and video compression.
Approaches to optical flow estimation include correlation-based, block-matching, feature-tracking, energy-based, and gradient-based methods, as well as, more recently, learning-based methods built on convolutional neural networks. A classical example is sketched below.
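As a quick illustration, the sketch below runs a classical dense method (Farneback's gradient-based algorithm as shipped in OpenCV) on a pair of frames and renders the flow field as an HSV image. The file names are placeholders.

```python
import cv2
import numpy as np

prev_frame = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder file names
next_frame = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Dense flow: flow[y, x] = (dx, dy), the per-pixel displacement from
# prev_frame to next_frame.
flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# Visualise direction as hue and magnitude as brightness.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv = np.zeros((*prev_frame.shape, 3), dtype=np.uint8)
hsv[..., 0] = ang * 180 / np.pi / 2
hsv[..., 1] = 255
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("flow_vis.png", cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
```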
Further reading:
Definition source: Devon: Deformable Volume Network for Learning Optical Flow
Libraries
Use these libraries to find Optical Flow Estimation models and implementations.
Datasets
Most implemented papers
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
PWC-Net warps the CNN features of the second image using the current flow estimate, then uses the warped features and the features of the first image to construct a cost volume, which is processed by a CNN to estimate the optical flow.
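The core of this construction is a correlation between the two feature maps over a small set of displacements. Below is a rough sketch of such a partial cost volume, not the official PWC-Net code; the feature shapes and the search radius are assumptions.

```python
import torch
import torch.nn.functional as F

def cost_volume(feat1: torch.Tensor, feat2_warped: torch.Tensor, radius: int = 4) -> torch.Tensor:
    """feat1, feat2_warped: (B, C, H, W). Returns (B, (2r+1)^2, H, W)."""
    B, C, H, W = feat1.shape
    padded = F.pad(feat2_warped, [radius] * 4)              # (B, C, H+2r, W+2r)
    costs = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[:, :, dy:dy + H, dx:dx + W]    # feat2 shifted by (dy-r, dx-r)
            costs.append((feat1 * shifted).mean(dim=1))     # correlation, averaged over channels
    return torch.stack(costs, dim=1)

vol = cost_volume(torch.randn(1, 64, 32, 48), torch.randn(1, 64, 32, 48))
print(vol.shape)  # torch.Size([1, 81, 32, 48])
```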
FlowNet: Learning Optical Flow with Convolutional Networks
Optical flow estimation has not been among the tasks where CNNs were successful.
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Particularly on small displacements and real-world data, FlowNet cannot compete with variational methods.
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes.
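A minimal sketch of an all-pairs correlation volume in this spirit is shown below; it is not the authors' implementation, and the feature dimensions are assumptions.

```python
import torch

def all_pairs_correlation(feat1: torch.Tensor, feat2: torch.Tensor) -> torch.Tensor:
    """feat1, feat2: (B, C, H, W). Returns a 4D volume of shape (B, H, W, H, W)."""
    B, C, H, W = feat1.shape
    f1 = feat1.view(B, C, H * W)
    f2 = feat2.view(B, C, H * W)
    corr = torch.matmul(f1.transpose(1, 2), f2) / C ** 0.5   # (B, H*W, H*W)
    return corr.view(B, H, W, H, W)

# RAFT then pools this volume at several scales and, at each refinement step,
# looks up correlations in a local window around the current flow estimate.
vol = all_pairs_correlation(torch.randn(1, 256, 46, 62), torch.randn(1, 256, 46, 62))
print(vol.shape)  # torch.Size([1, 46, 62, 46, 62])
```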
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation
We propose RIFE, a Real-time Intermediate Flow Estimation algorithm for Video Frame Interpolation (VFI).
Optical Flow Estimation using a Spatial Pyramid Network
We learn to compute optical flow by combining a classical spatial-pyramid formulation with deep learning.
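The sketch below illustrates the coarse-to-fine idea under simple assumptions: at each pyramid level the flow is upsampled and doubled, the second image is warped toward the first, and a small network (here a hypothetical placeholder, residual_net) predicts a residual flow.

```python
import torch
import torch.nn.functional as F

def warp(img: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp img (B, C, H, W) by flow (B, 2, H, W) given in pixels."""
    B, _, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float()            # (2, H, W), (x, y) order
    coords = grid.unsqueeze(0) + flow                      # absolute sample positions
    # Normalise to [-1, 1] for grid_sample.
    coords_x = 2 * coords[:, 0] / (W - 1) - 1
    coords_y = 2 * coords[:, 1] / (H - 1) - 1
    grid_n = torch.stack((coords_x, coords_y), dim=-1)     # (B, H, W, 2)
    return F.grid_sample(img, grid_n, align_corners=True)

def coarse_to_fine(pyr1, pyr2, residual_net):
    """pyr1/pyr2: image pyramids, coarsest level first. residual_net is a placeholder."""
    B, _, H, W = pyr1[0].shape
    flow = torch.zeros(B, 2, H, W)
    for im1, im2 in zip(pyr1, pyr2):
        if flow.shape[-2:] != im1.shape[-2:]:
            # Move to the next finer level: upsample and scale the flow.
            flow = 2.0 * F.interpolate(flow, size=im1.shape[-2:],
                                       mode="bilinear", align_corners=True)
        flow = flow + residual_net(im1, warp(im2, flow), flow)
    return flow
```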
Two-Stream Convolutional Networks for Action Recognition in Videos
Our architecture is trained and evaluated on the standard video action recognition benchmarks UCF-101 and HMDB-51, where it is competitive with the state of the art.
Perceiver IO: A General Architecture for Structured Inputs & Outputs
A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible.
Video Frame Interpolation via Adaptive Separable Convolution
Our method develops a deep fully convolutional neural network that takes two input frames and estimates pairs of 1D kernels for all pixels simultaneously.
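The sketch below shows how such per-pixel separable 1D kernels can be applied to a frame; it is not the authors' implementation, and the kernel size and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def sep_adaptive_conv(frame: torch.Tensor, kv: torch.Tensor, kh: torch.Tensor) -> torch.Tensor:
    """frame: (B, C, H, W); kv, kh: per-pixel vertical/horizontal 1D kernels, (B, n, H, W)."""
    B, C, H, W = frame.shape
    n = kv.shape[1]
    # n x n neighbourhood around every pixel, laid out as (B, C, n, n, H, W).
    patches = F.unfold(frame, kernel_size=n, padding=n // 2).view(B, C, n, n, H, W)
    # Outer product of the two 1D kernels gives the per-pixel 2D weighting.
    weight = kv.view(B, 1, n, 1, H, W) * kh.view(B, 1, 1, n, H, W)
    return (patches * weight).sum(dim=(2, 3))

# Toy usage with random inputs; softmax makes each 1D kernel sum to one.
kv = torch.randn(1, 11, 64, 64).softmax(dim=1)
kh = torch.randn(1, 11, 64, 64).softmax(dim=1)
out = sep_adaptive_conv(torch.randn(1, 3, 64, 64), kv, kh)
print(out.shape)  # torch.Size([1, 3, 64, 64])
```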
Semantic Flow for Fast and Accurate Scene Parsing
A common practice for improving performance is to obtain high-resolution feature maps with strong semantic representations.