Video Saliency Detection
19 papers with code • 5 benchmarks • 2 datasets
Most implemented papers
Contextual Encoder-Decoder Network for Visual Saliency Prediction
To develop robust representations for this challenging task, high-level visual features at multiple spatial scales must be extracted and augmented with contextual information.
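The idea lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch module (not the paper's exact architecture) that augments high-level encoder features with contextual information by pooling them at several spatial scales; `ContextPooling`, its channel counts, and the pooling scales are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextPooling(nn.Module):
    """Hedged sketch (not the paper's exact module): augment high-level
    features with context by pooling them at several spatial scales and
    concatenating the upsampled results back onto the original map."""
    def __init__(self, ch=512, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.proj = nn.ModuleList(nn.Conv2d(ch, ch // 4, 1) for _ in scales)

    def forward(self, x):
        h, w = x.shape[-2:]
        ctx = [F.interpolate(p(F.adaptive_avg_pool2d(x, s)), size=(h, w),
                             mode='bilinear', align_corners=False)
               for s, p in zip(self.scales, self.proj)]
        return torch.cat([x, *ctx], dim=1)  # original features + context

x = torch.randn(1, 512, 30, 40)              # high-level encoder features
print(ContextPooling()(x).shape)             # torch.Size([1, 896, 30, 40])
```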
Simple vs complex temporal recurrences for video saliency prediction
This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain.
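For illustration, here is a minimal convolutional LSTM cell, one common form of temporal recurrence that can be attached to a static saliency encoder; the name `ConvLSTMCell` and all sizes are assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Sketch of a convolutional LSTM cell: a recurrence that carries
    spatial feature maps across frames instead of flat vectors."""
    def __init__(self, ch):
        super().__init__()
        # One conv produces all four gates from [input, hidden] stacked.
        self.gates = nn.Conv2d(2 * ch, 4 * ch, 3, padding=1)

    def forward(self, x, h, c):
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = ConvLSTMCell(64)
h = c = torch.zeros(1, 64, 28, 28)
for frame_feat in torch.randn(16, 1, 64, 28, 28):  # per-frame features
    h, c = cell(frame_feat, h, c)                  # h is the recurrent state
```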
Unified Image and Video Saliency Modeling
We evaluate our method on the video saliency datasets DHF1K, Hollywood-2 and UCF-Sports, and the image saliency datasets SALICON and MIT300.
Revisiting Video Saliency: A Large-scale Benchmark and a New Model
Existing video saliency datasets lack variety and generality of common dynamic scenes and fall short in covering challenging situations in unconstrained environments.
DeepVS: A Deep Learning Based Video Saliency Prediction Approach
Hence, an object-to-motion convolutional neural network (OM-CNN), composed of objectness and motion subnets, is developed to predict intra-frame saliency for DeepVS.
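A hedged sketch of the two-subnet idea follows: one subnet scores objectness from a frame, another estimates motion from a frame pair, and their features are fused into an intra-frame saliency map. `TwoSubnetSaliency` and every layer size are placeholders, not the published OM-CNN configuration.

```python
import torch
import torch.nn as nn

class TwoSubnetSaliency(nn.Module):
    """Illustrative two-subnet design in the spirit of OM-CNN: an
    objectness subnet sees the current frame, a motion subnet sees a
    stacked frame pair, and a small head fuses both into saliency."""
    def __init__(self):
        super().__init__()
        self.objectness = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.motion = nn.Sequential(nn.Conv2d(6, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(64, 1, 1)  # fuse and predict saliency

    def forward(self, frame, prev_frame):
        obj = self.objectness(frame)
        mot = self.motion(torch.cat([frame, prev_frame], dim=1))
        return torch.sigmoid(self.head(torch.cat([obj, mot], dim=1)))

f_t, f_prev = torch.randn(2, 1, 3, 224, 224)
print(TwoSubnetSaliency()(f_t, f_prev).shape)  # torch.Size([1, 1, 224, 224])
```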
A Dilated Inception Network for Visual Saliency Prediction
In this work, we propose an end-to-end dilated inception network (DINet) for visual saliency prediction.
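Dilated inception blocks are easy to sketch: parallel 3x3 convolutions with increasing dilation rates see progressively wider context without losing resolution. The rates and channel widths below are illustrative assumptions, not DINet's actual settings.

```python
import torch
import torch.nn as nn

class DilatedInception(nn.Module):
    """Minimal sketch of a dilated-inception-style block: parallel
    dilated 3x3 branches capture multi-scale context at full resolution."""
    def __init__(self, in_ch=256, branch_ch=64, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, 1),                        # reduce
                nn.Conv2d(branch_ch, branch_ch, 3, padding=r, dilation=r),
                nn.ReLU(),
            ) for r in rates
        )

    def forward(self, x):
        # Concatenating the branches restores the input width here (4 * 64).
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(1, 256, 32, 32)
print(DilatedInception()(x).shape)  # torch.Size([1, 256, 32, 32])
```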
TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection
It consists of two building blocks: first, the encoder network extracts low-resolution spatiotemporal features from an input clip of several consecutive frames, and then the following prediction network decodes the encoded features spatially while aggregating all the temporal information.
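The following toy PyTorch model mirrors that encode-then-temporally-aggregate structure; it is a sketch, not TASED-Net itself, and every layer choice is an assumption: a small 3D-conv encoder, a pooling step that collapses the temporal axis, and a 2D decoder that upsamples back to frame resolution.

```python
import torch
import torch.nn as nn

class EncodeDecode3D(nn.Module):
    """Toy sketch of the TASED-Net idea: encode a clip into low-resolution
    spatiotemporal features, aggregate away the temporal axis, then decode
    spatially into a single saliency map."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, 3, stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 3, stride=(2, 2, 2), padding=1), nn.ReLU(),
        )
        # Aggregate all remaining temporal information in one step.
        self.temporal_pool = nn.AdaptiveAvgPool3d((1, None, None))
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, clip):                                   # (B, 3, T, H, W)
        f = self.temporal_pool(self.encoder(clip)).squeeze(2)  # (B, 64, H/4, W/4)
        return torch.sigmoid(self.decoder(f))                  # (B, 1, H, W)

clip = torch.randn(1, 3, 8, 112, 112)       # 8 consecutive frames
print(EncodeDecode3D()(clip).shape)          # torch.Size([1, 1, 112, 112])
```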
Video Saliency Prediction Using Enhanced Spatiotemporal Alignment Network
Due to a variety of motions across different frames, it is highly challenging to learn an effective spatiotemporal representation for accurate video saliency prediction (VSP).
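One common way to realize such alignment is deformable convolution: offsets predicted from a reference/neighbor feature pair warp the neighbor's sampling grid toward the reference frame before fusion. The sketch below assumes `torchvision`'s `DeformConv2d`; the module `AlignFeatures` and its sizes are hypothetical.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AlignFeatures(nn.Module):
    """Hedged sketch of deformable-convolution feature alignment: warp a
    neighboring frame's features toward the reference frame's features."""
    def __init__(self, ch=64, k=3):
        super().__init__()
        # Predict per-position sampling offsets (2 per kernel tap).
        self.offset = nn.Conv2d(2 * ch, 2 * k * k, 3, padding=1)
        self.deform = DeformConv2d(ch, ch, k, padding=k // 2)

    def forward(self, ref, neighbor):
        offsets = self.offset(torch.cat([ref, neighbor], dim=1))
        return self.deform(neighbor, offsets)  # neighbor aligned to ref

ref, nbr = torch.randn(2, 1, 64, 56, 56)
print(AlignFeatures()(ref, nbr).shape)  # torch.Size([1, 64, 56, 56])
```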
A Plug-and-play Scheme to Adapt Image Saliency Deep Model for Video Data
With the rapid development of deep learning techniques, image saliency deep models trained solely on spatial information have occasionally matched, on video data, the detection performance of models trained on both spatial and temporal information.
Exploring Rich and Efficient Spatial Temporal Interactions for Real Time Video Salient Object Detection
In this way, even though the overall video saliency quality depends heavily on the spatial branch, the performance of the temporal branch still matters.
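As a rough illustration of such a two-branch split, the sketch below pairs a per-frame spatial branch with a deliberately lightweight 3D-conv temporal branch that refines the last frame's prediction; `TwoBranchFusion` and all dimensions are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Illustrative spatial/temporal two-branch design: a spatial branch
    does most of the per-frame work, while a cheap temporal branch refines
    the result from a short feature history."""
    def __init__(self, ch=32):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        # Lightweight temporal branch: one 3D conv over stacked features.
        self.temporal = nn.Conv3d(ch, ch, 3, padding=1)
        self.head = nn.Conv2d(2 * ch, 1, 1)

    def forward(self, frames):                         # (B, T, 3, H, W)
        b, t = frames.shape[:2]
        s = self.spatial(frames.flatten(0, 1))         # per-frame spatial features
        s = s.view(b, t, -1, *s.shape[-2:])
        tm = self.temporal(s.transpose(1, 2)).mean(2)  # temporal refinement
        return torch.sigmoid(self.head(torch.cat([s[:, -1], tm], dim=1)))

clip = torch.randn(1, 4, 3, 64, 64)
print(TwoBranchFusion()(clip).shape)  # torch.Size([1, 1, 64, 64])
```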