Scene Parsing
75 papers with code • 2 benchmarks • 4 datasets
Scene parsing segments an image into regions, each associated with a semantic category such as sky, road, person, or bed. (Source: MIT Description)
Most implemented papers
Pyramid Scene Parsing Network
Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.
Semantic Understanding of Scenes through the ADE20K Dataset
Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision.
Panoptic Segmentation
We propose and study a task we name panoptic segmentation (PS).
OCNet: Object Context Network for Scene Parsing
To capture richer context information, we further combine our interlaced sparse self-attention scheme with the conventional multi-scale context schemes including pyramid pooling (Zhao et al., 2017) and atrous spatial pyramid pooling (Chen et al., 2018).
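The pyramid pooling mentioned above can be illustrated with a minimal sketch. It average-pools a single-channel feature map into grids of several sizes, so each level summarizes context at a different scale. The helper names (`adaptive_avg_pool`, `pyramid_pool`), the toy 6x6 feature map, and the default bin sizes are assumptions made for illustration. A full network would also project, upsample, and concatenate the pooled maps with the original features, which this sketch omits.

```python
from math import ceil, floor


def adaptive_avg_pool(grid, bins):
    """Average-pool a 2D list of floats into a bins x bins grid."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(bins):
        # Row range covered by bin i (bins may overlap when h % bins != 0).
        r0, r1 = floor(i * h / bins), ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = floor(j * w / bins), ceil((j + 1) * w / bins)
            cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def pyramid_pool(grid, bin_sizes=(1, 2, 3, 6)):
    """Return one pooled map per pyramid level, keyed by bin count.

    The bin sizes are an illustrative assumption, not taken from this page.
    """
    return {b: adaptive_avg_pool(grid, b) for b in bin_sizes}


# Hypothetical 6x6 single-channel feature map with values 0..35.
feature_map = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
pyramid = pyramid_pool(feature_map)
```

The coarsest level (`pyramid[1]`) is a single global average, while the finest level here (`pyramid[6]`) reproduces the input, so the levels between them trade spatial detail for wider context.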
Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes
The proposed deep dual-resolution networks (DDRNets) are composed of two deep branches between which multiple bilateral fusions are performed.
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data
Scene understanding of high resolution aerial images is of great importance for the task of automated monitoring in various remote sensing applications.
Semantic Flow for Fast and Accurate Scene Parsing
A common practice to improve the performance is to attain high resolution feature maps with strong semantic representation.
PSANet: Point-wise Spatial Attention Network for Scene Parsing
We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes.
Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition
This work introduces pyramidal convolution (PyConv), which is capable of processing the input at multiple filter scales.
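To make "multiple filter scales" concrete, here is a rough sketch that applies filters of several sizes to the same input and keeps one response map per size. It is not the PyConv implementation: fixed mean filters stand in for learned kernels, and the function names, kernel sizes, and toy input are all assumptions chosen for illustration.

```python
def box_filter(grid, k):
    """Apply a k x k mean filter (k odd) with zero padding, keeping the size."""
    h, w = len(grid), len(grid[0])
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in range(-pad, pad + 1):
                for dc in range(-pad, pad + 1):
                    rr, cc = r + dr, c + dc
                    # Positions outside the grid count as zeros.
                    if 0 <= rr < h and 0 <= cc < w:
                        total += grid[rr][cc]
            out[r][c] = total / (k * k)
    return out


def multi_scale_responses(grid, kernel_sizes=(3, 5, 7)):
    """Return one filtered map per kernel size (sizes are illustrative)."""
    return [box_filter(grid, k) for k in kernel_sizes]


# Hypothetical 7x7 input with a single active pixel at the center.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
small, medium, large = multi_scale_responses(image)
```

In this toy input only the 7x7 filter responds at the corner pixel, because only its window reaches the center. Larger kernels see more context per output position, while smaller ones keep the response localized.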
Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing
We introduce Synscapes -- a synthetic dataset for street scene parsing created using photorealistic rendering techniques, and show state-of-the-art results for training and validation as well as new types of analysis.