Semantic Segmentation

5199 papers with code • 125 benchmarks • 311 datasets

Semantic Segmentation is a computer vision task in which every pixel in an image is assigned a class label, producing a dense pixel-wise segmentation map of the image. Example benchmarks for this task include Cityscapes, PASCAL VOC and ADE20K. Models are usually evaluated with the Mean Intersection-Over-Union (Mean IoU) and Pixel Accuracy metrics.
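The two metrics named above can be illustrated with a small sketch. It assumes that label maps are flattened into sequences of integer class ids and that classes absent from both the ground truth and the prediction are skipped when averaging. Individual benchmarks may differ, for example in how they handle ignore labels, so the function names and conventions here are illustrative rather than any leaderboard's official evaluation code.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious)


# A toy 2x3 image with three classes, flattened row by row.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(truth, pred, num_classes=3)
print(pixel_accuracy(m), mean_iou(m))
```

In this toy example four of six pixels are correct, and the per-class IoUs are 1/3, 2/3 and 1/2. Mean IoU penalizes errors on rare classes more strongly than Pixel Accuracy does, which is one reason leaderboards often report both.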

(Image credit: CSAILVision)

Benchmarks

These leaderboards are used to track progress in Semantic Segmentation.

Dataset	Best Model
ADE20K	ONE-PEACE
NYU Depth v2	OmniVec
Cityscapes test	VLTSeg
ADE20K val	BEiT-3
Cityscapes val	SERNet-Former
PASCAL Context	PlainSeg (EVA-02-L)
S3DIS	PTv3 + PPT
S3DIS Area5	OmniVec
PASCAL VOC 2012 test	DeepLabv3+ (Xception-65-JFT)
SUN-RGBD	TokenFusion (S)
DensePASS	Trans4PASS+ (multi-scale)
ScanNet	PTv3 + PPT
PASCAL VOC 2012 val	EfficientNet-L2+NAS-FPN (single scale test, with self-training)
DADA-seg	MMUDA
Stanford2D3D Panoramic	SFSS-MMSI (RGB+HHA)
ImageNet-S	TEC (ViT-B/16, 224x224, SSL+FT, mmseg)
LaRS	SWIM^2 (Mask2Former)
CamVid	SERNet-Former
COCO-Stuff test	EVA
iSAID	SegNeXt-L
Semantic3D	Feature Geometric Net
ISPRS Potsdam	AerialFormer-B
Trans10K	Trans4Trans (M)
Dark Zurich	Refign (HRDA)
KITTI-360	CMNeXt (RGB-D-E-LiDAR)
MCubeS	MMSFormer (RGB-A-D-N)
DeLiVER	CMNeXt (RGB-D-E-LiDAR)
UrbanLF	CMNeXt (RGB-LF80)
LIP val	Hulk(Finetune, ViT-L)
ScanNetV2	CMX
GTAV-to-Cityscapes Labels	MIC
Nighttime Driving	TADP
LoveDA	ViT-G12X4
EventScape	CMX (B4)
FMB Dataset	MMSFormer (RGB-Infrared)
ISPRS Vaihingen	LSKNet-S
SpaceNet 1	MAE+MTP(ViT-L)
ZJU-RGB-P	ShareCMP (B4 RGB-FP)
INRIA Aerial Image Labeling	UANet(PVT-V2-B2)
LLRGBD-synthetic	SMMCL (SegNeXt-B)
UPLight	ShareCMP (B2 RGB-FP)
MCubeS (P)	MMSFormer (RGB-A-D)
SpectralWaste	CMX (RGB-HYPER)
DDD17	CMNeXt
DSEC	CMNeXt
KITTI Semantic Segmentation	RPVNet [xu2021rpvnet]
SkyScapes-Dense	SkyScapesNet-Dense
FoodSeg103	FoodSAM
SYNTHIA-to-Cityscapes	HRDA + PiPa
SynPASS	Trans4PASS+
SELMA	CMX
Pothole Mix	Baseline - DeepLabv3+
DELIVER	CMNeXt (RGB-D-E-LiDAR)
Mapillary val	AO-SegNet
MS COCO	OneFormer (InternImage-H, emb_dim=1024, single-scale)
Stanford2D3D - RGBD	CMX (SegFormer-B4)
Event-based Segmentation Dataset	Bimodal SegNet
GAMUS	TIMF
ACDC Scribbles	ScribFormer
ShapeNet	PatchFormer
UAVid	LSKNet-S
BIG	PSPNet + CascadePSP
PETRAW	NCC Next
Hypersim	MultiMAE (ViT-B)
Structured3D	SFSS-MMSI (RGB+Depth+Normal)
Matterport3D	SFSS-MMSI (RGB+Depth)
CC3M-TagMask	TTD (TCL)
PASCAL VOC 2011 test	Plugin network
RELLIS-3D Dataset	GA-Nav
PASTIS	Exchanger+Mask2Former
SIFT-flow	RBE2E
Stanford2D3D Panoramic - RGBD	CBFC
Toronto-3D L002	SCF-Net
Montgomery County X-ray Set	UNETR + SS-CXR
dacl10k v1 testdev	FPN EfficientNet-B4 w/ Aux loss
SYNTHIA-CVPR’16	SSMA
Freiburg Forest	SSMA
38-Cloud	Cloud-Net+
PASCAL VOC 2007	GALDNet
SkyScapes-Lane	SkyScapesNet-Lane
Kvasir-Instrument	UNet
Graz-02	VOLO-D5
Cleargrasp (Novel)	Cleargrasp
Cityscapes	SPFNet34M
Endoscapes	MoCo V2 Surg SSL - DeepLabv3+ head
HERA RFI Detection	Nearest Latent Neighbours
LOFAR RFI Detection	Nearest Latent Neighbours
BDD	FasterSeg
COCO-Stuff	Deeplab v2
Cam2BEV	uNetXST
ApolloScape	ERFNet-IntRA-KD (ours)
DroneDeploy	DLv3+ (Xception65)
ManipalUAVid	UVid-Net
Cityscapes VIPriors subset	EfficientSeg
SBCoseg	Dice loss + IS-Triplet loss
PASCAL VOC 2010 test	SIW
PASCAL VOC 2012	DLDL-8s+CRF
COCO-Stuff full	SegFormer-B5 (Single Scale)
PASCAL VOC 2011	DLDL-8s+CRF
AIRS	ICT-Net
WildDash	SIW
OpenEDS	RITnet
SYNTHIA	CGA-Net
PASCAL VOC	SegCLIP
UTFPR-SBD3	EPYNET
DIVA-HisDB	U-Net
ATLANTIS	Erfani et al.
PH2	MFSNet
ISIC 2017	MFSNet
HAM10000	MFSNet
Mila Simulated Floods	FloodTransformer (Ours)
SWIMSEG	ACLNet
SWINSEG	ACLNet
SWINySEG	ACLNet
MixedWM38	WaferSegClassNet
BDD100K val	NiseNet
PASTIS-R	Late Fusion
Cityscapes 3D	TaskPrompter
FLAIR (French Land cover from Aerospace ImageRy)	U-Net baseline
RUGD	GA-Nav
dacl10k v1 testfinal	FPN EfficientNet-B4
SemanticPOSS	TFNet
COCO-Stuff-27	DiffSeg (512)
Forward-Looking Sonar Marine Debris Datasets	Unet+RN34
STARE	UNet

Libraries

Use these libraries to find Semantic Segmentation models and implementations

PaddlePaddle/PaddleSeg • 53 papers • 8,252 stars
rwightman/pytorch-image-models • 33 papers • 29,758 stars
osmr/imgclsmob • 30 papers • 2,917 stars
open-mmlab/mmsegmentation • 19 papers • 7,405 stars

See all 39 libraries.

Subtasks

Weakly-Supervised Semantic Segmentation

Scene Segmentation

Semi-Supervised Semantic Segmentation

Real-Time Semantic Segmentation

3D Part Segmentation

Unsupervised Semantic Segmentation

Road Segmentation

One-Shot Segmentation

Bird's-Eye View Semantic Segmentation

Crack Segmentation

UNET Segmentation

Universal Segmentation

Class-Incremental Semantic Segmentation

Polyp Segmentation

Vision-Language Segmentation

4D Spatio Temporal Semantic Segmentation

Histopathological Segmentation

Attentive segmentation networks

Text-Line Extraction

Aerial Video Semantic Segmentation

Amodal Panoptic Segmentation

Robust BEV Map Segmentation

Most implemented papers

U-Net: Convolutional Networks for Biomedical Image Segmentation

labmlai/annotated_deep_learning_paper_implementations • 18 May 2015

There is large consent that successful training of deep networks requires many thousand annotated training samples.

476

Deep Residual Learning for Image Recognition

tensorflow/models • CVPR 2016

Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

469

Mask R-CNN

tensorflow/models • ICCV 2017

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

172

MobileNetV2: Inverted Residuals and Linear Bottlenecks

tensorflow/models • CVPR 2018

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

148

MMDetection: Open MMLab Detection Toolbox and Benchmark

open-mmlab/mmdetection • 17 Jun 2019

In this paper, we introduce the various features of this toolbox.

144

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

google-research/vision_transformer • ICLR 2021

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

143

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

charlesq34/pointnet • CVPR 2017

Point cloud is an important type of geometric data structure.

109

FCOS: Fully Convolutional One-Stage Object Detection

tianzhi0549/FCOS • ICCV 2019

By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training.

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

tensorflow/models • ECCV 2018

The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.

Rethinking Atrous Convolution for Semantic Image Segmentation

tensorflow/models • 17 Jun 2017

To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates.
