The CIFAR-10 dataset (Canadian Institute for Advanced Research, 10 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The images are labelled with one of 10 mutually exclusive classes: airplane, automobile (but not truck or pickup truck), bird, cat, deer, dog, frog, horse, ship, and truck (but not pickup truck). There are 6000 images per class with 5000 training and 1000 testing images per class.
14,087 PAPERS • 98 BENCHMARKS
The STL-10 is an image dataset derived from ImageNet and popularly used to evaluate algorithms of unsupervised feature learning or self-taught learning. Besides 100,000 unlabeled images, it contains 13,000 labeled images from 10 object classes (such as birds, cats, trucks), among which 5,000 images are partitioned for training while the remaining 8,000 images for testing. All the images are color images with 96×96 pixels in size.
958 PAPERS • 17 BENCHMARKS
The Food-101 dataset consists of 101 food categories with 750 training and 250 test images per category, making a total of 101k images. The labels for the test images have been manually cleaned, while the training set contains some noise.
589 PAPERS • 10 BENCHMARKS
Berkeley Segmentation Data Set 500 (BSDS500) is a standard benchmark for contour detection. This dataset is designed for evaluating natural edge detection that includes not only object contours but also object interior boundaries and background boundaries. It includes 500 natural images with carefully annotated boundaries collected from multiple users. The dataset is divided into three parts: 200 for training, 100 for validation and the rest 200 for test.
241 PAPERS • 8 BENCHMARKS
Fer2013 contains approximately 30,000 facial RGB images of different expressions with size restricted to 48×48, and the main labels of it can be divided into 7 types: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral. The Disgust expression has the minimal number of images – 600, while other labels have nearly 5,000 samples each.
138 PAPERS • 4 BENCHMARKS
Imagenet32 is a huge dataset made up of small images called the down-sampled version of Imagenet. Imagenet32 is composed of 1,281,167 training data and 50,000 test data with 1,000 labels.
93 PAPERS • 4 BENCHMARKS
PatchCamelyon is an image classification dataset. It consists of 327.680 color images (96 x 96px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating presence of metastatic tissue. PCam provides a new benchmark for machine learning models: bigger than CIFAR10, smaller than ImageNet, trainable on a single GPU.
80 PAPERS • 2 BENCHMARKS
The Oxford-IIIT Pet Dataset has 37 categories with roughly 200 images for each class. The images have a large variations in scale, pose and lighting. All images have an associated ground truth annotation of breed, head ROI, and pixel level trimap segmentation.
73 PAPERS • 5 BENCHMARKS
The Oxford-IIIT Pet Dataset is a 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose, and lighting. All images have an associated ground truth annotation of breed, head ROI, and pixel-level trimap segmentation.
42 PAPERS • 7 BENCHMARKS
Contains over 4,000 images created by realigning and merging existing HDR and standard-dynamic-range (SDR) datasets.
6 PAPERS • NO BENCHMARKS YET
We propose a new light field image database called “PINet” inheriting the hierarchical structure from WordNet. It consists of 7549 LIs captured by Lytro Illum, which is much larger than the existing databases. The images are manually annotated to 178 categories according to WordNet, such as cat, camel, bottle, fans, etc. The registered depth maps are also provided. Each image is generated by processing the raw LI from the camera by Light Field Toolbox v0.4 for demosaicing and devignetting.
0 PAPER • NO BENCHMARKS YET