The CIFAR-10 dataset (Canadian Institute for Advanced Research, 10 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The images are labelled with one of 10 mutually exclusive classes: airplane, automobile (but not truck or pickup truck), bird, cat, deer, dog, frog, horse, ship, and truck (but not pickup truck). There are 6000 images per class with 5000 training and 1000 testing images per class.
14,087 PAPERS • 98 BENCHMARKS
WSJ0-2mix is a speech separation corpus of two-speaker speech mixtures built from utterances in the Wall Street Journal (WSJ0) corpus.
144 PAPERS • 2 BENCHMARKS
ImageNet-P consists of noise, blur, weather, and digital distortions. The dataset provides validation perturbations and multiple difficulty levels; comes in CIFAR-10, Tiny ImageNet, ImageNet 64 × 64, standard, and Inception-sized editions; and is designed for benchmarking networks, not training them. ImageNet-P departs from ImageNet-C by having perturbation sequences generated from each ImageNet validation image. Each sequence contains more than 30 frames, so only 10 common perturbations are used, which limits the growth in dataset size and evaluation time.
28 PAPERS • 1 BENCHMARK
NAS-Bench-1Shot1 draws on the recent large-scale tabular benchmark NAS-Bench-101 for cheap anytime evaluations of one-shot NAS methods.
23 PAPERS • NO BENCHMARKS YET
comma 2k19 is a dataset of over 33 hours of commute driving on California's Highway 280. It comprises 2019 segments, each 1 minute long, recorded on a 20 km section of highway between San Jose and San Francisco, California. The dataset was collected using comma EONs, whose sensors are similar to those of any modern smartphone, including a road-facing camera, phone GPS, thermometers, and a 9-axis IMU.
10 PAPERS • NO BENCHMARKS YET
Text Classification Attack Benchmark (TCAB) is a dataset for analyzing, understanding, detecting, and labeling adversarial attacks against text classifiers. TCAB includes 1.5 million attack instances, generated by twelve adversarial attacks targeting three classifiers trained on six source datasets for sentiment analysis and abuse detection in English. The attack generation process is automated, so TCAB can easily be extended to incorporate new text attacks and better classifiers as they are developed.
3 PAPERS • NO BENCHMARKS YET
The Cifar10Mnist dataset is created from the CIFAR-10 and MNIST data sources. Since the CIFAR-10 training set consists of 50000 images and the MNIST training set contains 60000 digits, the first 50000 MNIST digits are padded on top of the CIFAR-10 images after making them slightly translucent. This yields a first training dataset of 50000 images. The remaining 10000 MNIST digits are then padded on top of 10000 random CIFAR-10 images (chosen with a fixed seed), which makes a second training dataset of 60000 images possible. For the test set, the 10000 CIFAR-10 images are padded over the 10000 MNIST digits.
1 PAPER • NO BENCHMARKS YET
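The Cifar10Mnist construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function names `overlay_digit` and `build_training_sets`, the opacity value `alpha`, the choice to zero-pad each 28x28 digit to 32x32, the reading that the digits (not the CIFAR-10 images) are made translucent, and sampling the extra 10000 CIFAR-10 images without replacement are all hypothetical choices that the description does not confirm.

```python
import numpy as np


def overlay_digit(cifar_img, mnist_digit, alpha=0.6):
    """Blend one grayscale digit onto one color image.

    cifar_img:   (32, 32, 3) uint8 array
    mnist_digit: (28, 28) uint8 array
    alpha:       assumed maximum opacity of the digit (hypothetical value)
    """
    # Zero-pad the digit symmetrically so it matches the 32x32 image.
    pad = (cifar_img.shape[0] - mnist_digit.shape[0]) // 2
    mask = np.pad(mnist_digit, pad).astype(np.float32) / 255.0
    mask = alpha * mask[..., None]  # per-pixel opacity, broadcast over RGB

    # Standard alpha compositing of a white digit over the image.
    base = cifar_img.astype(np.float32)
    mixed = (1.0 - mask) * base + mask * 255.0
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


def build_training_sets(cifar_train, mnist_train, seed=0, alpha=0.6):
    """Return the first (len(cifar_train)) and second (len(mnist_train)) sets."""
    n = len(cifar_train)

    # First set: pair image i with digit i.
    first = np.stack(
        [overlay_digit(c, d, alpha) for c, d in zip(cifar_train, mnist_train[:n])]
    )

    # Second set: the leftover digits go onto randomly chosen images,
    # drawn with a fixed seed (without replacement, as an assumption).
    rng = np.random.default_rng(seed)
    extra_idx = rng.choice(n, size=len(mnist_train) - n, replace=False)
    extra = np.stack(
        [
            overlay_digit(cifar_train[i], d, alpha)
            for i, d in zip(extra_idx, mnist_train[n:])
        ]
    )
    second = np.concatenate([first, extra])
    return first, second
```

With the real data, `cifar_train` would hold 50000 images and `mnist_train` 60000 digits, so the two returned arrays would have 50000 and 60000 entries respectively.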
The PointDenoisingBenchmark dataset features 28 different shapes, split into 18 training shapes and 10 test shapes.
REAP is a digital benchmark that allows the user to evaluate patch attacks on real images under real-world conditions. Built on top of the Mapillary Vistas dataset, the benchmark contains over 14,000 traffic signs. Each sign is annotated with a pair of geometric and lighting transformations, which can be used to apply a digitally generated patch realistically onto the sign.
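To make the idea of a per-sign geometric and lighting transform concrete, here is a minimal sketch of pasting a patch onto an image. It is not REAP's implementation: the function name `apply_patch`, the use of a 2x3 affine matrix for the geometric part, the linear lighting model `alpha * patch + beta`, and nearest-neighbor sampling are all simplifying assumptions made for illustration.

```python
import numpy as np


def apply_patch(image, patch, affine, alpha, beta):
    """Paste `patch` onto `image` after a lighting and a geometric transform.

    image:  (H, W, 3) float array in [0, 1]
    patch:  (h, w, 3) float array in [0, 1]
    affine: (2, 3) matrix mapping patch (x, y, 1) to image (x, y)  [assumed form]
    alpha, beta: lighting model, relit = alpha * patch + beta      [assumed form]
    """
    H, W, _ = image.shape
    h, w, _ = patch.shape

    # Lighting transform: scale and shift the patch colors.
    relit = np.clip(alpha * patch + beta, 0.0, 1.0)

    # Invert the affine map so each image pixel looks up its source patch pixel.
    full = np.vstack([affine, [0.0, 0.0, 1.0]])
    inv = np.linalg.inv(full)

    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # (3, H*W)
    src = inv @ coords
    sx = np.rint(src[0]).astype(int)
    sy = np.rint(src[1]).astype(int)

    # Only image pixels whose source falls inside the patch are overwritten.
    inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)

    out = image.copy()
    flat = out.reshape(-1, 3)  # view into `out`
    flat[inside] = relit[sy[inside], sx[inside]]
    return out
```

For example, the matrix `[[1, 0, 10], [0, 1, 5]]` shifts the patch so that its top-left corner lands at column 10, row 5 of the image; a real evaluation would use the transform annotated for each sign instead.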