The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of the larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were then centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
6,980 PAPERS • 52 BENCHMARKS
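The center-of-mass centering described in the MNIST entry can be sketched in a few lines of plain Python. This is only an illustration, not the original NIST/MNIST preprocessing code: the function name `center_by_mass`, the whole-pixel rounding of the shift, and the choice of (size - 1) / 2 as the canvas center are all assumptions.

```python
def center_by_mass(glyph, size=28):
    """Place a small grey-level glyph (list of rows) on a size x size canvas
    so that its intensity-weighted center of mass sits near the canvas center.

    Assumptions (not from the MNIST description): the shift is rounded to
    whole pixels, and pixels shifted outside the canvas are dropped.
    """
    total = sum(sum(row) for row in glyph)
    if total == 0:
        raise ValueError("glyph contains no ink")

    # Intensity-weighted mean row and column: the center of mass.
    cy = sum(r * v for r, row in enumerate(glyph) for v in row) / total
    cx = sum(c * v for row in glyph for c, v in enumerate(row)) / total

    # Whole-pixel translation that moves the center of mass to the canvas center.
    center = (size - 1) / 2
    dy = round(center - cy)
    dx = round(center - cx)

    canvas = [[0] * size for _ in range(size)]
    for r, row in enumerate(glyph):
        for c, v in enumerate(row):
            rr, cc = r + dy, c + dx
            if 0 <= rr < size and 0 <= cc < size:
                canvas[rr][cc] = v
    return canvas


# A 20x20 glyph whose ink is a 2x2 block in the top-left corner.
glyph = [[0] * 20 for _ in range(20)]
for r in (0, 1):
    for c in (0, 1):
        glyph[r][c] = 255

centered = center_by_mass(glyph)
```

In this toy case the block's center of mass is at (0.5, 0.5), so it is shifted by 13 pixels in each direction and ends up straddling the middle of the 28x28 field.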
Fashion-MNIST is a dataset comprising 28×28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST shares the same image size, data format, and training/test split structure as the original MNIST.
2,781 PAPERS • 17 BENCHMARKS
STL-10 is an image dataset derived from ImageNet and widely used to evaluate algorithms for unsupervised feature learning or self-taught learning. Besides 100,000 unlabeled images, it contains 13,000 labeled images from 10 object classes (such as birds, cats, trucks), of which 5,000 are partitioned for training and the remaining 8,000 for testing. All images are color images of 96×96 pixels.
958 PAPERS • 17 BENCHMARKS
The Caltech101 dataset contains images from 101 object categories (e.g., “helicopter”, “elephant”, and “chair”) and a background category containing images that do not belong to any of the 101 object categories. Each object category has about 40 to 800 images, with most having about 50. The image resolution is roughly 300×200 pixels.
579 PAPERS • 7 BENCHMARKS
MVTec AD is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection. It contains over 5000 high-resolution images divided into fifteen different object and texture categories. Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects.
287 PAPERS • 4 BENCHMARKS
The Reuters-21578 dataset is a collection of documents with news articles. The original corpus has 10,369 documents and a vocabulary of 29,930 words.
63 PAPERS • 6 BENCHMARKS
Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection (MIMII) is a dataset of industrial machine sounds.
33 PAPERS • NO BENCHMARKS YET
MVTec Logical Constraints Anomaly Detection (MVTec LOCO AD) dataset is intended for the evaluation of unsupervised anomaly localization algorithms. The dataset includes both structural and logical anomalies. It contains 3644 images from five different categories inspired by real-world industrial inspection scenarios. Structural anomalies appear as scratches, dents, or contaminations in the manufactured products. Logical anomalies violate underlying constraints, e.g., a permissible object being present in an invalid location or a required object not being present at all. The dataset also includes pixel-precise ground truth data for each anomalous region.
26 PAPERS • 1 BENCHMARK
The dataset is constructed from images of defective production items that were provided and annotated by Kolektor Group d.o.o. The images were captured in a controlled industrial environment in a real-world case.
13 PAPERS • 3 BENCHMARKS
The ToyADMOS dataset is a machine operating sounds dataset of approximately 540 hours of normal machine operating sounds and over 12,000 samples of anomalous sounds collected with four microphones at a 48 kHz sampling rate, prepared by Yuma Koizumi and members of NTT Media Intelligence Laboratories. The ToyADMOS dataset is designed for anomaly detection in machine operating sounds (ADMOS) research. It covers three ADMOS tasks: product inspection (toy car), fault diagnosis for a fixed machine (toy conveyor), and fault diagnosis for a moving machine (toy train).
13 PAPERS • NO BENCHMARKS YET
KolektorSDD2 is a surface-defect detection dataset with over 3000 images containing several types of defects, obtained while addressing a real-world industrial problem.
11 PAPERS • 4 BENCHMARKS
UBI-Fights is a large-scale dataset of 80 hours of video, fully annotated at the frame level. It targets a specific anomaly detection task while still providing a wide diversity of fighting scenarios. It consists of 1000 videos, of which 216 contain a fight event and 784 show normal daily life situations. All unnecessary video segments (e.g., video introductions, news, etc.) that could disturb the learning process were removed.
7 PAPERS • 2 BENCHMARKS
This is a synthetic dataset for defect detection on textured surfaces. It was originally created for a competition at the 2007 symposium of the DAGM (Deutsche Arbeitsgemeinschaft für Mustererkennung e.V., the German chapter of the International Association for Pattern Recognition). The competition was hosted together with the GNSS (German Chapter of the European Neural Network Society).
5 PAPERS • 3 BENCHMARKS
The original data for "ECG5000" is a 20-hour-long ECG downloaded from PhysioNet: record "chf07" of the BIDMC Congestive Heart Failure Database (chfdb). It was originally published in "Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23)". The data was pre-processed in two steps: (1) extract each heartbeat, (2) make each heartbeat equal length using interpolation. This dataset was originally used in the paper "A general framework for never-ending learning from time series streams", DAMI 29(6). After that, 5,000 heartbeats were randomly selected. The patient has severe congestive heart failure, and the class values were obtained by automated annotation.
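The second ECG5000 pre-processing step, bringing heartbeats of different durations to a common length, might look roughly like the sketch below. The description says only "interpolation", so linear interpolation is an assumption here, and the function name `resample_linear`, the toy beats, and the target length of 5 are purely illustrative.

```python
def resample_linear(signal, target_len):
    """Resample a 1-D sequence to target_len points by linear interpolation.

    Assumption: linear interpolation over evenly spaced sample positions;
    the dataset description does not say which interpolation was used.
    """
    if len(signal) < 2 or target_len < 2:
        raise ValueError("need at least two input and two output samples")

    # Spacing of the output points measured in input-sample units.
    scale = (len(signal) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = min(int(pos), len(signal) - 2)  # left neighbour, kept in range
        frac = pos - lo                      # distance from the left neighbour
        out.append(signal[lo] * (1 - frac) + signal[lo + 1] * frac)
    return out


# Two toy "heartbeats" of different lengths (hypothetical values).
beats = [
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
]
equal_length = [resample_linear(beat, 5) for beat in beats]
```

After resampling, every beat has the same number of samples, so the beats can be stacked into a fixed-width table of time series.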
AnoShift is a large-scale anomaly detection benchmark that splits the test data according to its temporal distance from the training set, introducing three testing splits: IID, NEAR, and FAR. This testing scenario captures how the performance of anomaly detection methods, from classical approaches to masked language models, degrades over time.
4 PAPERS • 1 BENCHMARK
TIMo (Time-of-Flight Indoor Monitoring) is a dataset of infrared and depth videos intended for use in anomaly detection and person detection/people counting. It features more than 1,500 sequences for anomaly detection, totaling more than 500,000 individual frames. For person detection, the dataset contains more than 240 sequences. The data was captured using a Microsoft Azure Kinect RGB-D camera. In addition, we provide annotations of anomalous frame ranges for use with anomaly detection, and bounding boxes and segmentation masks for use with person detection. The data was captured partly from a tilted view and partly from a top-down perspective.
3 PAPERS • 1 BENCHMARK
CHAD (Charlotte Anomaly Dataset) is a high-resolution, multi-camera dataset for surveillance video anomaly detection. It includes bounding box, Re-ID, and pose annotations, as well as frame-level anomaly labels that divide all frames into two groups, anomalous and normal. Full details are given in the paper "CHAD: Charlotte Anomaly Dataset".
2 PAPERS • NO BENCHMARKS YET
This failure dataset contains information on the events collected in the OpenStack cloud computing platform during three different campaigns of fault-injection experiments performed with three different workloads.
The Multi-pose Anomaly Detection (MAD) dataset represents the first attempt to evaluate the performance of pose-agnostic anomaly detection. For training, it contains 4,000+ high-resolution multi-pose RGB images, with camera/pose information, of 20 complex-shaped LEGO animal toys. Its test sets contain 7,000+ simulated and real-world RGB images (without camera/pose information) with pixel-precise ground truth annotations for three types of anomalies. MAD is further divided into MAD-Sim and MAD-Real for simulation-to-reality studies, to bridge the gap between academic research and the demands of industrial manufacturing.
2 PAPERS • 1 BENCHMARK
The code to create the dataset is available here. The dataset used in the paper is available on GitHub.
2 PAPERS • 2 BENCHMARKS
This dataset aims to provide a color dataset of real industrial fabric defects gathered in a visiting machine with several industrial cameras. It has been designed with the same nomenclature as the MVTec AD dataset.
1 PAPER • NO BENCHMARKS YET
MIAD contains more than 100K high-resolution color images in various outdoor industrial scenarios, designed for unsupervised anomaly detection. The dataset is generated with 3D graphics software and covers both surface and logical anomalies with pixel-precise ground truth.
The PRONTO heterogeneous benchmark dataset is based on an industrial-scale multiphase flow facility. It includes data from heterogeneous sources, including process measurements, alarm records, high frequency ultrasonic flow and pressure measurements, an operation log and video recordings. The study collected data from various operational conditions with and without induced faults to generate a multi-rate, multi-modal dataset. The dataset is suitable for developing and validating algorithms for fault detection and diagnosis (FDD) and data fusion.
1 PAPER • 1 BENCHMARK