WebKB is a dataset of web pages from the computer science departments of various universities. Its 4,518 web pages are divided into 6 imbalanced categories (Student, Faculty, Staff, Department, Course, Project). There is also an Other category of miscellaneous pages that is not comparable to the rest.
93 PAPERS • 6 BENCHMARKS
The Sprites dataset contains 60-pixel color images of animated characters (sprites). There are 672 sprites: 500 for training, 100 for testing, and 72 for validation. Each sprite has 20 animations and 178 images, so the full dataset has about 120K images in total. The sprites vary widely in appearance: they differ in body shape, gender, hair, armor, arm type, greaves, and weapon.
48 PAPERS • 3 BENCHMARKS
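The image count quoted for Sprites can be checked with a little arithmetic. The sketch below assumes that every sprite contributes exactly 178 images, regardless of split; the variable names are illustrative only.

```python
# Sprite counts per split, as stated in the description.
SPRITES = {"train": 500, "test": 100, "validation": 72}
# Assumption: each sprite contributes the same number of images.
IMAGES_PER_SPRITE = 178

total_sprites = sum(SPRITES.values())
images_per_split = {name: n * IMAGES_PER_SPRITE for name, n in SPRITES.items()}
total_images = total_sprites * IMAGES_PER_SPRITE

print(total_sprites)     # number of sprites across all splits
print(images_per_split)  # images per split under the assumption above
print(total_images)      # close to the rounded 120K figure
```

Under that assumption the total is 119,616 images, which rounds to the stated 120K.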
Data Set Information: Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
46 PAPERS • 2 BENCHMARKS
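The extraction condition quoted for the census records is a conjunction of four threshold tests. A minimal sketch of how such a filter might be applied follows; the example records and the helper name `is_reasonably_clean` are hypothetical, and only the variable names and thresholds come from the condition itself.

```python
# Hypothetical records keyed by the variable names used in the condition.
records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},  # passes all four tests
    {"AAGE": 15, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 10},    # fails AAGE > 16
    {"AAGE": 52, "AGI": 80, "AFNLWGT": 2000, "HRSWK": 35},      # fails AGI > 100
    {"AAGE": 28, "AGI": 30000, "AFNLWGT": 1500, "HRSWK": 0},    # fails HRSWK > 0
]


def is_reasonably_clean(rec):
    """Apply (AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0) to one record."""
    return (
        rec["AAGE"] > 16
        and rec["AGI"] > 100
        and rec["AFNLWGT"] > 1
        and rec["HRSWK"] > 0
    )


clean = [rec for rec in records if is_reasonably_clean(rec)]
print(len(clean), "of", len(records), "records kept")
```

All four comparisons are strict, so a record sitting exactly on a threshold (for example HRSWK equal to 0) is dropped.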
This dataset includes time-series data generated by accelerometer and gyroscope sensors (attitude, gravity, userAcceleration, and rotationRate). It was collected with an iPhone 6s kept in the participant's front pocket using SensingKit, which collects information from the Core Motion framework on iOS devices. All data was collected at a 50 Hz sampling rate. A total of 24 participants, varying in gender, age, weight, and height, performed 6 activities in 15 trials in the same environment and conditions: downstairs, upstairs, walking, jogging, sitting, and standing.
29 PAPERS • NO BENCHMARKS YET
The PhysioNet Challenge 2012 dataset is publicly available and contains the de-identified records of 8,000 patients in Intensive Care Units (ICU). Each record consists of roughly 48 hours of multivariate time series data with up to 37 features, such as respiratory rate and glucose, recorded at various times during the patient's stay.
19 PAPERS • 5 BENCHMARKS
The Ecoli dataset is a dataset for protein localization. It contains 336 E. coli proteins split into 8 different classes.
7 PAPERS • NO BENCHMARKS YET
A diverse dataset of human faces, including unconventional poses, occluded faces, and a vast variability in backgrounds.
6 PAPERS • NO BENCHMARKS YET
The original dataset from "Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting" contains traffic readings collected from 207 loop detectors on highways in Los Angeles County, aggregated in 5-minute intervals over four months between March 2012 and June 2012.
3 PAPERS • 1 BENCHMARK
The original dataset from "Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting" contains 6 months of traffic readings from 01/01/2017 to 05/31/2017 collected every 5 minutes by 325 traffic sensors in the San Francisco Bay Area. The measurements are provided by California Transportation Agencies (CalTrans) Performance Measurement System (PeMS).
PulseImpute is a benchmark for Pulsative Physiological Signal Imputation which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks. It contains 440,953 5-minute ECG waveforms sampled at 100 Hz from 32,930 patients.
2 PAPERS • NO BENCHMARKS YET
The AIDS Antiviral Screen dataset collects the results of screening tens of thousands of compounds for evidence of anti-HIV activity. The available screen results are chemical graph-structured data for these compounds.
0 PAPER • NO BENCHMARKS YET