NAS-Bench-101 is the first public architecture dataset for NAS research. To build NASBench-101, the authors carefully constructed a compact, yet expressive, search space, exploiting graph isomorphisms to identify 423k unique convolutional architectures. The authors trained and evaluated all of these architectures multiple times on CIFAR-10 and compiled the results into a large dataset of over 5 million trained models. This allows researchers to evaluate the quality of a diverse range of models in milliseconds by querying the precomputed dataset.
138 PAPERS • 1 BENCHMARK
A collection of 10 pre-processed medical open datasets. MedMNIST is standardized to perform classification tasks on lightweight 28x28 images, which requires no background knowledge.
40 PAPERS • NO BENCHMARKS YET
The Penn Machine Learning Benchmarks (PMLB) is a large, curated set of benchmark datasets used to evaluate and compare supervised machine learning algorithms. These datasets cover a broad range of applications, and include binary/multi-class classification problems and regression problems, as well as combinations of categorical, ordinal, and continuous features.
39 PAPERS • NO BENCHMARKS YET
NAS-Bench-1Shot1 draws on the recent large-scale tabular benchmark NAS-Bench-101 for cheap anytime evaluations of one-shot NAS methods.
23 PAPERS • NO BENCHMARKS YET
A collection of single speaker speech datasets for ten languages. It is composed of short audio clips from LibriVox audiobooks and their aligned texts.
21 PAPERS • NO BENCHMARKS YET
This meta-dataset is first used in the AutoML1 challenge organized by Chalearn in 2015. It is composed of 30 pre-processed datasets, chosen to illustrate a wide variety of domains of applications: biology and medicine, ecology, energy and sustainability management, image, text, audio, speech, video and other sensor data processing, internet social media management and advertising, market analysis and financial prediction.
1 PAPER • 1 BENCHMARK
It includes 10 data sets that consists of both raw data set and encoded data set where it is encoded through BERT-Sort Encoder with MLM initialization of .