CMRC is a dataset is annotated by human experts with near 20,000 questions as well as a challenging set which is composed of the questions that need reasoning over multiple clues.
62 PAPERS • 11 BENCHMARKS
Delta Reading Comprehension Dataset (DRCD) is an open domain traditional Chinese machine reading comprehension (MRC) dataset. This dataset aimed to be a standard Chinese machine reading comprehension dataset, which can be a source dataset in transfer learning. The dataset contains 10,014 paragraphs from 2,108 Wikipedia articles and 30,000+ questions generated by annotators.
52 PAPERS • 5 BENCHMARKS
CMRC 2018 is a dataset for Chinese Machine Reading Comprehension. Specifically, it is a span-extraction reading comprehension dataset that is similar to SQuAD.
46 PAPERS • 7 BENCHMARKS
RoadTracer is a dataset for extraction of road networks from aerial images. It consists of a large corpus of high-resolution satellite imagery and ground truth road network graphs covering the urban core of forty cities across six countries. For each city, the dataset covers a region of approximately 24 sq km around the city center. The satellite imagery is obtained from Google at 60 cm/pixel resolution, and the road network from OSM.
37 PAPERS • 2 BENCHMARKS
Belebele is a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. This dataset enables the evaluation of mono- and multi-lingual models in high-, medium-, and low-resource languages. Each question has four multiple-choice answers and is linked to a short passage from the FLORES-200 dataset. The human annotation procedure was carefully curated to create questions that discriminate between different levels of generalizable language comprehension and is reinforced by extensive quality checks. While all questions directly relate to the passage, the English dataset on its own proves difficult enough to challenge state-of-the-art language models. Being fully parallel, this dataset enables direct comparison of model performance across all languages. Belebele opens up new avenues for evaluating and analyzing the multilingual abilities of language models and NLP systems.
17 PAPERS • NO BENCHMARKS YET
The Chinese judicial reading comprehension (CJRC) dataset contains approximately 10K documents and almost 50K questions with answers. The documents come from judgment documents and the questions are annotated by law experts.
9 PAPERS • 2 BENCHMARKS
A human-curated ChineseReading Comprehension dataset on Opinion. The questions in ReCO are opinion based queries issued to the commercial search engine. The passages are provided by the crowdworkers who extract the support snippet from the retrieved documents.
9 PAPERS • NO BENCHMARKS YET
The DeepMind Q&A Dataset consists of two datasets for Question Answering, CNN and DailyMail. Each dataset contains many documents (90k and 197k each), and each document companies on average 4 questions approximately. Each question is a sentence with one missing word/phrase which can be found from the accompanying document/context.
3 PAPERS • NO BENCHMARKS YET