Room-Across-Room (RxR) is a multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. In contrast to related datasets such as Room-to-Room (R2R), RxR is 10x larger, multilingual (English, Hindi and Telugu), has longer and more variable paths, and includes fine-grained visual groundings that relate each word to pixels/surfaces in the environment.
43 PAPERS • 1 BENCHMARK
An interactive, first-person, partially observed visual environment that uses Google Street View for its photographic content and broad coverage. It comes with performance baselines for a challenging goal-driven navigation task.
25 PAPERS • NO BENCHMARKS YET
MINOS is a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments. MINOS leverages large datasets of complex 3D environments and supports flexible configuration of multimodal sensor suites.
21 PAPERS • NO BENCHMARKS YET
Touchdown is a corpus for executing navigation instructions and resolving spatial descriptions in visual real-world environments. The task is to follow instructions to a goal position and find a hidden object there, Touchdown the bear.
16 PAPERS • 1 BENCHMARK
Talk The Walk is a large-scale dialogue dataset grounded in action and perception. The task involves two agents (a “guide” and a “tourist”) that communicate via natural language in order to achieve a common goal: having the tourist navigate to a given target location.
11 PAPERS • NO BENCHMARKS YET
CHALET is a 3D house simulator with support for navigation and manipulation. Unlike existing systems, CHALET supports both a wide range of object manipulations and complex environment layouts consisting of multiple rooms. The range of object manipulations includes the ability to pick up and place objects, toggle the state of objects like taps or televisions, open or close containers, and insert or remove objects from these containers. In addition, the simulator comes with 58 rooms that can be combined to create houses, including 10 default house layouts. CHALET is therefore suitable for setting up challenging environments for various AI tasks that require complex language understanding and planning, such as navigation, manipulation, instruction following, and interactive question answering.
9 PAPERS • NO BENCHMARKS YET
The RUN dataset is based on OpenStreetMap (OSM). The map contains rich layers and an abundance of entities of different types. Each entity is complex and can contain (at least) four labels: name, type, is building=y/n, and house number. An entity can spread over several tiles. As the maps do not overlap, only very few entities are shared among them. The RUN dataset aligns natural language (NL) navigation instructions to the coordinates of their corresponding route on the OSM map.
4 PAPERS • NO BENCHMARKS YET
A dataset in English, also extended to Hindi, with human-written navigation and assembly instructions and the corresponding ground-truth trajectories.
3 PAPERS • NO BENCHMARKS YET
7,672 human-written natural language navigation instructions for routes in OpenStreetMap, with a focus on visual landmarks. The instructions were validated in Street View.
3 PAPERS • 2 BENCHMARKS
This dataset enriches the benchmark Room-to-Room (R2R) dataset by dividing the instructions into sub-instructions and pairing each of those with its corresponding viewpoints in the path. The overall instruction and trajectory of each sample remain the same.
2 PAPERS • NO BENCHMARKS YET
Talk2Nav is a large-scale dataset with verbal navigation instructions.
WebLINX is a large-scale benchmark of 100K interactions across 2300 expert demonstrations of conversational web navigation. It covers a broad range of patterns on over 150 real-world websites and can be used to train and evaluate agents in diverse scenarios.
2 PAPERS • 1 BENCHMARK
The Robo-VLN dataset is a continuous control formulation of the VLN-CE dataset by Krantz et al., which was ported over from the Room-to-Room (R2R) dataset created by Anderson et al. Details on converting the discrete VLN dataset into a continuous control formulation can be found in our paper.
1 PAPER • 1 BENCHMARK